Eisenberg I, Eran A, Nishino I, Moggio M, Lamperti C, Amato AA, Lidov HG, Kang PB, North KN, Mitrani-Rosenbaum S, et al. Distinctive patterns of microRNA expression in primary muscular disorders. Proc Natl Acad Sci U S A. 2007;104:17016-21.
The primary muscle disorders are a diverse group of diseases caused by various defective structural proteins, abnormal signaling molecules, enzymes and proteins involved in posttranslational modifications, and other mechanisms. Although there is increasing clarification of the primary aberrant cellular processes responsible for these conditions, the decisive factors involved in the secondary pathogenic cascades are still mainly obscure. Given the emerging roles of microRNAs (miRNAs) in modulation of cellular phenotypes, we searched for miRNAs regulated during the degenerative process of muscle to gain insight into the specific regulation of genes that are disrupted in pathological muscle conditions. We describe 185 miRNAs that are up- or down-regulated in 10 major muscular disorders in humans [Duchenne muscular dystrophy (DMD), Becker muscular dystrophy, facioscapulohumeral muscular dystrophy, limb-girdle muscular dystrophies types 2A and 2B, Miyoshi myopathy, nemaline myopathy, polymyositis, dermatomyositis, and inclusion body myositis]. Although five miRNAs were found to be consistently regulated in almost all samples analyzed, pointing to possible involvement of a common regulatory mechanism, others were dysregulated only in one disease and not at all in the other disorders. Functional correlation between the predicted targets of these miRNAs and mRNA expression demonstrated tight posttranscriptional regulation at the mRNA level in DMD and Miyoshi myopathy. Together with direct mRNA-miRNA predicted interactions demonstrated in DMD, some of which are involved in known secondary response functions and others that are involved in muscle regeneration, these findings suggest an important role of miRNAs in specific physiological pathways underlying the disease pathology.
Inaoka H, Fukuoka Y, Kohane IS. Evidence of spatially bound gene regulation in Mus musculus: decreased gene expression proximal to microRNA genomic location. Proc Natl Acad Sci U S A. 2007;104:5020-5.
The extent, spatially and in time, of the phenomenon of localized decreased expression in the chromosomal vicinity of microRNA (miRNA) previously described in Caenorhabditis elegans is reproduced in Mus musculus across a wide range of tissues in several independent experiments. Computationally predicted miRNA targets are enriched in the vicinity of miRNAs, and transcription factors are identified as the class of genes that systematically exhibit this localized decrease. Also, those mRNA with AT-rich UTRs, particularly those that are not in the vicinity of CpG islands, most often exhibit this localized decrease. This localization broadens with the shift from developing to mature/differentiated tissues and suggests a developmentally controlled and spatially bound regulation.
Allocco D, Song Q, Gibbons G, Ramoni M, Kohane I. Geography and genography: prediction of continental origin using randomly selected single nucleotide polymorphisms. BMC Genomics. 2007;8:68.
Loscalzo J, Kohane I, Barabasi AL. Human disease classification in the postgenomic era: a complex systems approach to human pathobiology. Mol Syst Biol. 2007;3:124.
Contemporary classification of human disease derives from observational correlation between pathological analysis and clinical syndromes. Characterizing disease in this way established a nosology that has served clinicians well to the current time, and depends on observational skills and simple laboratory tools to define the syndromic phenotype. Yet, this time-honored diagnostic strategy has significant shortcomings that reflect both a lack of sensitivity in identifying preclinical disease, and a lack of specificity in defining disease unequivocally. In this paper, we focus on the latter limitation, viewing it as a reflection both of the different clinical presentations of many diseases (variable phenotypic expression), and of the excessive reliance on Cartesian reductionism in establishing diagnoses. The purpose of this perspective is to provide a logical basis for a new approach to classifying human disease that uses conventional reductionism and incorporates the non-reductionist approach of systems biomedicine.
Kohane IS, Mandl KD, Taylor PL, Holm IA, Nigrin DJ, Kunkel LM. Medicine. Reestablishing the researcher-patient compact. Science. 2007;316:836-7.
Liu M, Liberzon A, Kong SW, Lai WR, Park PJ, Kohane IS, Kasif S. Network-Based Analysis of Affected Biological Processes in Type 2 Diabetes Models. PLoS Genet. 2007;3:e96.
Type 2 diabetes mellitus is a complex disorder associated with multiple genetic, epigenetic, developmental, and environmental factors. Animal models of type 2 diabetes differ based on diet, drug treatment, and gene knockouts, and yet all display the clinical hallmarks of hyperglycemia and insulin resistance in peripheral tissue. The recent advances in gene-expression microarray technologies present an unprecedented opportunity to study type 2 diabetes mellitus at a genome-wide scale and across different models. To date, a key challenge has been to identify the biological processes or signaling pathways that play significant roles in the disorder. Here, using a network-based analysis methodology, we identified two sets of genes, associated with insulin signaling and a network of nuclear receptors, which are recurrent in a statistically significant number of diabetes and insulin resistance models and transcriptionally altered across diverse tissue types. We additionally identified a network of protein-protein interactions between members from the two gene sets that may facilitate signaling between them. Taken together, the results illustrate the benefits of integrating high-throughput microarray studies, together with protein-protein interaction networks, in elucidating the underlying biological processes associated with a complex disorder.
Drake TA, Braun J, Marchevsky A, Kohane IS, Fletcher C, Chueh H, Beckwith B, Berkowicz D, Kuo F, Zeng QT, et al. A system for sharing routine surgical pathology specimens across institutions: the Shared Pathology Informatics Network. Hum Pathol. 2007.
This report presents an overview for pathologists of the development and potential applications of a novel web enabled system allowing indexing and retrieval of pathology specimens across multiple institutions. The system was developed through the National Cancer Institute's Shared Pathology Informatics Network program with the goal of creating a prototype system to find existing pathology specimens derived from routine surgical and autopsy procedures ("paraffin blocks") that may be relevant to cancer research. To reach this goal, a number of challenges needed to be met. A central aspect was the development of an informatics system that supported Web-based searching while retaining local control of data. Additional aspects included the development of an eXtensible Markup Language schema, representation of tissue specimen annotation, methods for deidentifying pathology reports, tools for autocoding critical data from these reports using the Unified Medical Language System, and hierarchies of confidentiality and consent that met or exceeded federal requirements. The prototype system supported Web-based querying of millions of pathology reports from 6 participating institutions across the country in a matter of seconds to minutes and the ability of bona fide researchers to identify and potentially to request specific paraffin blocks from the participating institutions. With the addition of associated clinical and outcome information, this system could vastly expand the pool of annotated tissues available for cancer research as well as other diseases.
Brownstein JS, Sordo M, Kohane IS, Mandl KD. The tell-tale heart: population-based surveillance reveals an association of rofecoxib and celecoxib with myocardial infarction. PLoS ONE. 2007;2:e840.
BACKGROUND: COX-2 selective inhibitors are associated with myocardial infarction (MI). We sought to determine whether population health monitoring would have revealed the effect of COX-2 inhibitors on population-level patterns of MI. METHODOLOGY/PRINCIPAL FINDINGS: We conducted a retrospective study of inpatients at two Boston hospitals, from January 1997 to March 2006. There was a population-level rise in the rate of MI that reached 52.0 MI-related hospitalizations per 100,000 (a two standard deviation exceedance) in January of 2000, eight months after the introduction of rofecoxib and one year after celecoxib. The exceedance vanished within one month of the withdrawal of rofecoxib. Trends in inpatient stay due to MI were tightly coupled to the rise and fall of prescriptions of COX-2 inhibitors, with an 18.5% increase in inpatient stays for MI when both rofecoxib and celecoxib were on the market (P<0.001). For every million prescriptions of rofecoxib and celecoxib, there was a 0.5% increase in MI (95% CI 0.1 to 0.9), explaining 50.3% of the deviance in yearly variation of MI-related hospitalizations. There was a negative association between mean age at MI and volume of prescriptions for celecoxib and rofecoxib (Spearman correlation, -0.67, P<0.05). CONCLUSIONS/SIGNIFICANCE: The strong relationship between prescribing and outcome time series supports a population-level impact of COX-2 inhibitors on MI incidence. Further, mean age at MI appears to have been lowered by use of these medications. Use of a population monitoring approach as an adjunct to pharmacovigilance methods might have helped confirm the suspected association, providing earlier support for the market withdrawal of rofecoxib.
Lee JM, Ivanova EV, Seong IS, Cashorali T, Kohane I, Gusella JF, Macdonald ME. Unbiased Gene Expression Analysis Implicates the huntingtin Polyglutamine Tract in Extra-mitochondrial Energy Metabolism. PLoS Genet. 2007;3:e135.
The Huntington's disease (HD) CAG repeat, encoding a polymorphic glutamine tract in huntingtin, is inversely correlated with cellular energy level, with alleles over approximately 37 repeats leading to the loss of striatal neurons. This early HD neuronal specificity can be modeled by respiratory chain inhibitor 3-nitropropionic acid (3-NP) and, like 3-NP, mutant huntingtin has been proposed to directly influence the mitochondrion, via interaction or decreased PGC-1alpha expression. We have tested this hypothesis by comparing the gene expression changes due to mutant huntingtin accurately expressed in STHdh(Q111/Q111) cells with the changes produced by 3-NP treatment of wild-type striatal cells. In general, the HD mutation did not mimic 3-NP, although both produced a state of energy collapse that was mildly alleviated by the PGC-1alpha-coregulated nuclear respiratory factor 1 (Nrf-1). Moreover, unlike 3-NP, the HD CAG repeat did not significantly alter mitochondrial pathways in STHdh(Q111/Q111) cells, despite decreased Ppargc1a expression. Instead, the HD mutation enriched for processes linked to huntingtin normal function and Nf-kappaB signaling. Thus, rather than a direct impact on the mitochondrion, the polyglutamine tract may modulate some aspect of huntingtin's activity in extra-mitochondrial energy metabolism. Elucidation of this HD CAG-dependent pathway would spur efforts to achieve energy-based therapeutics in HD.
Turchin A, Guo CZ, Adler GK, Ricchiuti V, Kohane IS, Williams GH. Effect of acute aldosterone administration on gene expression profile in the heart. Endocrinology. 2006;147:3183-9.
Aldosterone is known to have a number of direct adverse effects on the heart, including fibrosis and myocardial inflammation. However, genetic mechanisms of aldosterone action on the heart remain unclear. This paper describes an investigation of temporal changes in gene expression profile of the whole heart induced by acute administration of a physiologic dose of aldosterone in the mouse. mRNA levels of 34,000 known mouse genes were measured at eight time points after aldosterone administration using oligonucleotide microarrays and compared with those of the control animals who underwent a sham injection. A novel software tool (CAGED) designed for analysis of temporal microarray experiments using a Bayesian approach was used to identify genes differentially expressed between the aldosterone-injected and control group. CAGED analysis identified 12 genes as having significant differences in their temporal profiles between aldosterone-injected and control groups. All of these genes exhibited a decrease in expression level 1-3 h after aldosterone injection followed by a brief rebound and a return to baseline. These findings were validated by quantitative RT-PCR. The differentially expressed genes included phosphatases, regulators of steroid biosynthesis, inactivators of reactive oxygen species, and structural proteins. Several of these genes are known to functionally mediate biochemical phenomena previously observed to be triggered by aldosterone administration, such as phosphorylation of ERK1/2. These results provide the first description of cardiac genetic response to aldosterone and identify several potential mediators of known biochemical sequelae of aldosterone administration in the heart.
Saxena V, Orgill D, Kohane I. Absolute enrichment: gene set enrichment analysis for homeostatic systems. Nucleic Acids Res. 2006;34:e151.
The Gene Set Enrichment Analysis (GSEA) identifies sets of genes that are differentially regulated in one direction. Many homeostatic systems will include one limb that is upregulated in response to a downregulation of another limb and vice versa. Such patterns are poorly captured by the standard formulation of GSEA. We describe a technique to identify groups of genes (which sometimes can be pathways) that include both up- and down-regulated components. This approach lends insights into the feedback mechanisms that may operate, especially when integrated with protein interaction databases.
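The limitation described in this abstract can be illustrated with a toy example. The sketch below is not the authors' method: it assumes a simplified, unweighted running-sum enrichment score and assumes that ranking genes by the absolute value of their expression change is one way to capture sets with both up- and down-regulated members. The function names and the toy scores are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (a simplification of GSEA).

    Walks down the ranked list, stepping up on set members and down on
    non-members, and returns the running sum with the largest magnitude.
    """
    hits = sum(1 for g in ranked_genes if g in gene_set)
    misses = len(ranked_genes) - hits
    if hits == 0 or misses == 0:
        raise ValueError("gene set must be a proper, non-empty subset")
    running, best = 0.0, 0.0
    for g in ranked_genes:
        running += 1.0 / hits if g in gene_set else -1.0 / misses
        if abs(running) > abs(best):
            best = running
    return best


def rank_genes(scores, absolute=False):
    """Rank genes from highest to lowest score (or absolute score)."""
    if absolute:
        return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
    return sorted(scores, key=lambda g: scores[g], reverse=True)


# Toy expression changes, invented for illustration only.
# Genes "a" and "b" form a set with one strongly up- and one strongly
# down-regulated member, as in a homeostatic feedback loop.
scores = {"a": 3.0, "b": -2.9, "c": 0.2, "d": -0.1, "e": 0.1, "f": -0.3}
gene_set = {"a", "b"}

signed = enrichment_score(rank_genes(scores), gene_set)
absolute = enrichment_score(rank_genes(scores, absolute=True), gene_set)
print(f"signed ranking: {signed:.2f}, absolute ranking: {absolute:.2f}")
```

With the signed ranking the two set members sit at opposite ends of the list and partly cancel; with the absolute ranking they sit together at the top, so the set scores higher.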
Sebastiani P, Mandl KD, Szolovits P, Kohane IS, Ramoni MF. A Bayesian dynamic model for influenza surveillance. Stat Med. 2006;25:1823-5.
Liu F, Park PJ, Lai W, Maher E, Chakravarti A, Durso L, Jiang X, Yu Y, Brosius A, Thomas M, et al. A genome-wide screen reveals functional gene clusters in the cancer genome and identifies EphA2 as a mitogen in glioblastoma. Cancer Res. 2006;66:10815-23.
A novel genome-wide screen that combines patient outcome analysis with array comparative genomic hybridization and mRNA expression profiling was developed to identify genes with copy number alterations, aberrant mRNA expression, and relevance to survival in glioblastoma. The method led to the discovery of physical gene clusters within the cancer genome with boundaries defined by physical proximity, correlated mRNA expression patterns, and survival relatedness. These boundaries delineate a novel genomic interval called the functional common region (FCR). Many FCRs contained genes of high biological relevance to cancer and were used to pinpoint functionally significant DNA alterations that were too small or infrequent to be reliably identified using standard algorithms. One such FCR contained the EphA2 receptor tyrosine kinase. Validation experiments showed that EphA2 mRNA overexpression correlated inversely with patient survival in a panel of 21 glioblastomas, and ligand-mediated EphA2 receptor activation increased glioblastoma proliferation and tumor growth via a mitogen-activated protein kinase-dependent pathway. This novel genome-wide approach greatly expanded the list of target genes in glioblastoma and represents a powerful new strategy to identify the upstream determinants of tumor phenotype in a range of human cancers.
Kohane IS, Masys DR, Altman RB. The incidentalome: a threat to genomic medicine. JAMA. 2006;296:212-5.
Murphy SN, Mendis ME, Berkowicz DA, Kohane IS, Chueh HC. Integration of Clinical and Genetic Data in the i2b2 Architecture. AMIA Annu Symp Proc. 2006:1040.
Informatics for Integrating Biology and the Bedside (i2b2) is one of the sponsored initiatives of the NIH Roadmap National Centers for Biomedical Computing. One of the goals of i2b2 is to provide clinical investigators broadly with the software tools necessary to collect and manage project-related clinical research data in the genomics age as a cohesive entity: a software suite to construct and manage the modern clinical research chart.
Simons W, Halamka J, Kohane IS, Nigrin D, Finstein N, Mandl KD. Integration of the Personally Controlled Electronic Medical Record into Regional Inter-regional Data Exchanges: A National Demonstration. AMIA Annu Symp Proc. 2006:1099.
We present the approach taken in a Massachusetts-based national demonstration project to integrate the PING1 personally controlled health record (PCHR) with the MA-SHARE2 network, the state-wide inter-organizational data exchange. We describe how we have created a patient-controlled gateway to the network, and how PCHRs have become a first class data source in the network.
Inaoka H, Fukuoka Y, Kohane IS. Lower expression of genes near microRNA in C. elegans germline. BMC Bioinformatics. 2006;7:112.
BACKGROUND: MicroRNAs (miRNAs) are recently discovered short non-protein-coding RNA molecules. miRNAs are increasingly implicated in tissue-specific transcriptional control and particularly in development. Because there is mounting evidence for the localized component of transcriptional control, we investigated whether there is a distance-dependent effect of miRNA. RESULTS: We analyzed gene expression levels around the 84 of 113 known miRNAs for which there are nearby genes that were measured in two independent C. elegans expression data sets. The expression levels are lower for genes in the vicinity of 59 of 84 (71%) miRNAs as compared to genes far from such miRNAs. Analysis of the genes with lower expression in proximity to the miRNAs reveals increased frequency of matching to the 7-nucleotide seeds of these miRNAs. CONCLUSIONS: We found decreased messenger RNA (mRNA) abundance, localized within 10 kb of chromosomal distance of some miRNAs, in C. elegans germline. The increased frequency of seed matching near miRNA can explain, in part, the localized effects.
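The near-versus-far comparison described in this abstract can be sketched as follows. This is a minimal illustration, assuming genes on a single chromosome represented as (position, expression) pairs and a fixed 10 kb window; the function name, the data layout, and the toy values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def proximity_effect(mirna_pos, genes, window=10_000):
    """Compare mean expression of genes near a miRNA with genes far from it.

    genes: list of (position, expression) tuples on the same chromosome.
    window: distance in base pairs that counts as "near" (10 kb here).
    Returns (mean_near, mean_far).
    """
    near = [expr for pos, expr in genes if abs(pos - mirna_pos) <= window]
    far = [expr for pos, expr in genes if abs(pos - mirna_pos) > window]
    if not near or not far:
        raise ValueError("need at least one gene in each group")
    return mean(near), mean(far)


# Toy data, invented for illustration only.
genes = [(1_000, 2.0), (8_000, 4.0), (50_000, 9.0), (90_000, 11.0)]
mean_near, mean_far = proximity_effect(5_000, genes)
print(f"near: {mean_near}, far: {mean_far}")
```

A real analysis would repeat this per miRNA and test the difference statistically rather than comparing two means directly.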
Liu H, Kho AT, Kohane IS, Sun Y. Predicting Survival within the Lung Cancer Histopathological Hierarchy Using a Multi-Scale Genomic Model of Development. PLoS Med. 2006;3:e232.
BACKGROUND: The histopathologic heterogeneity of lung cancer remains a significant confounding factor in its diagnosis and prognosis, spurring numerous recent efforts to find a molecular classification of the disease that has clinical relevance. METHODS AND FINDINGS: Molecular profiles of tumors from 186 patients representing four different lung cancer subtypes (and 17 normal lung tissue samples) were compared with a mouse lung development model using principal component analysis in both temporal and genomic domains. An algorithm for the classification of lung cancers using a multi-scale developmental framework was developed. Kaplan-Meier survival analysis was conducted for lung adenocarcinoma patient subgroups identified via their developmental association. We found multi-scale genomic similarities between four human lung cancer subtypes and the developing mouse lung that are prognostically meaningful. Significant association was observed between the localization of human lung cancer cases along the principal mouse lung development trajectory and the corresponding patient survival rate at three distinct levels of classical histopathologic resolution: among different lung cancer subtypes, among patients within the adenocarcinoma subtype, and within the stage I adenocarcinoma subclass. The earlier the genomic association between a human tumor profile and the mouse lung development sequence, the poorer the patient's prognosis. Furthermore, decomposing this principal lung development trajectory identified a gene set that was significantly enriched for pyrimidine metabolism and cell-adhesion functions specific to lung development and oncogenesis. CONCLUSIONS: From a multi-scale disease modeling perspective, the molecular dynamics of murine lung development provide an effective framework that is not only data driven but also informed by the biology of development for elucidating the mechanisms of human lung cancer biology and its clinical outcome.
Marinescu VD, Kohane IS, Kim TK, Harmin DA, Greenberg ME, Riva A. START: an automated tool for serial analysis of chromatin occupancy data. Bioinformatics. 2006.
SUMMARY: The serial analysis of chromatin occupancy technique (SACO) promises to become a widely used method for the unbiased genome-wide experimental identification of loci bound by a transcription factor of interest. We describe the first web-based automatic tool, named START--Sequence Tag Analysis and Reporting Tool, for processing SACO data generated by experiments performed for the yeast, fruit fly, mouse, rat or human genomes. The program uses as input sequences of inserts from a SACO library from which it extracts all SACO tags, maps them to genomic locations and annotates them. START returns detailed information about these tags including the genes, the genomic elements and the miRNA precursors found in their vicinity, and makes use of the MAPPER database to identify putative transcription factor binding sites located close to the tags. AVAILABILITY: The program is available at SUPPLEMENTARY INFORMATION: is available at
Kho AT, Kang PB, Kohane IS, Kunkel LM. Transcriptome-scale similarities between mouse and human skeletal muscles with normal and myopathic phenotypes. BMC Musculoskelet Disord. 2006;7:23.
BACKGROUND: Mouse and human skeletal muscle transcriptome profiles vary by muscle type, raising the question of which mouse muscle groups have the greatest molecular similarities to human skeletal muscle. METHODS: Orthologous (whole, sub-) transcriptome profiles were compared among four mouse-human transcriptome datasets: (M) six muscle groups obtained from three mouse strains (wildtype, mdx, mdx5cv); (H1) biopsied human quadriceps from controls and Duchenne muscular dystrophy patients; (H2) four different control human muscle types obtained at autopsy; and (H3) 12 different control human tissues (ten non-muscle). RESULTS: Of the six mouse muscles examined, mouse soleus bore the greatest molecular similarities to human skeletal muscles, independent of the latter's anatomic location/muscle type, disease state, age and sampling method (autopsy versus biopsy). Significant similarity to any one mouse muscle group was not observed for non-muscle human tissues (dataset H3), indicating this finding to be muscle specific. CONCLUSION: This observation may be partly explained by the higher type I fiber content of soleus relative to the other mouse muscles sampled.