Publications

2014
Ananthakrishnan AN, Cheng SC, Cai T, Cagan A, Gainer VS, Szolovits P, Shaw SY, Churchill S, Karlson EW, Murphy SN, et al. Association between reduced plasma 25-hydroxy vitamin D and increased risk of cancer in patients with inflammatory bowel diseases. Clin Gastroenterol Hepatol. 2014;12:821-7.
BACKGROUND & AIMS: Vitamin D deficiency is common among patients with inflammatory bowel diseases (IBD) (Crohn's disease or ulcerative colitis). The effects of low plasma 25-hydroxy vitamin D (25[OH]D) on outcomes other than bone health are understudied in patients with IBD. We examined the association between plasma level of 25(OH)D and risk of cancers in patients with IBD. METHODS: From a multi-institutional cohort of patients with IBD, we identified those with at least 1 measurement of plasma 25(OH)D. The primary outcome was development of any cancer. We examined the association between plasma 25(OH)D and risk of specific subtypes of cancer, adjusting for potential confounders in a multivariate regression model. RESULTS: We analyzed data from 2809 patients with IBD and a median plasma level of 25(OH)D of 26 ng/mL. Nearly one-third had deficient levels of vitamin D (<20 ng/mL). During a median follow-up period of 11 years, 196 patients (7%) developed cancer, excluding nonmelanoma skin cancer (41 cases of colorectal cancer). Patients with vitamin D deficiency had an increased risk of cancer (adjusted odds ratio, 1.82; 95% confidence interval, 1.25-2.65) compared with those with sufficient levels. Each 1-ng/mL increase in plasma 25(OH)D was associated with an 8% reduction in risk of colorectal cancer (odds ratio, 0.92; 95% confidence interval, 0.88-0.96). A weaker inverse association was also identified for lung cancer. CONCLUSIONS: In a large multi-institutional IBD cohort, a low plasma level of 25(OH)D was associated with an increased risk of cancer, especially colorectal cancer.
Blumenthal SR, Castro VM, Clements CC, Rosenfield HR, Murphy SN, Fava M, Weilburg JB, Erb JL, Churchill SE, Kohane IS, et al. An electronic health records study of long-term weight gain following antidepressant use. JAMA Psychiatry. 2014;71:889-96.
IMPORTANCE: Short-term studies suggest antidepressants are associated with modest weight gain but little is known about longer-term effects and differences between individual medications in general clinical populations. OBJECTIVE: To estimate weight gain associated with specific antidepressants over the 12 months following initial prescription in a large and diverse clinical population. DESIGN, SETTING, AND PARTICIPANTS: We identified 22 610 adult patients who began receiving a medication of interest with available weight data in a large New England health care system, including 2 academic medical centers and affiliated outpatient primary and specialty care clinics. We used electronic health records to extract prescribing data and recorded weights for any patient with an index antidepressant prescription including amitriptyline hydrochloride, bupropion hydrochloride, citalopram hydrobromide, duloxetine hydrochloride, escitalopram oxalate, fluoxetine hydrochloride, mirtazapine, nortriptyline hydrochloride, paroxetine hydrochloride, venlafaxine hydrochloride, and sertraline hydrochloride. As measures of assay sensitivity, additional index prescriptions examined included the antiasthma medication albuterol sulfate and the antiobesity medications orlistat, phentermine hydrochloride, and sibutramine hydrochloride. Mixed-effects models were used to estimate rate of weight change over 12 months in comparison with the reference antidepressant, citalopram. MAIN OUTCOME AND MEASURE: Clinician-recorded weight at 3-month intervals up to 12 months. RESULTS: Compared with citalopram, in models adjusted for sociodemographic and clinical features, significantly decreased rate of weight gain was observed among individuals treated with bupropion (beta [SE]: -0.063 [0.027]; P = .02), amitriptyline (beta [SE]: -0.081 [0.025]; P = .001), and nortriptyline (beta [SE]: -0.147 [0.034]; P < .001). 
As anticipated, differences were less pronounced among individuals discontinuing treatment prior to 12 months. CONCLUSIONS AND RELEVANCE: Antidepressants differ modestly in their propensity to contribute to weight gain. Short-term investigations may be insufficient to characterize and differentiate this risk.
Ananthakrishnan AN, Cagan A, Gainer VS, Cheng SC, Cai T, Szolovits P, Shaw SY, Churchill S, Karlson EW, Murphy SN, et al. Higher plasma vitamin D is associated with reduced risk of Clostridium difficile infection in patients with inflammatory bowel diseases. Aliment Pharmacol Ther. 2014;39:1136-42.
BACKGROUND: Patients with inflammatory bowel diseases (IBD) have an increased risk of Clostridium difficile infection (CDI). Cathelicidins are anti-microbial peptides that attenuate colitis and inhibit the effect of clostridial toxins. Plasma calcifediol [25(OH)D] stimulates production of cathelicidins. AIM: To examine the association between plasma 25(OH)D and CDI in patients with IBD. METHODS: From a multi-institutional IBD cohort, we identified patients with at least one measured plasma 25(OH)D. Our primary outcome was development of CDI. Multivariate logistic regression models adjusting for potential confounders were used to identify independent effect of plasma 25(OH)D on risk of CDI. RESULTS: We studied 3188 IBD patients of whom 35 patients developed CDI. Patients with CDI-IBD were older and had greater co-morbidity. The mean plasma 25(OH)D level was significantly lower in patients who developed CDI (20.4 ng/mL) compared to non-CDI-IBD patients (27.1 ng/mL) (P = 0.002). On multivariate analysis, each 1 ng/mL increase in plasma 25(OH)D was associated with a 4% reduction in risk of CDI (OR 0.96, 95% CI 0.93-0.99, P = 0.046). Compared to individuals with vitamin D >20 ng/mL, patients with levels <20 ng/mL were more likely to develop CDI (OR 2.27, 95% CI 1.16-4.44). The mean plasma 25(OH)D in patients with CDI who subsequently died was significantly lower (12.8 +/- 8.1 ng/mL) compared to those who were alive at the end of follow-up (24.3 +/- 13.2 ng/mL) (P = 0.01). CONCLUSIONS: Higher plasma calcifediol [25(OH)D] is associated with reduced risk of C. difficile infection in patients with IBD. Further studies of therapeutic supplementation of vitamin D in patients with inflammatory bowel disease and C. difficile infection may be warranted.
Brownstein CA, Beggs AH, Homer N, Merriman B, Yu TW, Flannery KC, DeChene ET, Towne MC, Savage SK, Price EN, et al. An international effort towards developing standards for best practices in analysis, interpretation and reporting of clinical genome sequencing results in the CLARITY Challenge. Genome Biol. 2014;15:R53.
BACKGROUND: There is tremendous potential for genome sequencing to improve clinical diagnosis and care once it becomes routinely accessible, but this will require formalizing research methods into clinical best practices in the areas of sequence data generation, analysis, interpretation and reporting. The CLARITY Challenge was designed to spur convergence in methods for diagnosing genetic disease starting from clinical case history and genome sequencing data. DNA samples were obtained from three families with heritable genetic disorders and genomic sequence data were donated by sequencing platform vendors. The challenge was to analyze and interpret these data with the goals of identifying disease-causing variants and reporting the findings in a clinically useful format. Participating contestant groups were solicited broadly, and an independent panel of judges evaluated their performance. RESULTS: A total of 30 international groups were engaged. The entries reveal a general convergence of practices on most elements of the analysis and interpretation process. However, even given this commonality of approach, only two groups identified the consensus candidate variants in all disease cases, demonstrating a need for consistent fine-tuning of the generally accepted methods. There was greater diversity of the final clinical report content and in the patient consenting process, demonstrating that these areas require additional exploration and standardization. CONCLUSIONS: The CLARITY Challenge provides a comprehensive assessment of current practices for using genome sequencing to diagnose and report genetic diseases. There is remarkable convergence in bioinformatic techniques, but medical interpretation and reporting are areas that require further development by many groups.
Ananthakrishnan AN, Cheng A, Cagan A, Cai T, Gainer VS, Shaw SY, Churchill S, Karlson EW, Murphy SN, Kohane I, et al. Mode of Childbirth and Long-Term Outcomes in Women with Inflammatory Bowel Diseases. Dig Dis Sci. 2014.
INTRODUCTION: Inflammatory bowel diseases [IBD; Crohn's disease (CD), ulcerative colitis] often affect women in their reproductive years. Few studies have analyzed the impact of mode of childbirth on long-term IBD outcomes. METHODS: We used a multi-institutional IBD cohort to identify all women in the reproductive age-group with a diagnosis of IBD prior to pregnancy. We identified the occurrence of a new diagnosis code for perianal complications, IBD-related hospitalization and surgery, and initiation of medical therapy after either a vaginal delivery or caesarean section (CS). Cox proportional hazards models adjusting for potential confounders were used to estimate independent effect of mode of childbirth on IBD outcomes. RESULTS: Our cohort included 360 women with IBD (161 CS). Women in the CS group were likely to be older and more likely to have complicated disease behavior prior to pregnancy. During follow-up, there was no difference in the likelihood of IBD-related surgery (multivariate hazard ratio 1.75, 95 % confidence interval (CI) 0.40-7.75), IBD-related hospitalization (HR 1.39), initiation of immunomodulator therapy (HR 1.45), or anti-TNF therapy (HR 1.11). Among the 133 CD pregnancies with no prior perianal disease, we found no excess risk of subsequent new diagnosis perianal fistulae with vaginal delivery compared to CS (HR 0.19, 95 % CI 0.04-1.05). CONCLUSIONS: Mode of delivery did not influence natural history of IBD. In our cohort, vaginal delivery was not associated with increased risk of subsequent perianal disease in women with CD.
Ananthakrishnan AN, Cagan A, Gainer VS, Cheng SC, Cai T, Szolovits P, Shaw SY, Churchill S, Karlson EW, Murphy SN, et al. Mortality and extraintestinal cancers in patients with primary sclerosing cholangitis and inflammatory bowel disease. J Crohns Colitis. 2014;8:956-63.
INTRODUCTION: Primary sclerosing cholangitis (PSC) and inflammatory bowel disease (IBD) frequently co-occur. PSC is associated with increased risk for colorectal cancer (CRC). However, whether PSC is associated with increased risk of extraintestinal cancers or affects mortality in an IBD cohort has not been examined previously. METHODS: In a multi-institutional IBD cohort, we established a diagnosis of PSC using a novel algorithm incorporating narrative and codified data with high positive and negative predictive value. Our primary outcome was occurrence of extraintestinal and digestive tract cancers. Mortality was determined through monthly linkage to the social security master death index. RESULTS: In our cohort of 5506 patients with CD and 5522 patients with UC, a diagnosis of PSC was established in 224 patients (2%). Patients with IBD-PSC were younger and more likely to be male compared to IBD patients without PSC; three-quarters had UC. IBD-PSC patients had significantly increased overall risk of cancers compared to patients without PSC (OR 4.36, 95% CI 2.99-6.37). Analysis of specific cancer types revealed a statistically significant excess risk for digestive tract cancer (OR 10.40, 95% CI 6.86-15.76), pancreatic cancer (OR 11.22, 95% CI 4.11-30.62), colorectal cancer (OR 5.00, 95% CI 2.80-8.95), and cholangiocarcinoma (OR 55.31, 95% CI 22.20-137.80) but not for other solid organ or hematologic malignancies. CONCLUSIONS: PSC is associated with increased risk of colorectal and pancreatobiliary cancer but not with excess risk of other solid organ cancers.
Clements CC, Castro VM, Blumenthal SR, Rosenfield HR, Murphy SN, Fava M, Erb JL, Churchill SE, Kaimal AJ, Doyle AE, et al. Prenatal antidepressant exposure is associated with risk for attention-deficit hyperactivity disorder but not autism spectrum disorder in a large health system. Mol Psychiatry. 2014.
Previous studies suggested that risk for Autism Spectrum Disorder (ASD) may be increased in children exposed to antidepressants during the prenatal period. The disease specificity of this risk has not been addressed and the possibility of confounding has not been excluded. Children with ASD or attention-deficit hyperactivity disorder (ADHD) delivered in a large New England health-care system were identified from electronic health records (EHR), and each diagnostic group was matched 1:3 with children without ASD or ADHD. All children were linked with maternal health data using birth certificates and EHRs to determine prenatal medication exposures. Multiple logistic regression was used to examine association between prenatal antidepressant exposures and ASD or ADHD risk. A total of 1377 children diagnosed with ASD and 2243 with ADHD were matched with healthy controls. In models adjusted for sociodemographic features, antidepressant exposure prior to and during pregnancy was associated with ASD risk, but risk associated with exposure during pregnancy was no longer significant after controlling for maternal major depression (odds ratio (OR) 1.10 (0.70-1.70)). Conversely, antidepressant exposure during but not prior to pregnancy was associated with ADHD risk, even after adjustment for maternal depression (OR 1.81 (1.22-2.70)). These results suggest that the risk of autism observed with prenatal antidepressant exposure is likely confounded by severity of maternal illness, but further indicate that such exposure may still be associated with ADHD risk. This risk, modest in absolute terms, may still be a result of residual confounding and must be balanced against the substantial consequences of untreated maternal depression. Molecular Psychiatry advance online publication, 26 August 2014; doi:10.1038/mp.2014.90.
Ananthakrishnan AN, Cheng SC, Cai T, Cagan A, Gainer VS, Szolovits P, Shaw SY, Churchill S, Karlson EW, Murphy SN, et al. Serum inflammatory markers and risk of colorectal cancer in patients with inflammatory bowel diseases. Clin Gastroenterol Hepatol. 2014;12:1342-8 e1.
BACKGROUND & AIMS: Patients with inflammatory bowel diseases (IBDs) (Crohn's disease, ulcerative colitis) are at increased risk of colorectal cancer (CRC). Persistent inflammation is hypothesized to increase risk of CRC in patients with IBD; however, the few studies in this area have been restricted to cross-sectional assessments of histologic severity. No prior studies have examined association between C-reactive protein (CRP) or erythrocyte sedimentation rate (ESR) elevation and risk of CRC in an IBD cohort. METHODS: From a multi-institutional validated IBD cohort, we identified all patients with at least one measured CRP or ESR value. Patients were stratified into quartiles of severity of inflammation on the basis of their median CRP or ESR value, and subsequent diagnosis of CRC was ascertained. Logistic regression adjusting for potential confounders was used to identify the independent association between CRP or ESR elevation and risk of CRC. RESULTS: Our study included 3145 patients with at least 1 CRP value (CRP cohort) and 4008 with at least 1 ESR value (ESR cohort). Thirty-three patients in the CRP cohort and 102 patients in the ESR cohort developed CRC during a median follow-up of 5 years at a median age of 55 years. On multivariate analysis, there was a significant increase in risk of CRC across quartiles of CRP elevation (P(trend) = .017; odds ratio for quartile 4 vs quartile 1, 2.72; 95% confidence interval, 0.95-7.76). Similarly, higher median ESR was also independently associated with risk of CRC across the quartiles (odds ratio, 2.06; 95% confidence interval, 1.14-3.74) (P(trend) = .007). CONCLUSIONS: An elevated CRP or ESR is associated with increased risk of CRC in patients with IBD.
Castro VM, McCoy TH, Cagan A, Rosenfield HR, Murphy SN, Churchill SE, Kohane IS, Perlis RH. Stratification of risk for hospital admissions for injury related to fall: cohort study. BMJ. 2014;349:g5863.
OBJECTIVE: To determine whether the ability to stratify an individual patient's hazard for falling could facilitate development of focused interventions aimed at reducing these adverse outcomes. DESIGN: Clinical and sociodemographic data from electronic health records were utilized to derive multiple logistic regression models of hospital readmissions for injuries related to falls. Drugs used at admission were summarized based on reported adverse effect frequencies in published drug labeling. SETTING: Two large academic medical centers in New England, United States. PARTICIPANTS: The model was developed with 25 924 individuals age ≥40 with an initial hospital discharge. The resulting model was then tested in an independent set of 13 032 inpatients drawn from the same hospital and 36 588 individuals discharged from a second large hospital during the same period. MAIN OUTCOME MEASURE: Hospital readmissions for injury related to falls. RESULTS: Among 25 924 discharged individuals, 680 (2.6%) were evaluated in the emergency department or admitted to hospital for a fall within 30 days of discharge, 1635 (6.3%) within 180 days of discharge, 2360 (9.1%) within one year, and 3465 (13.4%) within two years. Older age, female sex, white or African-American race, public insurance, greater number of drugs taken on discharge, and score for burden of adverse effects were each independently associated with hazard for fall. For drug burden, presence of a drug with a frequency of adverse effects related to fall of 10% was associated with 3.5% increase in odds of falling over the next two years (odds ratio 1.04, 95% confidence interval 1.02 to 1.05). In an independent testing set, the area under the receiver operating characteristics curve was 0.65 for a fall within two years based on cross sectional data and 0.72 with the addition of prior utilization data including age adjusted Charlson comorbidity index.
Portability was promising, with area under the curve of 0.71 for the longitudinal model in a second hospital system. CONCLUSIONS: It is potentially useful to stratify risk of falls based on clinical features available as artifacts of routine clinical care. A web based tool can be used to calculate and visualize risk associated with drug treatment to facilitate further investigation and application.
Ananthakrishnan AN, Cagan A, Gainer VS, Cheng SC, Cai T, Scoville E, Konijeti GG, Szolovits P, Shaw SY, Churchill S, et al. Thromboprophylaxis Is Associated With Reduced Post-hospitalization Venous Thromboembolic Events in Patients With Inflammatory Bowel Diseases. Clin Gastroenterol Hepatol. 2014.
BACKGROUND & AIMS: Patients with inflammatory bowel diseases (IBDs) have increased risk for venous thromboembolism (VTE); those who require hospitalization have particularly high risk. Few hospitalized patients with IBD receive thromboprophylaxis. We analyzed the frequency of VTE after IBD-related hospitalization, risk factors for post-hospitalization VTE, and the efficacy of prophylaxis in preventing post-hospitalization VTE. METHODS: In a retrospective study, we analyzed data from a multi-institutional cohort of patients with Crohn's disease or ulcerative colitis and at least 1 IBD-related hospitalization. Our primary outcome was a VTE event. All patients contributed person-time from the date of the index hospitalization to development of VTE, subsequent hospitalization, or end of follow-up. Our main predictor variable was pharmacologic thromboprophylaxis. Cox proportional hazard models adjusting for potential confounders were used to estimate hazard ratios (HRs) and 95% confidence intervals (CIs). RESULTS: From a cohort of 2788 patients with at least 1 IBD-related hospitalization, 62 patients developed VTE after discharge (2%). Incidences of VTE at 30, 60, 90, and 180 days after the index hospitalization were 3.7/1000, 4.1/1000, 5.4/1000, and 9.4/1000 person-days, respectively. Pharmacologic thromboprophylaxis during the index hospital stay was associated with a significantly lower risk of post-hospitalization VTE (HR, 0.46; 95% CI, 0.22-0.97). Increased numbers of comorbidities (HR, 1.30; 95% CI, 1.16-1.47) and need for corticosteroids before hospitalization (HR, 1.71; 95% CI, 1.02-2.87) were also independently associated with risk of VTE. Length of hospitalization or surgery during index hospitalization was not associated with post-hospitalization VTE. CONCLUSIONS: Pharmacologic thromboprophylaxis during IBD-related hospitalization is associated with reduced risk of post-hospitalization VTE.
Ananthakrishnan AN, Cagan A, Cai T, Gainer VS, Shaw SY, Churchill S, Karlson EW, Murphy SN, Kohane I, Liao KP. Colonoscopy Is Associated With a Reduced Risk for Colon Cancer and Mortality in Patients With Inflammatory Bowel Diseases. Clin Gastroenterol Hepatol. 2014.
BACKGROUND & AIMS: Crohn's disease and ulcerative colitis are associated with an increased risk of colorectal cancer (CRC). Surveillance colonoscopy is recommended at 2- to 3-year intervals beginning 8 years after diagnosis of inflammatory bowel disease (IBD). However, there have been no reports of whether colonoscopy examination reduces the risk for CRC in patients with IBD. METHODS: In a retrospective study, we analyzed data from 6823 patients with IBD (2764 with a recent colonoscopy, 4059 without a recent colonoscopy) seen and followed up for at least 3 years at 2 tertiary referral hospitals in Boston, Massachusetts. The primary outcome was diagnosis of CRC. We examined the proportion of patients undergoing a colonoscopy within 36 months before a diagnosis of CRC or at the end of the follow-up period, excluding colonoscopies performed within 6 months before a diagnosis of CRC, to avoid inclusion of prevalent cancers. Multivariate logistic regression was performed, adjusting for plausible confounders. RESULTS: A total of 154 patients developed CRC. The incidence of CRC among patients without a recent colonoscopy (2.7%) was significantly higher than among patients with a recent colonoscopy (1.6%) (odds ratio [OR], 0.56; 95% confidence interval [CI], 0.39-0.80). This difference persisted in multivariate analysis (OR, 0.65; 95% CI, 0.45-0.93) and was robust when adjusted for a range of assumptions in sensitivity analyses. Among patients with CRC, a colonoscopy within 6 to 36 months before diagnosis was associated with a reduced mortality rate (OR, 0.34; 95% CI, 0.12-0.95). CONCLUSIONS: Recent colonoscopy (within 36 months) is associated with a reduced incidence of CRC in patients with IBD, and lower mortality rates in those diagnosed with CRC.
2013
Zimolzak AJ, Spettell CM, Fernandes J, Fusaro VA, Palmer NP, Saria S, Kohane IS, Jonikas MA, Mandl KD. Early detection of poor adherers to statins: applying individualized surveillance to pay for performance. PLoS One. 2013;8:e79611.
BACKGROUND: Medication nonadherence costs $300 billion annually in the US. Medicare Advantage plans have a financial incentive to increase medication adherence among members because the Centers for Medicare and Medicaid Services (CMS) now awards substantive bonus payments to such plans, based in part on population adherence to chronic medications. We sought to build an individualized surveillance model that detects early which beneficiaries will fall below the CMS adherence threshold. METHODS: This was a retrospective study of over 210,000 beneficiaries initiating statins, in a database of private insurance claims, from 2008-2011. A logistic regression model was constructed to use statin adherence from initiation to day 90 to predict beneficiaries who would not meet the CMS measure of proportion of days covered 0.8 or above, from day 91 to 365. The model controlled for 15 additional characteristics. In a sensitivity analysis, we varied the number of days of adherence data used for prediction. RESULTS: Lower adherence in the first 90 days was the strongest predictor of one-year nonadherence, with an odds ratio of 25.0 (95% confidence interval 23.7-26.5) for poor adherence at one year. The model had an area under the receiver operating characteristic curve of 0.80. Sensitivity analysis revealed that predictions of comparable accuracy could be made only 40 days after statin initiation. When members with 30-day supplies for their first statin fill had predictions made at 40 days, and members with 90-day supplies for their first fill had predictions made at 100 days, poor adherence could be predicted with 86% positive predictive value. CONCLUSIONS: To preserve their Medicare Star ratings, plan managers should identify or develop effective programs to improve adherence. An individualized surveillance approach can be used to target members who would most benefit, recognizing the tradeoff between improved model performance over time and the advantage of earlier detection.
Namjou B, Keddache M, Marsolo K, Wagner M, Lingren T, Cobb B, Perry C, Kennebeck S, Holm IA, Li R, et al. EMR-linked GWAS study: investigation of variation landscape of loci for body mass index in children. Front Genet. 2013;4:268.
Common variations at the loci harboring the fat mass and obesity gene (FTO), MC4R, and TMEM18 are consistently reported as being associated with obesity and body mass index (BMI), especially in the adult population. To confirm this effect in the pediatric population, five European ancestry cohorts from the pediatric eMERGE-II network (CCHMC-BCH) were evaluated. METHOD: Data on 5049 samples of European ancestry were obtained from the Electronic Medical Records (EMRs) of two large academic centers in five different genotyped cohorts. For all available samples, gender, age, height, and weight were collected and BMI was calculated. To account for age and sex differences in BMI, BMI z-scores were generated using the 2000 Centers for Disease Control and Prevention (CDC) growth charts. A genome-wide association study (GWAS) was performed with BMI z-score. After removing missing data and outliers based on principal components (PC) analyses, 2860 samples were used for the GWAS study. The association between each single nucleotide polymorphism (SNP) and BMI was tested using linear regression adjusting for age, gender, and PC by cohort. The effects of SNPs were modeled assuming additive, recessive, and dominant effects of the minor allele. Meta-analysis was conducted using a weighted z-score approach. RESULTS: The mean age of subjects was 9.8 years (range 2-19). The proportion of male subjects was 56%. In these cohorts, 14% of samples had a BMI ≥95th percentile and 28% had a BMI ≥85th percentile. Meta-analyses produced a signal at 16q12 genomic region with the best result of p = 1.43 x 10(-7) [p(rec) = 7.34 x 10(-8)] for the SNP rs8050136 at the first intron of FTO gene (z = 5.26) and with no heterogeneity between cohorts (p = 0.77). Under a recessive model, another published SNP at this locus, rs1421085, generates the best result [z = 5.782, p(rec) = 8.21 x 10(-9)].
Imputation in this region using dense 1000-Genome and Hapmap CEU samples revealed 71 SNPs with p < 10(-6), all at the first intron of FTO locus. When heterogeneity was permitted between cohorts, signals were also obtained in other previously identified loci, including MC4R (rs12964056, p = 6.87 x 10(-7), z = -4.98), cholecystokinin CCK (rs8192472, p = 1.33 x 10(-6), z = -4.85), Interleukin 15 (rs2099884, p = 1.27 x 10(-5), z = 4.34), low density lipoprotein receptor-related protein 1B [LRP1B (rs7583748, p = 0.00013, z = -3.81)] and near transmembrane protein 18 (TMEM18) (rs7561317, p = 0.001, z = -3.17). We also detected a novel locus at chromosome 3 at COL6A5 [best SNP = rs1542829, minor allele frequency (MAF) of 5%, p = 4.35 x 10(-9), z = 5.89]. CONCLUSION: An EMR linked cohort study demonstrates that the BMI-Z measurements can be successfully extracted and linked to genomic data with meaningful confirmatory results. We verified the high prevalence of childhood overweight and obesity in our cohort (28%). In addition, our data indicate that genetic variants in the first intron of FTO, a known adult genetic risk factor for BMI, are also robustly associated with BMI in the pediatric population.
Weber GM, Kohane IS. Extracting physician group intelligence from electronic health records to support evidence based medicine. PLoS One. 2013;8:e64933.
Evidence-based medicine employs expert opinion and clinical data to inform clinical decision making. The objective of this study is to determine whether it is possible to complement these sources of evidence with information about physician "group intelligence" that exists in electronic health records. Specifically, we measured laboratory test "repeat intervals", defined as the amount of time it takes for a physician to repeat a test that was previously ordered for the same patient. Our assumption is that while the result of a test is a direct measure of one marker of a patient's health, the physician's decision to order the test is based on multiple factors including past experience, available treatment options, and information about the patient that might not be coded in the electronic health record. By examining repeat intervals in aggregate over large numbers of patients, we show that it is possible to 1) determine what laboratory test results physicians consider "normal", 2) identify subpopulations of patients that deviate from the norm, and 3) identify situations where laboratory tests are over-ordered. We used laboratory tests as just one example of how physician group intelligence can be used to support evidence based medicine in a way that is automated and continually updated.
McMurry AJ, Fitch B, Savova G, Kohane IS, Reis BY. Improved de-identification of physician notes through integrative modeling of both public and private medical text. BMC Med Inform Decis Mak. 2013;13:112.
BACKGROUND: Physician notes routinely recorded during patient care represent a vast and underutilized resource for human disease studies on a population scale. Their use in research is primarily limited by the need to separate confidential patient information from clinical annotations, a process that is resource-intensive when performed manually. This study seeks to create an automated method for de-identifying physician notes that does not require large amounts of private information: in addition to training a model to recognize Protected Health Information (PHI) within private physician notes, we reverse the problem and train a model to recognize non-PHI words and phrases that appear in public medical texts. METHODS: Public and private medical text sources were analyzed to distinguish common medical words and phrases from Protected Health Information. Patient identifiers are generally nouns and numbers that appear infrequently in medical literature. To quantify this relationship, term frequencies and part of speech tags were compared between journal publications and physician notes. Standard medical concepts and phrases were then examined across ten medical dictionaries. Lists and rules were included from the US census database and previously published studies. In total, 28 features were used to train decision tree classifiers. RESULTS: The model successfully recalled 98% of PHI tokens from 220 discharge summaries. Cost sensitive classification was used to weight recall over precision (98% F10 score, 76% F1 score). More than half of the false negatives were the word "of" appearing in a hospital name. All patient names, phone numbers, and home addresses were at least partially redacted. Medical concepts such as "elevated white blood cell count" were informative for de-identification. The results exceed the previously approved criteria established by four Institutional Review Boards. 
CONCLUSIONS: The results indicate that distributional differences between private and public medical text can be used to accurately classify PHI. The data and algorithms reported here are made freely available for evaluation and improvement.
Ripke S, Wray NR, Lewis CM, Hamilton SP, Weissman MM, Breen G, Byrne EM, Blackwood DH, Boomsma DI, Cichon S, et al. A mega-analysis of genome-wide association studies for major depressive disorder. Molecular psychiatry. 2013;18 :497-511.
Prior genome-wide association studies (GWAS) of major depressive disorder (MDD) have met with limited success. We sought to increase statistical power to detect disease loci by conducting a GWAS mega-analysis for MDD. In the MDD discovery phase, we analyzed more than 1.2 million autosomal and X chromosome single-nucleotide polymorphisms (SNPs) in 18 759 independent and unrelated subjects of recent European ancestry (9240 MDD cases and 9519 controls). In the MDD replication phase, we evaluated 554 SNPs in independent samples (6783 MDD cases and 50 695 controls). We also conducted a cross-disorder meta-analysis using 819 autosomal SNPs with P<0.0001 for either MDD or the Psychiatric GWAS Consortium bipolar disorder (BIP) mega-analysis (9238 MDD cases/8039 controls and 6998 BIP cases/7775 controls). No SNPs achieved genome-wide significance in the MDD discovery phase, the MDD replication phase or in pre-planned secondary analyses (by sex, recurrent MDD, recurrent early-onset MDD, age of onset, pre-pubertal onset MDD or typical-like MDD from a latent class analysis of the MDD criteria). In the MDD-bipolar cross-disorder analysis, 15 SNPs exceeded genome-wide significance (P<5 × 10⁻⁸), and all were in a 248 kb interval of high LD on 3p21.1 (chr3:52 425 083-53 822 102, minimum P=5.9 × 10⁻⁹ at rs2535629). Although this is the largest genome-wide analysis of MDD yet conducted, its high prevalence means that the sample is still underpowered to detect genetic effects typical for complex traits. Therefore, we were unable to identify robust and replicable findings. We discuss what this means for genetic research for MDD. The 3p21.1 MDD-BIP finding should be interpreted with caution as the most significant SNP did not replicate in MDD samples, and genotyping in independent samples will be needed to resolve its status.
Xia Z, Secor E, Chibnik LB, Bove RM, Cheng S, Chitnis T, Cagan A, Gainer VS, Chen PJ, Liao KP, et al. Modeling disease severity in multiple sclerosis using electronic health records. PLoS One. 2013;8 :e78927.
OBJECTIVE: To optimally leverage the scalability and unique features of the electronic health records (EHR) for research that would ultimately improve patient care, we need to accurately identify patients and extract clinically meaningful measures. Using multiple sclerosis (MS) as a proof of principle, we showcased how to leverage routinely collected EHR data to identify patients with a complex neurological disorder and derive an important surrogate measure of disease severity heretofore only available in research settings. METHODS: In a cross-sectional observational study, 5,495 MS patients were identified from the EHR systems of two major referral hospitals using an algorithm that includes codified and narrative information extracted using natural language processing. In the subset of patients who receive neurological care at an MS Center where disease measures have been collected, we used routinely collected EHR data to extract two aggregate indicators of MS severity of clinical relevance: multiple sclerosis severity score (MSSS) and brain parenchymal fraction (BPF, a measure of whole brain volume). RESULTS: The EHR algorithm that identifies MS patients has an area under the curve of 0.958, 83% sensitivity, 92% positive predictive value, and 89% negative predictive value when a 95% specificity threshold is used. The correlation between EHR-derived and true MSSS has a mean R² = 0.38 ± 0.05, and that between EHR-derived and true BPF has a mean R² = 0.22 ± 0.08. To illustrate its clinical relevance, derived MSSS captures the expected difference in disease severity between relapsing-remitting and progressive MS patients after adjusting for sex, age of symptom onset and disease duration (p = 1.56 × 10⁻¹²). CONCLUSION: Incorporation of sophisticated codified and narrative EHR data accurately identifies MS patients and provides estimation of a well-accepted indicator of MS severity that is widely used in research settings but not part of the routine medical records. Similar approaches could be applied to other complex neurological disorders.
Overby CL, Kohane I, Kannry JL, Williams MS, Starren J, Bottinger E, Gottesman O, Denny JC, Weng C, Tarczy-Hornoch P, et al. Opportunities for genomic clinical decision support interventions. Genet Med. 2013;15 :817-23.
McMurry AJ, Murphy SN, Macfadden D, Weber G, Simons WW, Orechia J, Bickel J, Wattanasin N, Gilbert C, Trevvett P, et al. SHRINE: enabling nationally scalable multi-site disease studies. PLoS One. 2013;8 :e55811.
Results of medical research studies are often contradictory or cannot be reproduced. One reason is that there may not be enough patient subjects available for observation for a long enough time period. Another reason is that patient populations may vary considerably with respect to geographic and demographic boundaries thus limiting how broadly the results apply. Even when similar patient populations are pooled together from multiple locations, differences in medical treatment and record systems can limit which outcome measures can be commonly analyzed. In total, these differences in medical research settings can lead to differing conclusions or can even prevent some studies from starting. We thus sought to create a patient research system that could aggregate as many patient observations as possible from a large number of hospitals in a uniform way. We call this system the 'Shared Health Research Information Network', with the following properties: (1) reuse electronic health data from everyday clinical care for research purposes, (2) respect patient privacy and hospital autonomy, (3) aggregate patient populations across many hospitals to achieve statistically significant sample sizes that can be validated independently of a single research setting, (4) harmonize the observation facts recorded at each institution such that queries can be made across many hospitals in parallel, (5) scale to regional and national collaborations. The purpose of this report is to provide open source software for multi-site clinical studies and to report on early uses of this application. At this time SHRINE implementations have been used for multi-site studies of autism co-morbidity, juvenile idiopathic arthritis, peripartum cardiomyopathy, colorectal cancer, diabetes, and others. The wide range of study objectives and growing adoption suggest that SHRINE may be applicable beyond the research uses and participating hospitals named in this report.
Uno H, Tian L, Cai T, Kohane IS, Wei LJ. A unified inference procedure for a class of measures to assess improvement in risk prediction systems with survival data. Stat Med. 2013;32 :2430-42.
Risk prediction procedures can be quite useful for the patient's treatment selection, prevention strategy, or disease management in evidence-based medicine. Often, potentially important new predictors are available in addition to the conventional markers. The question is how to quantify the improvement from the new markers for prediction of the patient's risk in order to aid cost-benefit decisions. The standard method, using the area under the receiver operating characteristic curve, to measure the added value may not be sensitive enough to capture incremental improvements from the new markers. Recently, some novel alternatives to area under the receiver operating characteristic curve, such as integrated discrimination improvement and net reclassification improvement, were proposed. In this paper, we consider a class of measures for evaluating the incremental values of new markers, which includes the preceding two as special cases. We present a unified procedure for making inferences about measures in the class with censored event time data. The large sample properties of our procedures are theoretically justified. We illustrate the new proposal with data from a cancer study to evaluate a new gene score for prediction of the patient's survival. Copyright (c) 2012 John Wiley & Sons, Ltd.
