Jose Dominguez, Martin J. Boeree, Dumitru Chesov, Francesca Conradie, Vivian Cox, Keertan Dheda, Andrii Dudnyk, Maha R. Farhat, Sebastien Gagneux, Martin P. Grobusch, Matthias Groeschel, Lorenzo Guglielmetti, Irina Kontsevaya, Berit Lange, Frank van Leth, Christian Lienhardt, Anna M. Mandalakas, Florian Maurer, Matthias Merker, Paolo Miotto, Barbara Molina-Moya, Florence Morel, Stefan Niemann, Nicolas Veziris, Andrew Whitelaw, Charles R. Horsburgh, and Christoph Lange. Forthcoming. “Clinical implications of molecular drug resistance testing for Mycobacterium tuberculosis: a 2023 TBnet/RESIST-TB consensus statement.” The Lancet Infectious Diseases.
T Ness, L Meiwes, A Kay, R Mejia, Christoph Lange, Maha R Farhat, Anna Mandalakas, and A DiNardo. 11/28/2022. “Optimizing DNA Extraction from Pediatric Stool for Diagnosing Tuberculosis and Use in Next Generation Sequencing Applications.” American Society for Microbiology.
Tara Ness, Andrew DiNardo, and Maha R. Farhat. 11/14/2022. “Mycobacterium tuberculosis Infection: Policy Gaps and Research Breakthroughs.” Pathogens.
V Mave, L. Chen, U Ranganathan, D Kadam, V Vishwanathan, R Lokhande, S. Kumar, A Kagal, N Pradhan, S Yogendra Shivakumar, M Paradkar, S Deshmukh, J Tornheim, H Kornfeld, Maha R Farhat, A. Gupta, C Padmapriyadarsini, N Gupte, J Golub, B Mathema, and Barry N Kreiswirth. 9/14/2022. “Whole Genome Sequencing Assessing Impact of Diabetes Mellitus on Tuberculosis Mutations and Type of Recurrence in India.” Clin Infect Dis.
Evan Koch, Justin Du, Michelle Dressner, Hashmeya Erahim Alwasti, Zahra Al Taif, Fatima Shehab, Afaf Mohamed, Amjad Ghanem, Alireza Haghighi, Shamil Sunyaev, and Maha R Farhat. 8/16/2022. “Demographic and Viral-Genetic Analyses of COVID-19 Severity in Bahrain Identify Local Risk Factors and a Protective Effect of Polymerase Mutations.” medRxiv.
Matthias I Gröschel, Francy J. Pérez-Llanos, Roland Diel, Roger Vargas Jr, Vincent Escuyer, Kimberlee Musser, Lisa Trieu, Jeanne Sullivan Meissner, J Knoor, Don Klinkenberg, Peter Kouw, Susanne Homolka, Wojciech Samek, Barun Mathema, Dick van Soolingen, Stefan Niemann, Shama Ahuja, and Maha R Farhat. 8/5/2022. “Host-pathogen co-adaptation shapes susceptibility to infection with Mycobacterium tuberculosis.” medRxiv.
A.G Green, Chang Ho Yoon, Michael Chen, Yasha Ektefaie, Mack Fina, Luca Freschi, Matthias I. Groschel, I Kohane, Andrew Beam, and Maha R Farhat. 7/2/2022. “A convolutional neural network highlights mutations relevant to antimicrobial resistance in Mycobacterium tuberculosis.” Nature Communications, 13, 3817.
Claudio Koser, et al., including Maha Farhat and Roger Vargas. 4/14/2022. “Updating the approaches to define susceptibility and resistance to antituberculosis agents: implications for diagnosis and treatment.” Eur Respir J.
M Marin, R Vargas, M. Harris, L Epperson, B Jeffrey, D Durbin, M Strong, M Salfinger, Z Iqbal, I Akhundova, S Vashakidze, V Crudu, A Rosenthal, and Maha R Farhat. 4/1/2022. “Benchmarking the empirical accuracy of short-read sequencing across the M. tuberculosis genome.” ISCB, 38, 7, Pp. 1781–1787.
Matthias I. Groschel, Van Den Boom, A Dixit, A Skrahina, P.J. Dodd, G.B Migliori, J.A. Seddon, and Maha R Farhat. 3/14/2022. “Management of childhood MDR-TB in Europe and Central Asia: report of a Regional WHO meeting.” International Journal of Tuberculosis and Lung Disease, 26, 5.
Sachin Atre, J Jagtap, Mujtaba Faqih, Yogita Dumbare, T Sawant, S Ambike, J Bhawalkar, Sandeep Bhawalkar, P Jogewar, J Adkekar, B Hodgar, V Jadhav, N Mokashi, J Golub, V Dixit, and Maha R Farhat. 1/15/2022. “Tuberculosis Pathways to Care and Transmission of Multidrug Resistance in India.” American Journal of Respiratory and Critical Care Medicine, 205, 2.
A Dixit, A Kagal, Yasha Ektefaie, L Freschi, K Karyakarte, R Lokhande, M Groschel, J Tornheim, N Gupte, N Pradhan, M Paradkar, S Deshmukh, D Kadam, M Schito, D Engelthaler, A Gupta, J Golub, V Mave, and Maha R Farhat. 1/5/2022. “Modern lineages of Mycobacterium tuberculosis were recently introduced in western India and demonstrate increased transmissibility.” medRxiv.
Avika Dixit, Luca Freschi, Vargas R., Matthias Groeschel, Sabira Tahseen, SM Masud Alam, SM Mostofa Kamal, Alena Skrahina, Ramon Basilio, Dodge Lim, Nazir Ismail, and Maha R Farhat. 9/27/2021. “Estimation of country-specific tuberculosis antibiograms using genomic data.” medRxiv.
Groschel M, Owens M, Freschi L, Vargas R, Marin M, Phelan J, Iqbal Z, Dixit A, and Farhat MR. 8/30/2021. “GenTB: A user-friendly genome-based predictor of tuberculosis resistance powered by machine learning.” Genome Medicine.



Multidrug-resistant Mycobacterium tuberculosis (Mtb) is a significant global public health threat. Genotypic resistance prediction from Mtb DNA sequences offers an alternative to laboratory-based drug-susceptibility testing. User-friendly and accurate resistance prediction tools are needed to enable public health and clinical practitioners to rapidly diagnose resistance and inform treatment regimens.


We present the Translational Genomics platform for Tuberculosis (GenTB), a free and open web-based application to predict antibiotic resistance from next-generation sequence data. The user can choose between two potential predictors, a Random Forest (RF) classifier and a Wide and Deep Neural Network (WDNN), to predict phenotypic resistance to 13 and 10 anti-tuberculosis drugs, respectively. We benchmark GenTB’s predictive performance along with leading TB resistance prediction tools (Mykrobe and TB-Profiler) using a ground truth dataset of 20,408 isolates with laboratory-based drug susceptibility data. All four tools reliably predicted resistance to first-line tuberculosis drugs but had varying performance for second-line drugs. The mean sensitivities for GenTB-RF and GenTB-WDNN across the nine shared drugs were 77.6% (95% CI 76.6–78.5%) and 75.4% (95% CI 74.5–76.4%), respectively, and marginally higher than the sensitivities of TB-Profiler at 74.4% (95% CI 73.4–75.3%) and Mykrobe at 71.9% (95% CI 70.9–72.9%). The higher sensitivities came at the expense of ≤ 1.5% lower specificity: Mykrobe 97.6% (95% CI 97.5–97.7%), TB-Profiler 96.9% (95% CI 96.7–97.0%), GenTB-WDNN 96.2% (95% CI 96.0–96.4%), and GenTB-RF 96.1% (95% CI 96.0–96.3%). Averaged across the four tools, genotypic resistance sensitivity was 11% and 9% lower for isoniazid and rifampicin, respectively, on isolates sequenced at low depth (< 10× across 95% of the genome), emphasizing the need to quality control input sequence data before prediction. We discuss differences between tools in reporting results to the user, including variants underlying the resistance calls and any novel or indeterminate variants.


GenTB is an easy-to-use online tool to rapidly and accurately predict resistance to anti-tuberculosis drugs. GenTB can be accessed online, and the source code is publicly available.



Vargas R, Freschi L, Spitaleri A, Tahseen S, Barilar I, Niemann S, Miotto P, Cirillo D, Koser C, and Farhat MR. 8/30/2021. “The role of epistasis in amikacin, kanamycin, bedaquiline, and clofazimine resistance in Mycobacterium tuberculosis complex.” Antimicrobial Agents and Chemotherapy.


Antibiotic resistance among bacterial pathogens poses a major global health threat. M. tuberculosis complex (MTBC) is estimated to have the highest resistance rates of any pathogen globally. Given the slow growth rate and the need for a biosafety level 3 laboratory, the only realistic avenue to scale up drug susceptibility testing (DST) for this pathogen is to rely on genotypic techniques. This raises the fundamental question of whether a mutation is a reliable surrogate for phenotypic resistance or whether the presence of a second mutation can completely counteract its effect, resulting in major diagnostic errors (i.e., systematic false resistance results). To date, such epistatic interactions have only been reported for streptomycin, which is now rarely used. By analyzing more than 31,000 MTBC genomes, we demonstrated that the eis C-14T promoter mutation, which is interrogated by several genotypic DST assays endorsed by the World Health Organization, cannot confer resistance to amikacin and kanamycin if it coincides with loss-of-function (LoF) mutations in the coding region of eis. To our knowledge, this represents the first definitive example of antibiotic reversion in MTBC. Moreover, we raise the possibility that mmpR (Rv0678) mutations are not valid markers of resistance to bedaquiline and clofazimine if these coincide with a LoF mutation in the efflux pump encoded by mmpS5 (Rv0677c) and mmpL5 (Rv0676c).

Kaafarani H, Gaitanidis A, Farhat MR, Christensen M, Breen K, Mendoza A, Fagenholz P, and Velmahos G. 7/28/2021. “Association Between NEDD4L Variation and the Genetic Risk of Acute Appendicitis, A Multi-institutional Genome-Wide Association Study.” JAMA Surgery.

Importance  The familial aspect of acute appendicitis (AA) has been proposed, but its hereditary basis remains undetermined.

Objective  To identify genomic variants associated with AA.

Design, Setting, and Participants  This genome-wide association study, conducted from June 21, 2019, to February 4, 2020, used a multi-institutional biobank to retrospectively identify patients with AA across 8 single-nucleotide variation (SNV) genotyping batches. The study also examined differential gene expression in appendiceal tissue samples between patients with AA and controls using the GSE9579 data set in the National Institutes of Health’s Gene Expression Omnibus repository. Statistical analysis was conducted from October 1, 2019, to February 4, 2020.

Main Outcomes and Measures  Single-nucleotide variations with a minor allele frequency of 5% or higher were tested for association with AA using a linear mixed model. The significance threshold was set at P = 5 × 10^−8.

Results  A total of 29 706 patients (15 088 women [50.8%]; mean [SD] age at enrollment, 60.1 [17.0] years) were included, 1743 of whom had a history of AA. The genomic inflation factor for the cohort was 1.003. A previously unknown SNV at chromosome 18q was found to be associated with AA (rs9953918: odds ratio, 0.99; 95% CI, 0.98-1.00; P = 4.48 × 10^−8). This SNV is located in an intron of the NEDD4L gene. The heritability of appendicitis was estimated at 30.1%. Gene expression data from appendiceal tissue donors identified NEDD4L to be among the most differentially expressed genes (14 of 22 216 genes; β [SE] = −2.71 [0.44]; log fold change = −1.69; adjusted P = .04).

Conclusions and Relevance  This study identified SNVs within the NEDD4L gene as being associated with AA. Nedd4l is involved in the ubiquitination of intestinal ion channels and decreased Nedd4l activity may be implicated in the pathogenesis of AA. These findings can improve the understanding of the genetic predisposition to and pathogenesis of AA.

O Alser, A Mokhtari, L Naar, K Langeveld, K Breen, and Maha R Farhat. 5/2021. “Multisystem outcomes and predictors of mortality in critically ill patients with COVID-19: Demographics and disease acuity matter more than comorbidities or treatment modalities.” The Journal of Trauma and Acute Care Surgery, 90, 5, Pp. 880-890.
M Marin, R Vargas, M Harris, B Jeffrey, L.E Epperson, D Durbin, M Strong, M Salfinger, Z Iqbal, I Akhundova, S Vashakidze, V Crudu, A Rosenthal, and MR Farhat. 4/8/2021. “Genomic sequence characteristics and the empiric accuracy of short-read sequencing.” bioRxiv.
Background: Short-read whole genome sequencing (WGS) is a vital tool for clinical applications and basic research. Genetic divergence from the reference genome, repetitive sequences, and sequencing bias reduce the performance of variant calling using short-read alignment, but the loss in recall and specificity has not been adequately characterized. For the clonal pathogen Mycobacterium tuberculosis (Mtb), researchers frequently exclude 10.7% of the genome believed to be repetitive and prone to erroneous variant calls. To benchmark short-read variant calling, we used 36 diverse clinical Mtb isolates dually sequenced with Illumina short-reads and PacBio long-reads. We systematically study the short-read variant calling accuracy and the influence of sequence uniqueness, reference bias, and GC content. Results: Reference-based Illumina variant calling had a recall ≥ 89.0% and precision ≥ 98.5% across parameters evaluated. The best balance between precision and recall was achieved by tuning the mapping quality (MQ) threshold, i.e., the confidence of the read mapping (recall 85.8%, precision 99.1% at MQ ≥ 40). Masking repetitive sequence content is an alternative conservative approach to variant calling that maintains high precision (recall 70.2%, precision 99.6% at MQ ≥ 40). Of the genomic positions typically excluded for Mtb, 68% are accurately called using Illumina WGS, including 52 of the 168 PE/PPE genes (34.5%). We present a refined list of low confidence regions and examine the largest sources of variant calling error. Conclusions: Our improved approach to variant calling has broad implications for the use of WGS in the study of Mtb biology, inference of transmission in public health surveillance systems, and more generally for WGS applications in other organisms.
J El Halabi, NP Palmer, K Fox, JE Golub, I Kohane, and Maha R Farhat. 3/23/2021. “Measuring healthcare delays among privately insured tuberculosis patients in the United States: an observational cohort study.” The Lancet Infectious Diseases.
Groschel M and Farhat MR. 3/16/2021. “A Legacy of disease.” The Lancet, 21, 3.