Mycobacterium tuberculosis is a clonal pathogen proposed to have co-evolved with its human host for millennia, yet our understanding of its genomic diversity and biogeography remains incomplete. Here we use a combination of phylogenetics and dimensionality reduction to reevaluate the population structure of M. tuberculosis, providing the first in-depth analysis of the ancient East African Indian Lineage 1 and the modern Central Asian Lineage 3 and expanding our understanding of Lineages 2 and 4. We assess sub-lineages using genomic sequences from 4,939 pan-susceptible strains and find 30 new genetically distinct clades that we validate in a dataset of 4,645 independent isolates. We characterize sub-lineage geographic distributions and demonstrate a consistent geographically restricted and unrestricted pattern for 20 groups, including three groups of Lineage 1. We assess the transmissibility of the four major lineages by examining the distribution of terminal branch lengths across the M. tuberculosis phylogeny and identify evidence supporting higher transmissibility in Lineages 2 and 4 than 3 and 1 on a global scale. We define a robust expanded barcode of 95 single nucleotide substitutions (SNS) that allows for the rapid identification of 69 Mtb sub-lineages and 26 additional internal groups. Our results paint a higher resolution picture of the Mtb phylogeny and biogeography.
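As a rough illustration of how a SNS barcode of this kind can assign a sub-lineage, the sketch below looks up marker alleles in a sample's variant calls. The positions, alleles, and labels in `BARCODE` are hypothetical placeholders, not the published 95-SNS barcode, and the function name `call_sublineages` is likewise an assumption made for this example.

```python
# Hypothetical barcode: genome position -> (derived allele, sub-lineage label).
# These entries are placeholders for illustration, not the published barcode.
BARCODE = {
    1000: ("T", "L1.1"),
    2000: ("G", "L2.2"),
    3000: ("A", "L4.3"),
}


def call_sublineages(sample_alleles):
    """Return the sorted labels whose marker allele is present in the sample.

    sample_alleles maps genome position -> observed base at that position.
    Positions missing from the sample are treated as not matching.
    """
    calls = []
    for pos, (derived, label) in BARCODE.items():
        if sample_alleles.get(pos) == derived:
            calls.append(label)
    return sorted(calls)


# One isolate carrying only the marker at position 2000.
sample = {1000: "C", 2000: "G", 3000: "G"}
print(call_sublineages(sample))
```

A hierarchical barcode would normally return a nested path of labels (lineage, sub-lineage, internal group); the flat lookup here only shows the matching step.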
Coronavirus disease 2019 (COVID-19) appears to have significant extrapulmonary complications affecting multiple organ systems.1-3 Critically ill patients with COVID-19 often develop gastrointestinal complications during their hospital stay, including bowel ischemia, transaminitis, gastrointestinal bleeding, pancreatitis, Ogilvie syndrome, and severe ileus.3 Whether the high incidence of gastrointestinal complications is a manifestation of critical illness in general or is specific to COVID-19 remains unclear. We compared the incidence of gastrointestinal complications of critically ill patients with COVID-19–induced acute respiratory distress syndrome (ARDS) vs comparably ill patients with non–COVID-19 ARDS using propensity score analysis.
Lemieux J, Siddle KJ, Shaw BM, Loreth C, Schaffner S, Gladden-Young A, Adams G, Fink T, Tomkins-Tinch CH, Krasilnikova LA, Deruff KC, Rudy M, Bauer MR, Lagerborg KA, Normandin E, Chapman SB, Reilly SK, Anahtar MN, Lin AE, Carter A, Myhrvold C, Kemball M, Chaluvadi SR, Cusick C, Flowers K, Neumann A, Cerrato F, Farhat MR, Slater D, Harris JB, Branda J, Hooper D, Gaeta JM, Bagett TP, O'Connel J, Gnirke A, Lieberman TB, Philippakis A, Burns M, Brown C, Luban J, Ryan ET, Turbett SE, LaRocque RC, Hanage WP, Gallagher G, Madoff LC, Smole S, Pierce VM, Rosenburg ES, Sabeti S, Park DJ, and MacInnis BL. 8/25/2020. “Phylogenetic analysis of SARS-CoV-2 in the Boston area highlights the role of recurrent importation and superspreading events”.
SARS-CoV-2 has caused a severe, ongoing outbreak of COVID-19 in Massachusetts with 111,070 confirmed cases and 8,433 deaths as of August 1, 2020. To investigate the introduction, spread, and epidemiology of COVID-19 in the Boston area, we sequenced and analyzed 772 complete SARS-CoV-2 genomes from the region, including nearly all confirmed cases within the first week of the epidemic and hundreds of cases from major outbreaks at a conference, a nursing facility, and among homeless shelter guests and staff. The data reveal over 80 introductions into the Boston area, predominantly from elsewhere in the United States and Europe. We studied two superspreading events covered by the data, events that led to very different outcomes because of the timing and populations involved. One produced rapid spread in a vulnerable population but little onward transmission, while the other was a major contributor to sustained community transmission, including outbreaks in homeless populations, and was exported to several other domestic and international sites. The same two events differed significantly in the number of new mutations seen, raising the possibility that SARS-CoV-2 superspreading might encompass disparate transmission dynamics. Our results highlight the failure of measures to prevent importation into MA early in the outbreak, underscore the role of superspreading in amplifying an outbreak in a major urban area, and lay a foundation for contact tracing informed by genetic data.
Improved genetic understanding of Mycobacterium tuberculosis (MTB) resistance to novel and repurposed anti-tubercular agents can aid the development of rapid molecular diagnostics.
Adhering to PRISMA guidelines, in March 2018, we performed a systematic review of studies implicating mutations in resistance through sequencing and phenotyping before and/or after spontaneous resistance evolution, as well as allelic exchange experiments. We focused on the novel drugs bedaquiline, delamanid, pretomanid and the repurposed drugs clofazimine and linezolid. A database of 1373 diverse control MTB whole genomes, isolated from patients not exposed to these drugs, was used to further assess genotype–phenotype associations.
Of 2112 papers, 54 met the inclusion criteria. These studies characterized 277 mutations in the genes atpE, mmpR, pepQ, Rv1979c, fgd1, fbiABC and ddn and their association with resistance to one or more of the five drugs. The most frequent mutations for bedaquiline, clofazimine, linezolid, delamanid and pretomanid resistance were atpE A63P, mmpR frameshifts at nucleotides 192–198, rplC C154R, ddn W88* and ddn S11*, respectively. Frameshifts in the mmpR homopolymer region nucleotides 192–198 were identified in 52/1373 (4%) of the control isolates without prior exposure to bedaquiline or clofazimine. Of isolates resistant to one or more of the five drugs, 59/519 (11%) lacked a mutation explaining phenotypic resistance.
This systematic review supports the use of molecular methods for linezolid resistance detection. Resistance mechanisms involving non-essential genes show a diversity of mutations that will challenge molecular diagnosis of bedaquiline and nitroimidazole resistance. Combined phenotypic and genotypic surveillance is needed for these drugs in the short term.
Recent studies portend a rising global spread and adaptation of human- or healthcare-associated pathogens. Here, we analysed an international collection of the emerging, multidrug-resistant, opportunistic pathogen Stenotrophomonas maltophilia from 22 countries to infer population structure and clonality at a global level. We show that the S. maltophilia complex is divided into 23 monophyletic lineages, most of which harboured strains of all degrees of human virulence. Lineage Sm6 comprised the highest rate of human-associated strains, linked to key virulence and resistance genes. Transmission analysis identified potential outbreak events of genetically closely related strains isolated within days or weeks in the same hospitals.
One Sentence Summary The S. maltophilia complex comprises genetically diverse, globally distributed lineages with evidence for intra-hospital transmission.
The dN/dS ratio provides evidence of adaptation or functional constraint in protein-coding genes by quantifying the relative excess or deficit of amino acid-replacing versus silent nucleotide variation. Inexpensive sequencing promises a better understanding of parameters such as dN/dS, but analysing very large datasets poses a major statistical challenge. Here I introduce genomegaMap for estimating within-species genome-wide variation in dN/dS, and I apply it to 3,979 genes across 10,209 tuberculosis genomes to characterize the selection pressures shaping this global pathogen. GenomegaMap is a phylogeny-free method that addresses two major problems with existing approaches: (i) it is fast no matter how large the sample size and (ii) it is robust to recombination, which causes phylogenetic methods to report artefactual signals of adaptation. GenomegaMap uses population genetics theory to approximate the distribution of allele frequencies under general, parent-dependent mutation models. Coalescent simulations show that substitution parameters are well-estimated even when genomegaMap’s simplifying assumption of independence among sites is violated. I demonstrate the ability of genomegaMap to detect genuine signatures of selection at antimicrobial resistance-conferring substitutions in M. tuberculosis and describe a novel signature of selection in the cold-shock DEAD-box protein A gene deaD/csdA. The genomegaMap approach helps accelerate the exploitation of big data for gaining new insights into evolution within species.
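For readers unfamiliar with the quantity itself, the following minimal sketch computes dN/dS by simple counting: nonsynonymous changes per nonsynonymous site divided by synonymous changes per synonymous site. This is the textbook definition only, not genomegaMap's population-genetic estimator, and it applies no multiple-hit correction. The function name and the counts are assumptions chosen for illustration.

```python
def dn_ds(nonsyn_changes, nonsyn_sites, syn_changes, syn_sites):
    """Uncorrected dN/dS from counts of changes and available sites.

    dN = nonsynonymous changes per nonsynonymous site
    dS = synonymous changes per synonymous site
    A ratio below 1 suggests constraint; above 1 suggests adaptation.
    """
    if syn_changes == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    d_n = nonsyn_changes / nonsyn_sites
    d_s = syn_changes / syn_sites
    return d_n / d_s


# Hypothetical gene: 30 nonsynonymous changes over 750 sites,
# 20 synonymous changes over 250 sites.
omega = dn_ds(30, 750, 20, 250)
print(f"dN/dS = {omega:.2f}")
```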
The genomic landscape of gallbladder disease remains poorly understood. We sought to examine the association between genetic variants and the development of cholecystitis.
The Biobank of a large multi-institutional healthcare system was utilized. All patients with cholecystitis were identified using ICD-10 codes and genotyped across 6 batches. To control for population stratification, data was restricted to that from individuals of European genomic ancestry using a multidimensional scaling (MDS) approach. The association between single nucleotide polymorphisms (SNPs) and cholecystitis was evaluated with a mixed linear model-based analysis, controlling for age, sex and obesity. The threshold for significance was set at 5 × 10⁻⁸.
Out of 24,635 patients (mean age 60.1 ± 16.7 years, 13,022 [52.9%] females), 900 had cholecystitis (mean age 65.4 ± 14.3 years, 496 [55.1%] females). After meta-analysis, 3 SNPs on chromosome 5p15 exceeded the threshold for significance (p < 5 × 10⁻⁸). The phenotypic variance of cholecystitis explained by genetics and controlling for gender and obesity was estimated to be 17.9%.
Using a multi-institutional genomic Biobank, we report that a region on chromosome 5p15 is associated with the development of cholecystitis and can be used to identify patients at risk.
Two billion people are infected with Mycobacterium tuberculosis, leading to 10 million new cases of active tuberculosis and 1.5 million deaths annually. Universal access to drug susceptibility testing (DST) has become a World Health Organization priority. We previously developed a software tool, Mykrobe predictor, which provided offline species identification and drug resistance predictions for M. tuberculosis from whole genome sequencing (WGS) data. Performance was insufficient to support the use of WGS as an alternative to conventional phenotype-based DST, due to mutation catalogue limitations.
Here we present a new tool, Mykrobe, which provides the same functionality based on a new software implementation. Improvements include i) an updated mutation catalogue giving greater sensitivity to detect pyrazinamide resistance, ii) support for user-defined resistance catalogues, iii) improved identification of non-tuberculous mycobacterial species, and iv) an updated statistical model for Oxford Nanopore Technologies sequencing data. Mykrobe is released under MIT license at https://github.com/mykrobe-tools/mykrobe. We incorporate mutation catalogues from the CRyPTIC consortium et al. (2018) and from Walker et al. (2015), and make improvements based on performance on an initial set of 3206 and an independent set of 5845 M. tuberculosis Illumina sequences. To give estimates of error rates, we use a prospectively collected dataset of 4362 M. tuberculosis isolates. Using culture based DST as the reference, we estimate Mykrobe to be 100%, 95%, 82%, 99% sensitive and 99%, 100%, 99%, 99% specific for rifampicin, isoniazid, pyrazinamide and ethambutol resistance prediction respectively. We benchmark against four other tools on 10207 (=5845+4362) samples, and also show that Mykrobe gives concordant results with nanopore data.
We measure the ability of Mykrobe-based DST to guide personalized therapeutic regimen design in the context of complex drug susceptibility profiles, showing 94% concordance of implied regimen with that driven by phenotypic DST, higher than all other benchmarked tools.
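Sensitivity and specificity figures like those reported above are derived from a comparison of predicted against phenotypic DST results. The sketch below shows that arithmetic for a single drug; the counts are hypothetical and are not taken from the Mykrobe evaluation, and the function name is an assumption for this example.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity and specificity with phenotypic DST as the reference.

    tp: resistant isolates predicted resistant
    fn: resistant isolates predicted susceptible
    tn: susceptible isolates predicted susceptible
    fp: susceptible isolates predicted resistant
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for one drug.
sens, spec = sensitivity_specificity(tp=95, fn=5, tn=198, fp=2)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
```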
Background Mycobacterium tuberculosis (MTB) whole genome sequencing data can provide insights into temporal and geographic trends in resistance acquisition and inform public health interventions.
Methods We curated a set of clinical MTB isolates with high quality sequencing and culture-based drug susceptibility data spanning four lineages and more than 20 countries. We constructed geographic and lineage specific MTB phylogenies and used Bayesian molecular dating to infer the most-recent-common-susceptible-ancestor age for 4,869 instances of resistance to 10 drugs.
Findings Of 8,550 isolates curated, 6,099 from 15 countries met criteria for molecular dating. The number of independent resistance acquisition events was lower than the number of resistant isolates across all countries, suggesting ongoing transmission of drug resistance. Ancestral age distributions supported the presence of old resistance, ≥20 years prior, in the majority of countries. A consistent order of resistance acquisition was observed globally starting with resistance to isoniazid, but resistance ancestral age varied by country. We found a direct correlation between country wealth and resistance age (R² = 0.47, P-value = 0.014). Amplification of fluoroquinolone and second-line injectable resistance among multidrug-resistant isolates is estimated to have occurred very recently (median ancestral age 4.7 years, IQR 1.9-9.8, prior to sample collection). We found the sensitivity of commercial molecular diagnostics for second-line resistance to vary significantly by country (P-value < 0.0003).
Interpretation Our results highlight that both resistance transmission and amplification are contributing to disease burden globally but are variable by country. The observation that wealthier nations are more likely to have old resistance suggests that programmatic improvements can reduce resistance amplification, but that fit resistant strains can circulate for decades subsequently.
Funding This work was supported by the NIH BD2K grant K01 ES026835, a Harvard Institute of Global Health Burke Fellowship (MF), Boston Children’s Hospital OFD/BTREC/CTREC Faculty Career Development Fellowship and Bushrod H. Campbell and Adah F. Hall Charity Fund/Charles A. King Trust Postdoctoral Fellowship (AD).
Evidence before this study Acquisition and spread of drug-resistance by Mycobacterium tuberculosis (MTB) varies across countries. Local factors driving evolution of drug resistance in MTB are not well studied.
Added value of this study We applied molecular dating to 6,099 global MTB patient isolates and found the order of resistance acquisition to be consistent across the countries examined, i.e. acquisition of isoniazid resistance first followed by rifampicin and streptomycin followed by resistance to other drugs. In all countries with data available there was evidence for transmission of resistant strains from patient-to-patient and in the majority for extended periods of time (>20 years).
Countries with lower gross wealth indices were found to have more recent resistance acquisition to the drug rifampicin. Based on the resistance patterns identified in our study we estimate that commercial diagnostic tests vary considerably in sensitivity for second-line resistance diagnosis by country.
Implications of all available evidence The longevity of resistant MTB in many parts of the world emphasizes its fitness for transmission and its continued threat to public health. The association between country wealth and recent resistance acquisition emphasizes the need for continued investment in TB care delivery and surveillance programs. Geographically relevant diagnostics that take into account a country’s unique distribution of resistance are necessary.
Resistance co-occurrence within first-line anti-tuberculosis (TB) drugs is a common phenomenon. Existing methods based on genetic data analysis of Mycobacterium tuberculosis (MTB) have been able to predict resistance of MTB to individual drugs, but have not considered the resistance co-occurrence and cannot capture latent structure of genomic data that corresponds to lineages.
We used a large cohort of TB patients from 16 countries across six continents, for which whole-genome sequences of each isolate and the associated phenotypes for anti-TB drugs were obtained using drug susceptibility testing recommended by the World Health Organization. We then proposed an end-to-end multi-task model with a deep denoising auto-encoder (DeepAMR) for multiple drug classification and developed DeepAMR_cluster, a clustering variant based on DeepAMR, for learning clusters in the latent space of the data. The results showed that DeepAMR outperformed the baseline model and four machine learning models, with mean AUROC from 94.4% to 98.7% for predicting resistance to four first-line drugs [i.e. isoniazid (INH), ethambutol (EMB), rifampicin (RIF), pyrazinamide (PZA)], multi-drug resistant TB (MDR-TB) and pan-susceptible TB (PANS-TB: MTB that is susceptible to all four first-line anti-TB drugs). For INH, EMB, PZA and MDR-TB, DeepAMR achieved its best mean sensitivity of 94.3%, 91.5%, 87.3% and 96.3%, respectively. For RIF and PANS-TB, it achieved 94.2% and 92.2% sensitivity, which were lower than the baseline model by 0.7% and 1.9%, respectively. t-SNE visualization shows that DeepAMR_cluster captures lineage-related clusters in the latent space.
Bacteria and other microbes play a crucial role in human health and disease. Medicine and clinical microbiology have traditionally attempted to identify the etiological agents that cause disease, and how to eliminate them. Yet this traditional paradigm is becoming inadequate for dealing with a changing disease landscape. Major challenges to human health are noncommunicable chronic diseases, often driven by altered immunity and inflammation, and persistent communicable infections whose agents harbor antibiotic resistance. It is increasingly recognized that microbe-microbe interactions, as well as human-microbe interactions, are important. Here, we review the "Evolutionary Medicine" framework to study how microbial communities influence human health. This approach aims to predict and manipulate microbial influences on human health by integrating ecology, evolutionary biology, microbiology, bioinformatics and clinical expertise. We focus on the potential promise of evolutionary medicine to address three key challenges: 1) detecting microbial transmission; 2) predicting antimicrobial resistance; 3) understanding microbe-microbe and human-microbe interactions in health and disease, in the context of the microbiome.
The diagnosis of multidrug resistant and extensively drug resistant tuberculosis is a global health priority. Whole genome sequencing of clinical Mycobacterium tuberculosis isolates promises to circumvent the long wait times and limited scope of conventional phenotypic drug susceptibility testing, but gaps remain for predicting phenotype accurately from genotypic data. Using targeted or whole genome sequencing and conventional drug resistance phenotyping data from 3,601 Mycobacterium tuberculosis strains, 1,228 of which were multidrug resistant, we implemented the first multitask deep learning framework to predict phenotypic drug resistance to 10 anti-tubercular drugs. The proposed wide and deep neural network (WDNN) achieved improved predictive performance compared to regularized logistic regression and random forest: the average sensitivities and specificities, respectively, were 92.7% and 92.7% for first-line drugs and 82.0% and 92.8% for second-line drugs during cross-validation. On an independent validation set, the multitask WDNN showed significant performance gains over baseline models, with average sensitivities and specificities, respectively, of 84.5% and 93.6% for first-line drugs and 64.0% and 95.7% for second-line drugs. In addition to being able to learn from samples that have only been partially phenotyped, our proposed multitask architecture shares information across different anti-tubercular drugs and genes to provide a more accurate phenotypic prediction. We use t-distributed Stochastic Neighbor Embedding (t-SNE) visualization and feature importance analyses to examine inter-drug similarities. Deep learning has a clear role in improving drug resistance predictive performance over traditional methods and holds promise in bringing sequencing technologies closer to the bedside.
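One common way for a multitask model to learn from partially phenotyped samples is to mask missing labels out of the loss, so that each isolate contributes only for the drugs it was tested against. The abstract does not describe how the WDNN handles this, so the sketch below is an assumption about the general technique rather than the authors' implementation; the function name `masked_bce` and the example values are hypothetical.

```python
import math


def masked_bce(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over labeled (isolate, drug) entries only.

    y_true: list of rows; each entry is 1 (resistant), 0 (susceptible),
            or None (drug not phenotyped for that isolate).
    y_prob: predicted resistance probabilities with the same shape.
    """
    total, n_labeled = 0.0, 0
    for labels, probs in zip(y_true, y_prob):
        for y, p in zip(labels, probs):
            if y is None:
                continue  # a missing phenotype contributes nothing
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite
            total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
            n_labeled += 1
    return total / n_labeled if n_labeled else 0.0


# Two isolates, two drugs; the first isolate lacks a result for drug 2.
labels = [[1, None], [0, 1]]
preds = [[0.9, 0.2], [0.1, 0.8]]
print(round(masked_bce(labels, preds), 4))
```

Because the masked entry is skipped, changing its predicted probability leaves the loss unchanged, which is what lets incompletely tested isolates stay in the training set.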
Background: Whole genome sequencing (WGS) can elucidate Mycobacterium tuberculosis (Mtb) transmission patterns but more data is needed to guide its use in high-burden settings. In a household-based transmissibility study of 4,000 TB patients in Lima, Peru, we identified a large MIRU-VNTR Mtb cluster with a range of resistance phenotypes and studied host and bacterial factors contributing to its spread.
Methods: WGS was performed on 61 of 148 isolates in the cluster. We compared transmission link inference using epidemiological or genomic data with and without the inclusion of controversial variants, and estimated the dates of emergence of the cluster and antimicrobial drug resistance acquisition events by generating a time-calibrated phylogeny. We validated our findings in genomic data from an outbreak of 325 TB cases in London. Using a larger set of 12,032 public Mtb genomes, we determined bacterial factors characterizing this cluster and under positive selection in other Mtb lineages.
Findings: Four isolates were distantly related and the remaining 57 isolates diverged ca. 1968 (95% HPD: 1945-1985). Isoniazid resistance arose once, whereas rifampicin resistance emerged subsequently at least three times. Amplification of other drug resistance occurred as recently as within the last year of sampling. High quality PE/PPE variants and indels added information for transmission inference. We identified five cluster-defining SNPs, including esxV S23L, that potentially contribute to transmissibility.
Interpretation: Clusters defined by MIRU-VNTR typing could be circulating for decades in a high-burden setting. WGS allows for an improved understanding of transmission, as well as bacterial resistance and fitness factors.
Whole genome sequencing (WGS) can elucidate Mycobacterium tuberculosis (Mtb) transmission patterns but more data is needed to guide its use in high-burden settings. In a household-based TB transmissibility study in Peru, we identified a large MIRU-VNTR Mtb cluster (148 isolates) with a range of resistance phenotypes, and studied host and bacterial factors contributing to its spread. WGS was performed on 61 of the 148 isolates. We compared transmission link inference using epidemiological or genomic data and estimated the dates of emergence of the cluster and antimicrobial drug resistance (DR) acquisition events by generating a time-calibrated phylogeny. Using a set of 12,032 public Mtb genomes, we determined bacterial factors characterizing this cluster and under positive selection in other Mtb lineages. Four of the 61 isolates were distantly related and the remaining 57 isolates diverged ca. 1968 (95% HPD: 1945-1985). Isoniazid resistance arose once and rifampin resistance emerged subsequently at least three times. Emergence of other DR types occurred as recently as within the last year of sampling. We identified five cluster-defining SNPs potentially contributing to transmissibility. In conclusion, clusters (as defined by MIRU-VNTR typing) may be circulating for decades in a high-burden setting. WGS allows for an enhanced understanding of transmission, drug resistance, and bacterial fitness factors.
Drug-resistant TB remains a public health challenge. Rifamycins are among the most potent anti-TB drugs. They are known to target the RpoB subunit of RNA polymerase; however, our understanding of how rifamycin resistance is genetically coded remains incomplete. Here we investigated rpoB genetic diversity and cross-resistance between the two rifamycin drugs rifampicin and rifabutin.
We performed WGS of 1003 Mycobacterium tuberculosis clinical isolates and determined MICs of both rifamycin agents on 7H10 agar using the indirect proportion method. We generated rpoB mutants in a laboratory strain and measured their antibiotic susceptibility using the alamarBlue reduction assay.
Of the 1003 isolates, 766 were rifampicin resistant and 210 (27%) of these were rifabutin susceptible; 102/210 isolates had the rpoB mutation D435V (Escherichia coli D516V). Isolates with discordant resistance were 17.2 times more likely to harbour a D435V mutation than those resistant to both agents (OR 17.2, 95% CI 10.5-27.9, P value < 10⁻⁴⁰). Compared with WT, the D435V in vitro mutant had an increased IC50 of both rifamycins; however, in both cases to a lesser degree than the S450L (E. coli S531L) mutation.
The observation that the rpoB D435V mutation produces an increase in the IC50 of both drugs contrasts with findings from previous smaller studies that suggested that isolates with the D435V mutation remain rifabutin susceptible despite being rifampicin resistant. Our finding thus suggests that the recommended critical testing concentration for rifabutin should be revised.
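An odds ratio with a confidence interval, as quoted for D435V above, can be computed from a 2x2 table of mutation status against resistance pattern. The sketch below uses the Wald (Woolf) log-odds interval, which is one standard choice; the abstract does not state which method the authors used, and the counts in the example are hypothetical rather than the study's data.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald (Woolf) confidence interval for a 2x2 table.

    a: group 1 with the mutation      b: group 1 without
    c: group 2 with the mutation      d: group 2 without
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical table: 40 of 100 discordant isolates carry the mutation,
# compared with 10 of 100 isolates resistant to both agents.
or_, lo, hi = odds_ratio_ci(a=40, b=60, c=10, d=90)
print(f"OR {or_:.1f} (95% CI {lo:.1f}-{hi:.1f})")
```

The Wald interval needs all four cells to be non-zero; with sparse tables an exact method is the usual alternative.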
Genome analysis should allow the discovery of interdependent loci that together cause antibiotic resistance. In practice, however, the vast number of possible epistatic interactions erodes statistical power. Here, we extend an approach that has been successfully used to identify epistatic residues in proteins to infer genomic loci that are strongly coupled. This approach reduces the number of tests required for an epistatic genome-wide association study of antibiotic resistance and increases the likelihood of identifying causal epistasis. We discovered 38 loci and 240 epistatic pairs that influence the minimum inhibitory concentrations of 5 different antibiotics in 1,102 isolates of Neisseria gonorrhoeae that were confirmed in a second dataset of 495 isolates. Many known resistance-affecting loci were recovered; however, the majority of associations occurred in unreported genes, such as murE. About half of the discovered epistasis involved at least one locus previously associated with antibiotic resistance, including interactions between gyrA and parC. Still, many combinations involved unreported loci and genes. While most variation in minimum inhibitory concentrations could be explained by identified loci, epistasis substantially increased explained phenotypic variance. Our work provides a systematic identification of epistasis affecting antibiotic resistance in N. gonorrhoeae and a generalizable approach for epistatic genome-wide association studies.
Meehan CJ, Serrano GG, Kohl T, Verboven L, Dippenaar A, Ezewudo M, Farhat MR, Guthrie J, Laukens K, Miotto P, Ofori-Anyinam B, Dreyer V, Supply P, Suresh A, Utpatel C, van Soolingen D, Zhou Y, Ashton P, Brites D, Cabibbe A, de Jong B, De Vos M, Fabrizio M, Gagneux S, Gao Q, Heupink T, Liu Q, Louiseau CM, Rigouts L, Rodwell T, Tagliani E, Walker T, Warren R, Zhao Y, Zignol M, Schito M, Gardy JL, Cirillo D, Niemann S, Comas I, and Van Rie A. 2019. “Whole genome sequencing of Mycobacterium tuberculosis: current standards and open issues.” Nature Reviews Microbiology.