Genes involved in host-pathogen interactions are among the most rapidly evolving genes in the genomes of most organisms, and thus provide a valuable model for understanding how strong selection pressures shape interacting networks of functionally interdependent genes. My work to date has demonstrated not only that proteins involved in the immune response of Drosophila melanogaster are rapidly evolving, but also that both gene duplications and the acquisition of taxonomically-restricted novel genes are important forces driving the evolution of innate immune systems, not just in Drosophila but also in other Dipterans such as Musca domestica and other insects such as Nasonia vitripennis. This work thus suggests that commonalities may exist in the evolutionary response to pathogens across diverse organisms, implying a degree of predictability in evolution.
In the context of complex gene networks such as the immune system, acquisition and loss of genes is a particular challenge, as new genes must integrate into the existing network in order to be appropriately regulated. Taxonomically-restricted genes – that is, evolutionary orphans with identifiable homologs in only one or a few species – are surprisingly common in both recognition and effector components of the immune network, but are hard to annotate functionally or analyze evolutionarily. The advent of RNA-seq technology provides a method for the unbiased characterization of genes regulated by infection in almost any species that can be manipulated in the laboratory, which allows, for the first time, a genome-wide screen to identify all genes regulated by a specific immune challenge, independent of homology-based annotations. I pioneered this approach to immune system annotation, using a combination of RNA-seq and high-throughput quantitative proteomics to characterize the infection-induced transcriptome and proteome of Drosophila virilis, and the infection-induced transcriptomes of Nasonia vitripennis and Musca domestica. These studies revealed a significant enrichment of taxonomically-restricted genes in the infection-inducible transcriptome and suggest that novel genes may be readily recruited to inducible networks, as shown to the right for Nasonia. This figure shows the proportion of genes induced by infection, not regulated by infection, or repressed by infection, grouped by phylogenetic distribution: genes with no homologs outside of wasps (Wasp class), followed by genes with homologs in increasingly divergent lineages.
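One simple way to test for the kind of enrichment described above is a one-sided Fisher's exact test on a 2x2 table of gene counts (taxonomically-restricted or conserved, induced or not induced). The sketch below is illustrative only and is not necessarily the test used in the studies themselves; the function names and the counts in the usage lines are hypothetical placeholders, not results from any of the datasets.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for enrichment in a 2x2 table.

    a: restricted genes that are induced
    b: restricted genes that are not induced
    c: conserved genes that are induced
    d: conserved genes that are not induced

    Returns P(X >= a) under the hypergeometric null, i.e. the chance of
    seeing at least this many restricted genes among the induced set if
    induction were independent of taxonomic distribution.
    """
    n_restricted = a + b
    n_induced = a + c
    n_total = a + b + c + d
    denom = comb(n_total, n_induced)
    upper = min(n_restricted, n_induced)
    p = 0.0
    for x in range(a, upper + 1):
        ways = comb(n_restricted, x) * comb(n_total - n_restricted, n_induced - x)
        p += ways / denom
    return p


def odds_ratio(a, b, c, d):
    """Sample odds ratio; values above 1 indicate enrichment."""
    return (a * d) / (b * c)


# Hypothetical counts, for illustration only (not real data).
p_value = fisher_exact_greater(3, 1, 1, 3)
effect = odds_ratio(3, 1, 1, 3)
print(f"odds ratio = {effect:.1f}, one-sided p = {p_value:.4f}")
```

In practice the induced and not-induced classes would come from a differential expression analysis, and the restricted and conserved classes from a homology search across the phylogeny.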
My current work involves a large scale screen to sample the immune-induced transcriptome in 12 species across the Drosophila phylogeny, in collaboration with Andy Clark at Cornell. These data will allow extensive characterization of the prevalence of taxonomically-restricted infection-regulated genes across Drosophilids, and provide sufficient phylogenetic breadth to allow inference about the evolutionary forces shaping their appearance and persistence. This study is one of the first large-scale transcriptomic studies explicitly designed in a phylogenetic context, and will be fertile ground for developing phylogenetically-aware methods for sequencing-based annotation and interpretation of differential expression.
Genomics of convergent evolution
Convergent phenotypes, in which organisms that do not share an immediate common ancestor evolve towards the same phenotypic state, are often taken to be the ultimate signature of adaptive evolution, as phenotypic convergence is assumed to be exceedingly rare in the absence of selection. A crucial unresolved question, however, is the extent to which convergent phenotypes arise from parallel genetic mechanisms at the level of nucleotides, functional elements, or coordinated functional or developmental systems.
A particularly fascinating example of this is found in the large flightless birds, the ratites, which include ostriches, emu, cassowaries, rheas, and kiwis. All these birds are flightless and some have severely reduced wings. Surprisingly, however, the volant (flighted) tinamous are phylogenetically nested within the flightless ratites, implying that flight was lost multiple times independently during the evolution of the ratites (or that the ancestor of tinamous regained flight from a flightless ancestor).
My current work involves sequencing the genomes of most of the extant ratites, as part of a large collaborative NSF-funded project led by Scott Edwards and Michele Clamp at Harvard and Julia Clarke at the University of Texas at Austin. With high-quality assembled genomes in hand, I will develop computational methods to detect genomic regions convergently gained or lost along the lineages leading to flightless ratites, but not along the lineage leading to the tinamous.
A major challenge in genomics is understanding, at a global level, the differential impacts of selection across the genome and ultimately how much genomic change both within and between species is driven by positive selection of beneficial, adaptive mutations.
One approach to assessing the degree to which selection has impacted a region of the genome is to look at patterns of linked neutral variation. Because nearby sites in the genome do not evolve independently, if the action of natural selection influences the trajectory of a non-neutral mutation in the genome, it will also influence the trajectory of linked neutral mutations, leading to detectable signatures in diversity data. Since the degree to which neutral diversity is impacted by selection depends on the local recombination rate, the action of natural selection will lead to a correlation between neutral diversity and recombination rate, as first reported for Drosophila melanogaster by Dave Begun and Chip Aquadro in 1992.
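The correlation described above can be checked in its simplest form with a rank correlation between per-window recombination rate and neutral diversity. The sketch below computes Spearman's rho using only the standard library; the window values are made-up placeholders rather than measurements from any species, and this is only the basic diagnostic, not a model for inferring the strength of selection.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical genomic windows, for illustration only (not real data).
recombination_cm_per_mb = [0.1, 0.5, 1.2, 2.0, 3.5]
neutral_diversity = [0.001, 0.002, 0.004, 0.0035, 0.006]

rho = spearman(recombination_cm_per_mb, neutral_diversity)
print(f"Spearman rho = {rho:.2f}")
```

A positive rho is the pattern expected under linked selection, since neutral diversity is reduced most strongly where recombination is low. A rank-based statistic is used here because the relationship need not be linear.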
In my work, I have developed a novel framework to use the relationship between recombination rate and neutral diversity to infer the strength of selection acting on a genome, and used it to assess the impact of natural selection on the genomes of 38 multicellular eukaryotes. This work shows that natural selection has a substantially greater impact on neutral diversity in species with large population sizes; for many abundant species, natural selection is pervasive, meaning that the trajectory of neutral alleles is largely determined by the behavior of linked, selected alleles.