Combinatorial code analysis for understanding biological regulation

PhD_Dissertation.pdf (1.77 MB)

Abstract:

An important mechanism for achieving regulatory specificity in diverse biological processes is the combinatorial interplay between different regulators, such as among transcription factors (TFs) during transcriptional regulation or between RNA-binding proteins (RBPs) and microRNAs (miRNAs) during the control of transcript degradation. To advance our understanding of combinatorial regulation, we developed a computational pipeline called CCAT (Combinatorial Code Analysis Tool) for predicting genome-wide co-binding between biological regulators.

In the first part of this thesis, we applied CCAT to the D. melanogaster genome to uncover cooperativity among TFs during embryo development. Using publicly available TF binding specificity data and DNaseI chromatin accessibility data, we first predicted genome-wide binding sites for 324 TFs across five stages of D. melanogaster embryo development. We then applied CCAT to each of these developmental stages and identified between 20 and 60 pairs of TFs per stage whose predicted binding sites are significantly co-localized. Several of the co-binding pairs we found correspond to TFs that are known to work together. Further, pairs of binding sites predicted to cooperate were consistently enriched for evolutionary conservation and for occurrence in regions bound in relevant ChIP experiments. Finally, we found that TFs tend to be co-localized with other TFs in a dynamic manner across developmental stages.
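The abstract does not say which statistical test CCAT uses to call a TF pair "significantly co-localized". As a minimal sketch, assuming each TF's predicted sites are summarized as a set of accessible-region identifiers and that co-occurrence is scored with a one-sided hypergeometric test, the idea can be illustrated as follows. The function names `hypergeom_sf` and `colocalization_pvalue` and the region representation are hypothetical and are not taken from the CCAT source code.

```python
from math import comb


def hypergeom_sf(k, n_total, n_a, n_b):
    """P(X >= k) when n_b regions are drawn from n_total, of which n_a carry TF A.

    Assumption: a hypergeometric null model; CCAT's real test may differ.
    """
    upper = min(n_a, n_b)
    favourable = sum(
        comb(n_a, i) * comb(n_total - n_a, n_b - i) for i in range(k, upper + 1)
    )
    return favourable / comb(n_total, n_b)


def colocalization_pvalue(regions_a, regions_b, n_total):
    """Return (shared region count, one-sided p-value) for two TFs.

    regions_a, regions_b: sets of region identifiers with a predicted site.
    n_total: number of accessible regions considered in this stage.
    """
    shared = len(regions_a & regions_b)
    return shared, hypergeom_sf(shared, n_total, len(regions_a), len(regions_b))
```

In a real analysis over hundreds of TFs, the p-values for all pairs within a stage would also need a multiple-testing correction before pairs are reported.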

In the second part of this thesis, we applied CCAT to explore whether RBPs and miRNAs cooperate to promote transcript decay. We concentrated on five highly conserved RBP motifs in human 3'UTRs. A specific group of miRNA recognition sites was enriched within 50 nt of the RBP recognition sites for PUM and UAUUUAU. The presence of both a PUM recognition site and a recognition site for preferentially co-occurring miRNAs was associated with faster decay of the associated transcripts. For PUM and its co-occurring miRNAs, binding of the RBP to its recognition sites was predicted to release nearby miRNA recognition sites from RNA secondary structures. Overall, our CCAT analyses suggest that a specific set of RBPs and miRNAs work together to affect transcript decay, with the release of miRNA recognition sites via RBP binding as one possible model of cooperativity.
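The proximity criterion described above can be sketched in a few lines. This is only an illustration: it assumes each recognition site is represented by a single integer coordinate within one 3'UTR and that "within 50 nt" is inclusive. The function name `count_nearby_mirna_sites` is hypothetical and does not come from the CCAT source code.

```python
def count_nearby_mirna_sites(rbp_sites, mirna_sites, window=50):
    """Count miRNA sites lying within `window` nt of at least one RBP site.

    rbp_sites, mirna_sites: iterables of integer positions in the same 3'UTR
    (assumed coordinate convention; the thesis may measure distance differently).
    """
    rbp_sites = list(rbp_sites)
    return sum(
        1
        for m in mirna_sites
        if any(abs(m - r) <= window for r in rbp_sites)
    )


# Toy example: one RBP site at position 100, four miRNA sites.
nearby = count_nearby_mirna_sites([100], [60, 150, 151, 400])
```

Comparing such counts against a suitable background, for example shuffled site positions, would be one way to assess enrichment. The abstract does not state which background model was used.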

Our pipeline provides a general tool for identifying combinatorial cooperativity in biological regulation. All generated data as well as source code are available at: http://cat.princeton.edu.

Last updated on 12/24/2017