3D genome organization: principles and mechanisms

One of the major goals of our lab is to understand the principles and mechanisms of genome organization. The mammalian genome is spatially organized in the nucleus to enable cell type-specific gene expression. Investigating how chromatin architecture determines this specificity remains a major challenge because of the complexity of the genome: many kinds of chromatin regulators, different types of functional DNA elements, and sophisticated interactions between the two. We are developing high-throughput approaches – both experimental and computational – to discover functional DNA elements and novel genome regulators. Combining these with traditional genomic, molecular, and biochemical approaches, we aim to understand the fundamental principles and molecular mechanisms of genome organization across the tree of life. We further investigate how the regulation of genome organization goes awry in disease, and how local chromatin organization can be manipulated, restored, and designed de novo to develop new therapeutics.


High-throughput in silico investigation of 3D genome organization

Investigating how 3D genome organization determines cell type-specific gene expression remains a challenge. The recent development of high-throughput technologies for measuring chromatin interaction frequencies, such as Hi-C and Micro-C, has begun to reveal the architecture of the genome. However, these methods are costly and have substantial technical limitations, which restricts their widespread application, particularly for high-throughput genetic perturbation studies. To address this challenge, we recently developed C.Origami, a deep neural network model that performs de novo prediction of cell type-specific chromatin architecture. C.Origami predicts chromatin architecture accurately enough to enable in silico experiments that examine the impact of genetic perturbations on chromatin interactions in cancer genomes and beyond. In addition, we propose an in silico genetic screening (ISGS) approach that enables high-throughput identification of impactful genomic elements and of cell type-specific trans-acting regulators of 3D chromatin architecture. We are applying the C.Origami model to systematically discover novel chromatin regulatory mechanisms in both normal and disease-related biological systems.
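The logic of an in silico genetic screen can be illustrated with a minimal Python sketch. This is not C.Origami and does not use its interface: `toy_predict_contacts` is a hypothetical stand-in for a trained predictor, the perturbation is simplified to masking a window of input features, and every function name, parameter, and the toy signal are assumptions made for illustration only.

```python
import math

def toy_predict_contacts(signal, decay=5.0):
    """Hypothetical stand-in for a trained model: maps a 1D feature track
    to a symmetric bin-by-bin contact matrix (distance decay plus a boost
    between bins that both carry signal)."""
    n = len(signal)
    return [
        [math.exp(-abs(i - j) / decay) * (1.0 + signal[i] * signal[j])
         for j in range(n)]
        for i in range(n)
    ]

def mask_window(signal, start, width):
    """Simulate a perturbation by zeroing the features in [start, start + width)."""
    masked = list(signal)
    for k in range(start, min(start + width, len(masked))):
        masked[k] = 0.0
    return masked

def impact_score(reference, perturbed):
    """Mean absolute difference between two contact matrices of equal size."""
    n = len(reference)
    total = sum(abs(reference[i][j] - perturbed[i][j])
                for i in range(n) for j in range(n))
    return total / (n * n)

def in_silico_screen(signal, width=2, predict=toy_predict_contacts):
    """Score every window of `width` bins by how much masking it changes
    the predicted contact matrix."""
    reference = predict(signal)
    scores = {}
    for start in range(len(signal) - width + 1):
        perturbed = predict(mask_window(signal, start, width))
        scores[start] = impact_score(reference, perturbed)
    return scores

# Toy locus (made-up data): 20 bins with strong features at bins 5 and 14.
toy_signal = [0.0] * 20
toy_signal[5] = 1.0
toy_signal[14] = 1.0

scores = in_silico_screen(toy_signal, width=2)
ranked = sorted(scores, key=scores.get, reverse=True)
print("Window starts ranked by impact:", ranked[:4])
```

In this toy setting, only windows that overlap a feature-bearing bin change the prediction, so they rank first. A real screen would replace the stand-in predictor with a trained model and choose a perturbation and impact score suited to the question being asked.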


Genetic basis of tail-loss evolution in humans and apes

The loss of the tail is one of the main anatomical evolutionary changes to have occurred along the lineage leading to humans and to the “anthropomorphous apes”. This morphological reprogramming in the ancestral hominoids has long been considered to have accommodated a characteristic style of locomotion and to have contributed to the evolution of bipedalism in humans. Yet the precise genetic mechanism that facilitated tail-loss evolution in hominoids has remained unknown. We recently presented evidence that tail-loss evolution was mediated by the insertion of a single Alu element into the genome of the hominoid ancestor. We further propose that selection for the loss of the tail along the hominoid lineage was associated with an evolutionary trade-off that may continue to affect human health today.


Widespread transcriptional scanning modulates gene evolution rates

The testis expresses the largest number of genes of any mammalian organ, a finding that has long puzzled molecular biologists. We recently presented evidence that this widespread transcription maintains DNA sequence integrity in the male germline by correcting DNA damage through a mechanism we term transcriptional scanning. We find that genes expressed during spermatogenesis display lower mutation rates on the transcribed strand and show low diversity in the population. Moreover, this effect is fine-tuned by the level of gene expression during spermatogenesis. Unexpressed genes, which in our model do not benefit from transcriptional scanning, diverge faster over evolutionary timescales and are enriched for sensory and immune-defense functions. Collectively, we propose that transcriptional scanning shapes germline mutation signatures and modulates mutation rates in a gene-specific manner, maintaining sequence integrity for the bulk of genes while allowing faster evolution in a specific subset.
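A strand difference of this kind can be quantified in several ways. The Python sketch below shows one simple option, a log2 ratio between a substitution class and its complementary class after orienting each variant to the coding strand of its gene. It is an illustrative assumption, not the analysis pipeline used in the study: the function names, the pseudocount, and the variant list are made up, and a real analysis would also normalize for base composition.

```python
import math
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def orient_to_coding_strand(ref, alt, gene_strand):
    """Express a substitution relative to the gene's coding (non-transcribed)
    strand. Variants are assumed to be reported on the reference '+' strand."""
    if gene_strand == "+":
        return ref, alt
    return COMPLEMENT[ref], COMPLEMENT[alt]

def strand_asymmetry(variants, mutation=("A", "G"), pseudocount=0.5):
    """log2 ratio of a substitution class seen on the coding strand versus
    its complementary class (the same change read from the transcribed strand).
    Positive values mean the class is more frequent on the coding strand."""
    counts = Counter(orient_to_coding_strand(r, a, s) for r, a, s in variants)
    ref, alt = mutation
    complementary = (COMPLEMENT[ref], COMPLEMENT[alt])
    return math.log2((counts[mutation] + pseudocount)
                     / (counts[complementary] + pseudocount))

# Made-up variants: (reference base, alternate base, strand of the gene).
toy_variants = (
    [("A", "G", "+")] * 3    # A>G on the coding strand of '+' genes
    + [("T", "C", "-")] * 2  # T>C on '+' is A>G on the coding strand of '-' genes
    + [("T", "C", "+")] * 1  # T>C on the coding strand of a '+' gene
)

score = strand_asymmetry(toy_variants, mutation=("A", "G"))
print(f"A>G coding-strand asymmetry (log2 ratio): {score:.3f}")
```

Comparing such a score between genes that are expressed during spermatogenesis and genes that are not would be one way to look for an expression-dependent strand bias.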


Chemical genomic technologies for analyzing DNA epigenetic modifications

Chemical modifications of DNA modulate various biological processes. DNA methylation (5mC) and its TET-protein-mediated derivatives (5hmC, 5fC and 5caC) constitute a major part of the epigenome. Understanding the function of an epigenetic modification requires profiling its genome-wide distribution. We have developed several unique and robust chemical methods to map the genome-wide distribution of 5fC and 5hmC. Represented by 'fC-CET' and 'CLEVER-seq', these methods demonstrated the concept of 'bisulfite-free & base-resolution' analysis of DNA epigenetic modifications. We have used these methods to analyze the epigenomes of embryonic stem cells and even of single cells in the early developing embryo. These technologies will help us understand the molecular basis of epigenetic regulation of gene expression and how these chemical modifications – and their modifier and reader proteins – affect mammalian development.