Last updated on 08/22/2020
Tang C. (coauthor). 12/23/2019. “Heterogeneous network embedding enabling accurate disease association predictions.” BMC Medical Genomics, 12, Suppl 10, Pp. 186.
Background Elucidating the complex biological mechanisms of diseases is an important goal of biomedical research. Recently, the rapid growth of data in genomics, epigenomics, metagenomics, proteomics, metabolomics, nutriomics, etc., has led to the rise of systematic biological approaches to exploring complex diseases. However, the gap between the generation of these data and our ability to analyze them has gradually widened. Furthermore, we observe that many of these data can be represented as networks. Based on the vector representations learned by network embedding methods, entities that are close to each other but currently have no known direct link are likely to be related, and are therefore good candidates for future biological research.
Results We integrate six public databases to construct a heterogeneous network containing three types of entities (i.e., genes, miRNAs, and diseases). To tackle the inherent heterogeneity, we propose a network embedding method that learns a low-dimensional vector space which best preserves the relationships between entities. We then conduct disease-gene and disease-miRNA association predictions, the results of which show the superiority of our method over several state-of-the-art methods. Furthermore, many associations predicted by our method are verified in the latest real-world dataset.
Conclusions We propose a novel heterogeneous network embedding method that makes full use of the rich contextual information and structure of heterogeneous networks. We further demonstrate the effectiveness of our method in guiding biological experiments, which can assist in identifying new hypotheses in biological investigation.
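The idea stated in the Background, that entities lying close together in the embedding space but lacking a known link are candidate associations, can be illustrated with a minimal sketch. This is not the paper's method: the function name `rank_candidate_links`, the entity identifiers, and the two-dimensional vectors are all hypothetical, and cosine similarity is assumed here as the closeness measure because the abstract does not specify one.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_candidate_links(embeddings, known_links, top_k=3):
    """Rank entity pairs that have no known link by embedding similarity.

    embeddings  : dict mapping entity name -> vector (hypothetical values)
    known_links : iterable of (entity, entity) pairs already in the network
    Returns up to top_k tuples (entity_a, entity_b, score), best first.
    """
    known = {frozenset(pair) for pair in known_links}
    names = sorted(embeddings)
    scored = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if frozenset((a, b)) in known:
                continue  # skip pairs that are already linked
            scored.append((cosine(embeddings[a], embeddings[b]), a, b))
    scored.sort(reverse=True)
    return [(a, b, score) for score, a, b in scored[:top_k]]


# Toy vectors standing in for learned embeddings (assumed, not from the paper).
embeddings = {
    "disease:D1": [0.90, 0.10],
    "gene:G1": [0.80, 0.20],
    "gene:G2": [0.10, 0.90],
    "mirna:M1": [0.95, 0.05],
}
known_links = [("disease:D1", "gene:G1")]

candidates = rank_candidate_links(embeddings, known_links)
for a, b, score in candidates:
    print(f"{a} -- {b}: {score:.3f}")
```

In this toy setting the already-known disease-gene pair is excluded, and the remaining pairs are ordered by similarity, so the top entries are the ones a researcher might examine first. A real pipeline would likely restrict candidates to specific type combinations, such as disease-gene or disease-miRNA pairs.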