Publications

Submitted
Peter Kerpedjiev, Nezar Abdennur, Fritz Lekschas, Chuck McCallum, Kasper Dinkla, Hendrik Strobelt, Jacob M Luber, Scott Ouellette, Alaleh Azhir, Nikhil Kumar, Jeewon Hwang, Burak H. Alver, Hanspeter Pfister, Leonid A Mirny, Peter J. Park, and Nils Gehlenborg. Submitted. “HiGlass: Web-based Visual Comparison And Exploration Of Genome Interaction Maps.” bioRxiv, 121889.

We present HiGlass (http://higlass.io), a web-based viewer for genome interaction maps featuring synchronized navigation of multiple views as well as continuous zooming and panning for navigation across genomic loci and resolutions. We demonstrate how visual comparison of Hi-C and other genomic data from different experimental conditions can be used to efficiently identify salient outcomes of experimental perturbations, generate new hypotheses, and share the results with the community.

2017
Jacob M. Luber and Aleksandar D. Kostic. 4/24/2017. “Gut Microbiota: Small Molecules Modulate Host Cellular Functions.” Current Biology, 27, 8, Pp. R307-R310.

The human gut metagenome was recently discovered to encode vast collections of biosynthetic gene clusters with diverse chemical potential, almost none of which are yet functionally validated. Recent work elucidates common microbiome-derived biosynthetic gene clusters encoding peptide aldehydes that inhibit human proteases.

Jacob M. Luber, Braden T. Tierney, Evan M. Cofer, Chirag J. Patel, and Aleksandar D. Kostic. 12/8/2017. “Aether: Leveraging Linear Programming For Optimal Cloud Computing In Genomics.” Bioinformatics, btx787.

Motivation

Across biology we are seeing rapid developments in scale of data production without a corresponding increase in data analysis capabilities.

Results

Here, we present Aether (http://aether.kosticlab.org), an intuitive, easy-to-use, cost-effective, and scalable framework that uses linear programming (LP) to optimally bid on and deploy combinations of underutilized cloud computing resources. Our approach simultaneously minimizes the cost of data analysis and provides an easy transition from users’ existing HPC pipelines.

Availability

Data utilized are available at https://pubs.broadinstitute.org/diabimmune and with EBI SRA accession ERP005989. Source code is available at https://github.com/kosticlab/aether. Examples, documentation, and a tutorial are available at http://aether.kosticlab.org.

Contact

chirag_patel@hms.harvard.edu and aleksandar.kostic@joslin.harvard.edu

Job Dekker, Andrew S. Belmont, Mitchell Guttman, Victor O. Leshyk, John T. Lis, Stavros Lomvardas, Leonid A. Mirny, Clodagh C. O’Shea, Peter J. Park, Bing Ren, Joan C. Ritland Politz, Jay Shendure, Sheng Zhong, and the 4D Nucleome Network. 9/2017. “The 4D nucleome project.” Nature, 549, 7671, Pp. 219-226.
The 4D Nucleome Network aims to develop and apply approaches to map the structure and dynamics of the human and mouse genomes in space and time with the goal of gaining deeper mechanistic insights into how the nucleus is organized and functions. The project will develop and benchmark experimental and computational approaches for measuring genome conformation and nuclear organization, and investigate how these contribute to gene regulation and other genome functions. Validated experimental technologies will be combined with biophysical approaches to generate quantitative models of spatial genome organization in different biological states, both in cell populations and in single cells.
Chao Fang, Huanzi Zhong, Yuxiang Lin, Bin Chen, Mo Han, Huahui Ren, Haorong Lu, Jacob Mayne Luber, Min Xia, Wangsheng Li, Shayna Stein, Xun Xu, Wenwei Zhang, Radoje Drmanac, Jian Wang, Huanming Yang, Lennart Hammarström, Aleksandar David Kostic, Karsten Kristiansen, and Junhua Li. 2017. “Assessment of the cPAS-based BGISEQ-500 platform for metagenomic sequencing.” GigaScience.
Background: More extensive use of metagenomic shotgun sequencing in microbiome research relies on the development of high-throughput, cost-effective sequencing. Here we present a comprehensive evaluation of the performance of the new high-throughput sequencing platform BGISEQ-500 for metagenomic shotgun sequencing and compare its performance with that of two Illumina platforms. Findings: Using fecal samples from 20 healthy individuals, we evaluated the intra-platform reproducibility for metagenomic sequencing on the BGISEQ-500 platform in a setup comprising 8 library replicates and 8 sequencing replicates. Cross-platform consistency was evaluated by comparing 20 pairwise replicates on the BGISEQ-500 platform versus the Illumina HiSeq 2000 platform and the Illumina HiSeq 4000 platform. In addition, we compared the performance of the two Illumina platforms against each other. By a newly developed overall accuracy quality control method, an average of 82.45 million high quality reads (96.06% of raw reads) per sample with 90.56% of bases scoring Q30 and above was obtained using the BGISEQ-500 platform. Quantitative analyses revealed extremely high reproducibility between BGISEQ-500 intra-platform replicates. Cross-platform replicates differed slightly more than intra-platform replicates, yet a high consistency was observed. Only a low percentage (2.02%-3.25%) of genes exhibited significant differences in relative abundance comparing the BGISEQ-500 and HiSeq platforms, with a bias towards genes with higher GC content being enriched on the HiSeq platforms. Conclusion: Our study provides the first set of performance metrics for human gut metagenomic sequencing data using BGISEQ-500. The high accuracy and technical reproducibility confirm the applicability of the new platform for metagenomic studies, though caution is still warranted when combining metagenomic data from different platforms.
2016
Jacob M Luber. 2016. “Improved Prediction of Mouse Pathways Related to Bone Maintenance Through Machine Learning Utilizing Diverse Genomic Data.” Trinity University Computer Science Honors Undergraduate Thesis.

The genetic cause of osteoporosis is poorly understood, but a wealth of functional genomic data exist from which osteoporosis related pathways could be identified. A machine learning pipeline was created using Support Vector Machines and was first applied using as inputs all available gene expression data and a second time using only bone-related data. In both cases, models were trained using a manually curated training set of gene relationships known to support bone maintenance and development. Each model was used to predict novel pairwise gene relationships, and specific pathways were compared between models to identify relationships supported primarily by data collected in bone-related contexts as opposed to other cellular contexts. Our results indicate a more accurate result was achieved through biologically-motivated feature selection that considers mammalian cellular context. Our results reinforce the observation that if two genes are functionally associated in one context they may not be functionally associated in all contexts, necessitating careful consideration of training sets and input data into functional prediction methods. 

2015
Jacob M. Luber, Joel Graber, and Carol J. Bult. 8/15/2015. “Identifying genome signatures of drug response in patient derived xenografts (PDX): a machine learning approach.” JAX Summer and Academic Year Student Reports, Paper 2508.