We developed a Python package to help in performing drug-response experiments at medium and high throughput and evaluating sensitivity metrics from the resulting data. In this article, we describe the steps involved in (1) generating files necessary for treating cells with the HP D300 drug dispenser, by pin transfer or by manual pipetting; (2) merging the data generated by high-throughput slide scanners, such as the Perkin Elmer Operetta, with treatment annotations; and (3) analyzing the results to obtain data normalized to untreated controls and sensitivity metrics such as IC50 or GR50. These modules are available on GitHub and provide an automated pipeline for the design and analysis of high-throughput drug response experiments, that helps to prevent errors that can arise from manually processing large data files.
Measuring the potencies of small-molecule drugs in cell lines is a critical aspect of preclinical pharmacology. Such experiments are also prototypical of high-throughput experiments in multi-well plates. The procedure is simple in principle, but many unrecognized factors can affect the results, potentially making data unreliable. The procedures for measuring drug response described here were developed by the NIH LINCS program to improve reproducibility. Key features include maximizing uniform cell growth during the assay period, accounting for the effects of cell density on response, and correcting sensitivity measures for differences in proliferation rates. Two related protocols are described: one involves an endpoint measure well-suited to large-scale studies and the second is a time-dependent measurement that reveals changes in response over time. The methods can be adapted to other types of plate-based experiments.
Measuring drug response in cancer cell lines is essential for studying mechanisms of drug action and identifying genetic variants associated with sensitivity and resistance (preclinical pharmacogenomics). The conventional approach involves computing relative viability based on the ratio of cells in control and drug-treated cultures at the end of a fixed time period. We have recently shown that relative viability and the response metrics derived from it, such as IC50 (drug concentration resulting in 50% relative viability) and area under the dose–response curve (AUC), are confounded by variation in cell proliferation rates and assay duration, potentially explaining discrepancies among large-scale drug response data sets. Here we show that use of relative viability as a measure of drug response results in false-positive and false-negative pharmacogenomic associations. An alternative approach, based on computing normalized growth rate (GR) inhibition minimizes these artifactual associations by scoring drug sensitivity on a per-division basis. GR50, a measure of potency, is the concentration of drug that reduces cell proliferation rate by one-half; GRmax, a measure of efficacy, is the maximum effect of a drug at the highest tested concentration; and GRAOC combines these in an integrated 'area over the curve' value. The signs of GR values and GRmax relate directly to response phenotype: positive for partial growth inhibition, zero for complete cytostasis and negative for cell death. By contrast, Emax (relative viability at the highest tested concentration) values do not have this property as they are strongly confounded by proliferation rates. Collecting GR values requires only modest changes in experimental approach and calculations can be performed online (http://www.GRcalculator.org/).
Cancer systems biology aims to understand cancer as an integrated system of genes, proteins, networks, and interactions rather than an entity of isolated molecular and cellular components. The inaugural Systems Approaches to Cancer Biology Conference, cosponsored by the Association of Early Career Cancer Systems Biologists and the National Cancer Institute of the NIH, focused on the interdisciplinary field of cancer systems biology and the challenging cancer questions that are best addressed through the combination of experimental and computational analyses. Attendees found that elucidating the many molecular features of cancer inevitably reveals new forms of complexity and concluded that ensuring the reproducibility and impact of cancer systems biology studies will require widespread method and data sharing and, ultimately, the translation of important findings to the clinic.
The library of integrated network-based cellular signatures (LINCS) L1000 data set currently comprises of over a million gene expression profiles of chemically perturbed human cell lines. Through unique several intrinsic and extrinsic benchmarking schemes, we demonstrate that processing the L1000 data with the characteristic direction (CD) method significantly improves signal to noise compared with the MODZ method currently used to compute L1000 signatures. The CD processed L1000 signatures are served through a state-of-the-art web-based search engine application called L1000CDS2. The L1000CDS2 search engine provides prioritization of thousands of small-molecule signatures, and their pairwise combinations, predicted to either mimic or reverse an input gene expression signature using two methods. The L1000CDS2 search engine also predicts drug targets for all the small molecules profiled by the L1000 assay that we processed. Targets are predicted by computing the cosine similarity between the L1000 small-molecule signatures and a large collection of signatures extracted from the gene expression omnibus (GEO) for single-gene perturbations in mammalian cells. We applied L1000CDS2 to prioritize small molecules that are predicted to reverse expression in 670 disease signatures also extracted from GEO, and prioritized small molecules that can mimic expression of 22 endogenous ligand signatures profiled by the L1000 assay. As a case study, to further demonstrate the utility of L1000CDS2, we collected expression signatures from human cells infected with Ebola virus at 30, 60 and 120 min. Querying these signatures with L1000CDS2 we identified kenpaullone, a GSK3B/CDK2 inhibitor that we show, in subsequent experiments, has a dose-dependent efficacy in inhibiting Ebola infection in vitro without causing cellular toxicity in human cell lines. In summary, the L1000CDS2 tool can be applied in many biological and biomedical settings, while improving the extraction of knowledge from the LINCS L1000 resource.
Drug sensitivity and resistance are conventionally quantified by IC50 or Emax values, but these metrics are highly sensitive to the number of divisions taking place over the course of a response assay. The dependency of IC50 and Emax on division rate creates artefactual correlations between genotype and drug sensitivity, while obscuring valuable biological insights and interfering with biomarker discovery. We derive alternative small molecule drug-response metrics that are insensitive to division number. These are based on estimation of the magnitude of drug-induced growth rate inhibition (GR) using endpoint or time-course assays. We show that GR50 and GRmaxare superior to conventional metrics for assessing the effects of small molecule drugs in dividing cells. Moreover, adopting GR metrics requires only modest changes in experimental protocols. We expect GR metrics to improve the study of cell signaling and growth using small molecules and biologics and to facilitate the discovery of drug-response biomarkers and the identification of drugs effective against specific patient-derived tumor cells.
Nephrotoxicity due to drugs and environmental chemicals accounts for significant patient mortality and morbidity, but there is no high throughput in vitro method for predictive nephrotoxicity assessment. We show that primary human proximal tubular epithelial cells (HPTECs) possess characteristics of differentiated epithelial cells rendering them desirable to use in such in vitro systems. To identify a reliable biomarker of nephrotoxicity, we conducted multiplexed gene expression profiling of HPTECs after exposure to six different concentrations of nine human nephrotoxicants. Only overexpression of the gene encoding heme oxygenase-1 (HO-1) significantly correlated with increasing dose for six of the compounds, and significant HO-1 protein deregulation was confirmed with each of the nine nephrotoxicants. Translatability of HO-1 increase across species and platforms was demonstrated by computationally mining two large rat toxicogenomic databases for kidney tubular toxicity and by observing a significant increase in HO-1 after toxicity using an ex vivo three-dimensional microphysiologic system (kidney-on-a-chip). The predictive potential of HO-1 was tested using an additional panel of 39 mechanistically distinct nephrotoxic compounds. Although HO-1 performed better (area under the curve receiver-operator characteristic curve [AUC-ROC]=0.89) than traditional endpoints of cell viability (AUC-ROC for ATP=0.78; AUC-ROC for cell count=0.88), the combination of HO-1 and cell count further improved the predictive ability (AUC-ROC=0.92). We also developed and optimized a homogenous time-resolved fluorescence assay to allow high throughput quantitative screening of nephrotoxic compounds using HO-1 as a sensitive biomarker. This cell-based approach may facilitate rapid assessment of potential nephrotoxic therapeutics and environmental chemicals.
When cells are exposed to death ligands such as TRAIL, a fraction undergoes apoptosis and a fraction survives; if surviving cells are re‐exposed to TRAIL, fractional killing is once again observed. Therapeutic antibodies directed against TRAIL receptors also cause fractional killing, even at saturating concentrations, limiting their effectiveness. Fractional killing arises from cell‐to‐cell fluctuations in protein levels (extrinsic noise), but how this results in a clean bifurcation between life and death remains unclear. In this paper, we identify a threshold in the rate and timing of initiator caspase activation that distinguishes cells that live from those that die; by mapping this threshold, we can predict fractional killing of cells exposed to natural and synthetic agonists alone or in combination with sensitizing drugs such as bortezomib. A phenomenological model of the threshold also quantifies the contributions of two resistance genes (c‐FLIP and Bcl‐2), providing new insight into the control of cell fate by opposing pro‐death and pro‐survival proteins and suggesting new criteria for evaluating the efficacy of therapeutic TRAIL receptor agonists.
For the Library of Integrated Network-based Cellular Signatures (LINCS) project many gene expression signatures using the L1000 technology have been produced. The L1000 technology is a cost-effective method to profile gene expression in large scale. LINCS Canvas Browser (LCB) is an interactive HTML5 web-based software application that facilitates querying, browsing and interrogating many of the currently available LINCS L1000 data. LCB implements two compacted layered canvases, one to visualize clustered L1000 expression data, and the other to display enrichment analysis results using 30 different gene set libraries. Clicking on an experimental condition highlights gene-sets enriched for the differentially expressed genes from the selected experiment. A search interface allows users to input gene lists and query them against over 100 000 conditions to find the top matching experiments. The tool integrates many resources for an unprecedented potential for new discoveries in systems biology and systems pharmacology. The LCB application is available at http://www.maayanlab.net/LINCS/LCB. Customized versions will be made part of the http://lincscloud.org and http://lincs.hms.harvard.edu websites.
BACKGROUND: Soluble growth factors present in the microenvironment play a major role in tumor development, invasion, metastasis, and responsiveness to targeted therapies. While the biochemistry of growth factor-dependent signal transduction has been studied extensively in individual cell types, relatively little systematic data are available across genetically diverse cell lines. RESULTS: We describe a quantitative and comparative dataset focused on immediate-early signaling that regulates the AKT (AKT1/2/3) and ERK (MAPK1/3) pathways in a canonical panel of well-characterized breast cancer lines. We also provide interactive web-based tools to facilitate follow-on analysis of the data. Our findings show that breast cancers are diverse with respect to ligand sensitivity and signaling biochemistry. Surprisingly, triple negative breast cancers (TNBCs; which express low levels of ErbB2, progesterone and estrogen receptors) are the most broadly responsive to growth factors and HER2amp cancers (which overexpress ErbB2) the least. The ratio of ERK to AKT activation varies with ligand and subtype, with a systematic bias in favor of ERK in hormone receptor positive (HR+) cells. The factors that correlate with growth factor responsiveness depend on whether fold-change or absolute activity is considered the key biological variable, and they differ between ERK and AKT pathways. CONCLUSIONS: Responses to growth factors are highly diverse across breast cancer cell lines, even within the same subtype. A simple four-part heuristic suggests that diversity arises from variation in receptor abundance, an ERK/AKT bias that depends on ligand identity, a set of factors common to all receptors that varies in abundance or activity with cell line, and an "indirect negative regulation" by ErbB2. This analysis sets the stage for the development of a mechanistic and predictive model of growth factor signaling in diverse cancer lines. Interactive tools for looking up these results and downloading raw data are available at http://lincs.hms.harvard.edu/niepel-bmcbiol-2014/.
With recent improvements of protocols for the assembly of transcriptional parts, synthetic biological devices can now more reliably be assembled according to a given design. The standardization of parts open up the way for in silico design tools that improve the construct and optimize devices with respect to given formal design specifications. The simplest such optimization is the selection of kinetic parameters and protein abundances such that the specified design constraints are robustly satisfied. In this work we address the problem of determining parameter values that fulfill specifications expressed in terms of a functional on the trajectories of a dynamical model. We solve this inverse problem by linearizing the forward operator that maps parameter sets to specifications, and then inverting it locally. This approach has two advantages over brute-force random sampling. First, the linearization approach allows us to map back intervals instead of points and second, every obtained value in the parameter region is satisfying the specifications by construction. The method is general and can hence be incorporated in a pipeline for the rational forward design of arbitrary devices in synthetic biology.
Identifying factors responsible for variation in drug response is essential for the effective use of targeted therapeutics. We profiled signaling pathway activity in a collection of breast cancer cell lines before and after stimulation with physiologically relevant ligands, which revealed the variability in network activity among cells of known genotype and molecular subtype. Despite the receptor-based classification of breast cancer subtypes, we found that the abundance and activity of signaling proteins in unstimulated cells (basal profile), as well as the activity of proteins in stimulated cells (signaling profile), varied within each subtype. Using a partial least-squares regression approach, we constructed models that significantly predicted sensitivity to 23 targeted therapeutics. For example, one model showed that the response to the growth factor receptor ligand heregulin effectively predicted the sensitivity of cells to drugs targeting the cell survival pathway mediated by PI3K (phosphoinositide 3-kinase) and Akt, whereas the abundance of Akt or the mutational status of the enzymes in the pathway did not. Thus, basal and signaling protein profiles may yield new biomarkers of drug sensitivity and enable the identification of appropriate therapies in cancers characterized by similar functional dysregulation of signaling networks.
In mammals, the suprachiasmatic nucleus (SCN) of the hypothalamus constitutes the central circadian pacemaker. The SCN receives light signals from the retina and controls peripheral circadian clocks (located in the cortex, the pineal gland, the liver, the kidney, the heart, etc.). This hierarchical organization of the circadian system ensures the proper timing of physiological processes. In each SCN neuron, interconnected transcriptional and translational feedback loops enable the circadian expression of the clock genes. Although all the neurons have the same genotype, the oscillations of individual cells are highly heterogeneous in dispersed cell culture: many cells present damped oscillations and the period of the oscillations varies from cell to cell. In addition, the neurotransmitters that ensure the intercellular coupling, and thereby the synchronization of the cellular rhythms, differ between the two main regions of the SCN. In this work, a mathematical model that accounts for this heterogeneous organization of the SCN is presented and used to study the implication of the SCN network topology on synchronization and entrainment properties. The results show that oscillations with larger amplitude can be obtained with scale-free networks, in contrast to random and local connections. Networks with the small-world property such as the scale-free networks used in this work can adapt faster to a delay or advance in the light/dark cycle (jet lag). Interestingly a certain level of cellular heterogeneity is not detrimental to synchronization performances, but on the contrary helps resynchronization after jet lag. When coupling two networks with different topologies that mimic the two regions of the SCN, efficient filtering of pulse-like perturbations in the entrainment pattern is observed. These results suggest that the complex and heterogeneous architecture of the SCN decreases the sensitivity of the network to short entrainment perturbations while, at the same time, improving its adaptation abilities to long term changes.
The existence of a length threshold, of about 35 residues, above which polyglutamine repeats can give rise to aggregation and to pathologies, is one of the hallmarks of polyglutamine neurodegenerative diseases such as Huntington's disease. The reason why such a minimal length exists at all has remained one of the main open issues in research on the molecular origins of such classes of diseases. Following the seminal proposals of Perutz, most research has focused on the hunt for a special structure, attainable only above the minimal length, able to trigger aggregation. Such a structure has remained elusive and there is growing evidence that it might not exist at all. Here we review some basic polymer and statistical physics facts and show that the existence of a threshold is compatible with the modulation that the repeat length imposes on the association and dissociation rates of polyglutamine polypeptides to and from oligomers. In particular, their dramatically different functional dependence on the length rationalizes the very presence of a threshold and hints at the cellular processes that might be at play, in vivo, to prevent aggregation and the consequent onset of the disease.
Synthetic biology efforts have largely focused on small engineered gene networks, yet understanding how to integrate multiple synthetic modules and interface them with endogenous pathways remains a challenge. Here we present the design, system integration, and analysis of several large scale synthetic gene circuits for artificial tissue homeostasis. Diabetes therapy represents a possible application for engineered homeostasis, where genetically programmed stem cells maintain a steady population of β-cells despite continuous turnover. We develop a new iterative process that incorporates modular design principles with hierarchical performance optimization targeted for environments with uncertainty and incomplete information. We employ theoretical analysis and computational simulations of multicellular reaction/diffusion models to design and understand system behavior, and find that certain features often associated with robustness (e.g., multicellular synchronization and noise attenuation) are actually detrimental for tissue homeostasis. We overcome these problems by engineering a new class of genetic modules for 'synthetic cellular heterogeneity' that function to generate beneficial population diversity. We design two such modules (an asynchronous genetic oscillator and a signaling throttle mechanism), demonstrate their capacity for enhancing robust control, and provide guidance for experimental implementation with various computational techniques. We found that designing modules for synthetic heterogeneity can be complex, and in general requires a framework for non-linear and multifactorial analysis. Consequently, we adapt a 'phenotypic sensitivity analysis' method to determine how functional module behaviors combine to achieve optimal system performance. We ultimately combine this analysis with Bayesian network inference to extract critical, causal relationships between a module's biochemical rate-constants, its high level functional behavior in isolation, and its impact on overall system performance once integrated.
BACKGROUND: A biological system's robustness to mutations and its evolution are influenced by the structure of its viable space, the region of its space of biochemical parameters where it can exert its function. In systems with a large number of biochemical parameters, viable regions with potentially complex geometries fill a tiny fraction of the whole parameter space. This hampers explorations of the viable space based on "brute force" or Gaussian sampling.
RESULTS: We here propose a novel algorithm to characterize viable spaces efficiently. The algorithm combines global and local explorations of a parameter space. The global exploration involves an out-of-equilibrium adaptive Metropolis Monte Carlo method aimed at identifying poorly connected viable regions. The local exploration then samples these regions in detail by a method we call multiple ellipsoid-based sampling. Our algorithm explores efficiently nonconvex and poorly connected viable regions of different test-problems. Most importantly, its computational effort scales linearly with the number of dimensions, in contrast to "brute force" sampling that shows an exponential dependence on the number of dimensions. We also apply this algorithm to a simplified model of a biochemical oscillator with positive and negative feedback loops. A detailed characterization of the model's viable space captures well known structural properties of circadian oscillators. Concretely, we find that model topologies with an essential negative feedback loop and a nonessential positive feedback loop provide the most robust fixed period oscillations. Moreover, the connectedness of the model's viable space suggests that biochemical oscillators with varying topologies can evolve from one another.
CONCLUSIONS: Our algorithm permits an efficient analysis of high-dimensional, nonconvex, and poorly connected viable spaces characteristic of complex biological circuitry. It allows a systematic use of robustness as a tool for model discrimination.
On top of the many external perturbations, cellular oscillators also face intrinsic perturbations due the randomness of chemical kinetics. Biomolecular oscillators, distinct in their parameter sets or distinct in their architecture, show different resilience with respect to such intrinsic perturbations. Assessing this resilience can be done by ensemble stochastic simulations. These are computationally costly and do not permit further insights into the mechanistic cause of the observed resilience. For reaction systems operating at a steady state, the linear noise approximation (LNA) can be used to determine the effect of molecular noise. Here we show that methods based on LNA fail for oscillatory systems and we propose an alternative ansatz. It yields an asymptotic expression for the phase diffusion coefficient of stochastic oscillators. Moreover, it allows us to single out the noise contribution of every reaction in an oscillatory system. We test the approach on the one-loop model of the Drosophila circadian clock. Our results are consistent with those obtained through stochastic simulations with a gain in computational efficiency of about three orders of magnitude.
To characterize the behavior and robustness of cellular circuits with many unknown parameters is a major challenge for systems biology. Its difficulty rises exponentially with the number of circuit components. We here propose a novel analysis method to meet this challenge. Our method identifies the region of a high-dimensional parameter space where a circuit displays an experimentally observed behavior. It does so via a Monte Carlo approach guided by principal component analysis, in order to allow efficient sampling of this space. This 'global' analysis is then supplemented by a 'local' analysis, in which circuit robustness is determined for each of the thousands of parameter sets sampled in the global analysis. We apply this method to two prominent, recent models of the cyanobacterial circadian oscillator, an autocatalytic model, and a model centered on consecutive phosphorylation at two sites of the KaiC protein, a key circadian regulator. For these models, we find that the two-sites architecture is much more robust than the autocatalytic one, both globally and locally, based on five different quantifiers of robustness, including robustness to parameter perturbations and to molecular noise. Our 'glocal' combination of global and local analyses can also identify key causes of high or low robustness. In doing so, our approach helps to unravel the architectural origin of robust circuit behavior. Complementarily, identifying fragile aspects of system behavior can aid in designing perturbation experiments that may discriminate between competing mechanisms and different parameter sets.