Publications by Type: Conference Paper

In Preparation
Tang C. (first author). In Preparation. “Evaluation of Hospital Rating Systems Through the Lens of Data Capital.” In 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Houston, TX, USA.
Publicly reported quality and safety rating systems represent promising innovations for rating hospital performance. Still, rating the raters via the data is needed for meaningful comparisons and to establish integrated oversight. Data is now a kind of capital, on par with social and human capital, affecting hospital services. Using the lens of data capital, these ranking systems’ content (e.g., data, metrics) can be combined into a composite to offer another perspective.
Tang C. (second author). 11/18/2019. “Data Reconstruction Based on Temporal Expressions in Clinical Notes.” In 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Pp. 1004-1008. San Diego, CA, USA: IEEE.

Learning representations of clinical notes poses challenges in handling complex content that necessitates preprocessing steps to make the data more suitable for data mining. An important issue, addressed here, is that of temporal expressions, where cues indicate the time when clinical events occur. We present a three-step data reconstruction algorithm for transforming similar clinical entities (e.g., symptoms, complications) into sequential data through unsupervised annotation of temporal expressions. First, the data reconstruction algorithm detects whether an expression has temporal intent. Second, it decomposes and rewrites the expression into a non-temporal sub-expression and temporal constraints. Finally, it clusters similar non-temporal sub-expressions by using unsupervised sentence embedding under the modified K-medoids paradigm. We experimented with our proposed algorithm on clinical notes associated with chronic obstructive pulmonary disease (COPD). Visualizing reconstruction results of cardiology reports for a longitudinal cohort of patients with COPD demonstrated that this algorithm is feasible.
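The three steps in this abstract can be illustrated with a small sketch. It is not the paper's implementation: the `TEMPORAL` regular expression, the token-level Jaccard distance (standing in for unsupervised sentence embeddings), the plain k-medoids loop (standing in for the modified K-medoids), and all function names are assumptions made for illustration.

```python
import re

# Hypothetical pattern: a tiny stand-in for real temporal-expression detection.
TEMPORAL = re.compile(
    r"\b(?:\d+\s+(?:day|week|month|year)s?\s+ago|yesterday|today|"
    r"last\s+(?:week|month|year))\b",
    re.IGNORECASE,
)

def has_temporal_intent(text):
    """Step 1: does the expression contain a temporal cue?"""
    return TEMPORAL.search(text) is not None

def decompose(text):
    """Step 2: split into a non-temporal sub-expression and temporal constraints."""
    constraints = [m.group(0).lower() for m in TEMPORAL.finditer(text)]
    remainder = " ".join(TEMPORAL.sub(" ", text).split())
    return remainder, constraints

def jaccard_distance(a, b):
    """Assumed similarity measure: 1 - token overlap (not a sentence embedding)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)

def k_medoids(items, k, max_iter=20):
    """Step 3: plain k-medoids with farthest-first initialisation (k <= len(items))."""
    dist = jaccard_distance
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(
            range(len(items)),
            key=lambda i: min(dist(items[i], items[m]) for m in medoids),
        ))
    labels = []
    for _ in range(max_iter):
        # Assign each item to its nearest medoid.
        labels = [
            min(range(k), key=lambda c: dist(x, items[medoids[c]]))
            for x in items
        ]
        # Move each medoid to the member with the smallest total distance.
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(min(
                members,
                key=lambda i: sum(dist(items[i], items[j]) for j in members),
            ))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels, medoids

def reconstruct(notes, k):
    """Run all three steps; returns (cluster, sub-expression, constraints) rows."""
    parts = [decompose(n) for n in notes if has_temporal_intent(n)]
    labels, _ = k_medoids([p[0] for p in parts], k)
    return [(lab, rest, cons) for lab, (rest, cons) in zip(labels, parts)]
```

Calling `reconstruct` on a handful of made-up phrases such as "shortness of breath 3 days ago" and "chest pain yesterday" groups the phrases that share a non-temporal sub-expression and keeps each phrase's temporal constraint alongside its cluster label.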

Tang C. (first author). 12/3/2018. “A Deep Learning Approach to Handling Temporal Variation in Chronic Obstructive Pulmonary Disease Progression.” In 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Pp. 502-509. Madrid, Spain: IEEE.
Chronic Obstructive Pulmonary Disease (COPD) is a leading cause of mortality in the United States. Representing COPD progression using temporal graphs may offer critical clinical insights. Long short-term memory (LSTM) units in recurrent neural networks can process data with constant elapsed times between consecutive elements of a sequence but cannot handle irregular time intervals (i.e., segments of unequal duration). In this study, we propose a four-layer deep learning model that utilizes a specially configured recurrent neural network to capture irregular time-lapse segments. Experiments on a corpus of COPD patients’ clinical notes showed that, compared to baseline algorithms, our model improved both interpretability and the accuracy of estimating COPD progression.
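One way to see why irregular intervals matter is a toy recurrent update in which the previous state is discounted by the elapsed time. This sketch is only an illustration of that general idea and is not the four-layer model described in the abstract; the decay function, the scalar weights `w_in` and `w_rec`, and the function names are all assumptions.

```python
import math

def time_decay(delta_t):
    """Assumed decay weight in (0, 1]; equals 1 when no time has elapsed."""
    return 1.0 / math.log(math.e + delta_t)

def run_decayed_rnn(values, timestamps, w_in=0.5, w_rec=0.5):
    """Scalar recurrent pass where older memory counts less after long gaps."""
    h = 0.0
    prev_t = timestamps[0]
    states = []
    for x, t in zip(values, timestamps):
        decay = time_decay(t - prev_t)  # longer gap -> smaller weight
        h = math.tanh(w_in * x + w_rec * decay * h)
        states.append(h)
        prev_t = t
    return states
```

With identical inputs, a sequence whose second observation arrives after a long gap ends in a smaller state than one observed at a short gap, because less of the earlier state is carried forward.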
Illustration of all three types of clinical notes in COPD patient (Fig. 4@Tableau).
Tang C. (coauthor). 12/3/2018. “Predicting Disease-Related Associations by Heterogeneous Network Embedding.” In 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Pp. 548-555. Madrid, Spain: IEEE.
Elucidating biological mechanisms underlying complex diseases is an important goal in biomedical research. Recent advances in biological technology have enabled the generation of massive volumes of data in genomics, transcriptomics, proteomics, epigenomics, metagenomics, metabolomics, nutriomics, etc., leading to the emergence of a systems biology approach to investigating complex diseases. However, most of the data remain underutilized after their initial acquisition and analysis. There is a growing gap between the generation of the multifaceted data and our ability to integrate and analyze them. Inspired by the observation that many of the aforementioned data can be represented by networks, we propose a network-based model to encapsulate the rich information provided in each database and to connect across different databases. We integrate several public databases to construct a heterogeneous network in which nodes are entities such as genes, miRNAs, and diseases, and edges represent known relationships between them. One fundamental challenge is how to perform meaningful analysis on such a network, overcoming the intrinsic heterogeneity. We propose a network embedding method to learn a low-dimensional vector space that best preserves the known relationships between entities. Based on the learned vector representations, entities that are close to each other but currently have no known direct connection are likely to have an association and are therefore good candidates for future investigation. In the experiments, we construct a heterogeneous network of genes, miRNAs, and diseases using data from six public databases. To evaluate the performance of the proposed method, we predict disease-gene and disease-miRNA associations. Comparison of our novel method with several state-of-the-art methods clearly demonstrates the advantage of our method, as it is the only one that takes full advantage of the rich contextual information provided by the heterogeneous network. The encouraging results suggest that our method can help identify new hypotheses to guide future research.
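The candidate-selection idea in this abstract (entities that lie close together in the learned space but have no known edge) can be sketched as follows. The embedding method itself is not reproduced here: the vectors are made-up toy values, cosine similarity is an assumed closeness measure, and the node names and `rank_candidates` helper are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_candidates(embeddings, node_types, known_edges, source, target_type):
    """Rank nodes of target_type by closeness to source, skipping known edges."""
    scored = []
    for node, vec in embeddings.items():
        if node == source or node_types[node] != target_type:
            continue
        if frozenset((source, node)) in known_edges:
            continue  # already a known association, not a new hypothesis
        scored.append((node, cosine(embeddings[source], vec)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy data (assumed values, not learned from any database).
EMBEDDINGS = {
    "D1": [1.0, 0.0],
    "G1": [0.9, 0.1],
    "G2": [0.0, 1.0],
    "G3": [0.8, 0.3],
}
NODE_TYPES = {"D1": "disease", "G1": "gene", "G2": "gene", "G3": "gene"}
KNOWN_EDGES = {frozenset(("D1", "G1"))}
```

Calling `rank_candidates(EMBEDDINGS, NODE_TYPES, KNOWN_EDGES, "D1", "gene")` leaves out `G1`, which is already linked to `D1`, and places `G3` ahead of `G2` because its vector points in nearly the same direction as the disease vector.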
Tang C. (first author). 11/2017. “Developing a Regional Classifier to Track Patient Needs in Medical Literature Using Spiral Timelines on a Geographical Map.” In 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Pp. 874-879. Kansas City, MO, USA: IEEE.
Research clues can be expressed as coherent chains of keywords grouped by theme. Capturing clues to research from the vast and expanding medical literature is valuable. Yet, it is difficult to automatically create clear visualizations of research clues despite the presence of many competing summarization tools. In this paper, we propose a linear classifier based on a spiral, which we call a regional classifier. The study emphasizes the development of visualization methods and the process of finding a specific research clue to track patient needs reported in medical literature. When timelines are combined with a spiral geographical map, they show a geometric shape that helps to reveal the clues from different spatial viewpoints and periodical constraints. Our evaluation showed that the regional classifier produces better visual effects than support vector machine classifiers. It covers important concepts of each theme and is able to represent the relationships among papers in a way that captures continuous developments and changes in key themes.