Brief Bio

Dr. Guergana Savova is an Associate Professor at Harvard Medical School and in the Computational Health Informatics Program (CHIP; chip.org) at Boston Children’s Hospital. Her research interests are in natural language processing (NLP) and information extraction, especially as applied to the text generated by physicians (the clinical narrative). Dr. Savova has been creating gold standard annotated resources based on computable definitions and developing methods for computable solutions. The focus of Dr. Savova's research is higher-level semantic and discourse processing of the clinical narrative, which includes tasks such as named entity recognition, event recognition, and relation detection and classification, including coreference and temporal relations (thyme.healthnlp.org; share.healthnlp.org; cancer.healthnlp.org). The methods are mostly machine learning, spanning supervised, lightly supervised, and completely unsupervised approaches.

Dr. Savova's research with her collaborators has led to the creation of the clinical Text Analysis and Knowledge Extraction System (cTAKES; ctakes.apache.org). cTAKES is an information extraction system comprising a number of NLP components. As would be expected of any biomedical NLP tool, cTAKES can supply commonly extracted biomedical concepts such as symptoms, procedures, diagnoses, medications, and anatomy, with attributes and standard codes. However, unlike other available biomedical NLP systems that focus on a specific NLP task and domain and are difficult to extend, cTAKES has been engineered in a modular fashion and employs recent probabilistic machine learning methods. Leading-edge methods from research investigations have been implemented directly as components in cTAKES. These components can, for instance, identify complex relations between entities (e.g. the location of a tumor). cTAKES can also perform the important task of identifying temporal events, dates, and times, which makes it possible to place events absolutely and relatively on a patient timeline. It is the only open source biomedical NLP system using components with rule-based and supervised methods trained on gold standards from the general as well as the biomedical domain, thus affording usability across different types of clinical narrative (e.g. pathology, radiology, clinical notes) from different institutions as well as other health-related narrative (e.g. Twitter feeds).

cTAKES has been applied to mine the data within the clinical narrative for a number of biomedical use cases in initiatives such as i2b2, SHARPn, PGRN, eMERGE, and PCORI. Within Informatics for Integrating Biology and the Bedside (i2b2), cTAKES has been used to extract patient characteristics for determining their status related to a specific phenotype (Multiple Sclerosis, Inflammatory Bowel Disease, Type 2 Diabetes). Within the Pharmacogenomics Research Network (PGRN), cTAKES has been applied to automatically determine patients' disease activity and detect responders versus non-responders to a specific treatment. Within the Electronic Medical Records and Genomics (eMERGE) network, cTAKES has been applied to automatically discover patients with Peripheral Arterial Disease, Autism Spectrum Disorder, Appendicitis, and Early Childhood Obesity. Within the Patient-Powered Research Network, cTAKES is applied to create a comprehensive phenotype picture for patients with one very rare disease, Phelan-McDermid Syndrome. cTAKES-extracted data can be embedded in the i2b2 platform as well as PheWAS/GWAS platforms such as tranSMART, thus combining it with genotypic data for even larger-scale data analysis.

Dr. Guergana Savova is on the editorial board of the Journal of the American Medical Informatics Association (JAMIA) and is a reviewer for several journals, including the Journal of Biomedical Informatics (JBI) and the Journal of Language Resources and Evaluation (LREC), as well as many conferences and workshops. She is also a member of the National Library of Medicine's Biomedical Library and Informatics Review Committee.

Dr. Guergana Savova holds a PhD in Linguistics with a minor in Cognitive Science and a Master of Science in Computer Science from the University of Minnesota. Before joining Boston Children’s Hospital and Harvard Medical School in 2010, Dr. Savova was a faculty member in the Biomedical Statistics and Informatics Department at Mayo Clinic (2002-2010).