User Generated Content, which ranges from social media discussions to product reviews to private physician notes, presents naturally occurring data that can be used to develop large-scale Machine Learning algorithms for effective processing of human language. My general research interest is in developing linguistically-aware and cognitively-motivated Machine Learning algorithms for problems that arise in Biomedical Informatics, with a specific focus on Undiagnosed Diseases.

I'm part of Zaklab and work with Dr. Isaac Kohane and Dr. Andrew Beam on the above problems. Please read more about my research here, and find my CV here.



Research Associate, 2018-Present.
Department of Biomedical Informatics (DBMI), Harvard University
Machine learning for AI diagnosis.
Keywords: Biomedical Informatics, Machine Learning in Health, Representation Learning, Multi-task Learning.


Postdoc, 2016-2018.
Computational Health Informatics Program (CHIP), Harvard University
NLP systems for information extraction from EHR and health-related content in social media.
Keywords: Biomedical Informatics, Traditional NLP.


Postdoc, 2014-2016.
Computational Linguistics and Information Processing (CLIP), University of Maryland
Representation learning with applications to churn prediction and sentiment analysis.
Keywords: Representation Learning, Churn Prediction, Sentiment Analysis.

Research Scientist, 2013-2014.
Institute for Infocomm Research (I2R)
Community detection and brand name disambiguation in social media.
Keywords: Live Social Media Analytics, Community Detection, Brand Monitoring.

Ph.D., 2009-2013.
Advisor: Dr. Tat-Seng Chua
Lab for Media Search (LMS), National University of Singapore
Sentiment analysis and live event detection and tracking (for organizations and businesses) in social media.
Keywords: Live Social Media Analytics, Event Detection and Tracking, Sentiment Analysis.


M.Eng., 2005-2008.
Advisors: Drs. Farhad Oroumchian and Maseud Rahgozar
Database Research Group (DBRG), University of Tehran
Distributed information retrieval, Persian search, and POS tagging. See the Bijankhan and Hamshahri datasets used at CLEF'08-09.
Keywords: Distributed IR, Persian Text Retrieval and POS tagging, Multilingual Text Retrieval.