Sharing Sensitive Data with Confidence: The DataTags System

Presentation Date: Monday, December 14, 2015


Harvard University

Society generates data on a scale previously unimagined. Wide sharing of these data promises to improve important aspects of life such as health and education, by increasing their quality and lowering their cost. However, these same data often include sensitive information about people that could cause serious harms if shared widely. A multitude of regulations, laws and best practices protect data that contain sensitive personal information. Government agencies, research labs, and corporations that share data, as well as review boards and privacy officers making data sharing decisions, are vigilant but uncertain. This uncertainty creates a tendency not to share data at all. Some data are more harmful than other data; sharing should not be an all-or-nothing choice. How do we share data in ways that ensure access is commensurate with risks of harm? In this talk, we introduce datatags as a means of specifying security and access requirements for sensitive data. The datatags approach reduces the complexity of thousands of data-sharing regulations to a small number of tags. We will discuss implementation details for medical and educational data and for research and corporate repositories.


Mercè Crosas is the Chief Data Science and Technology Officer at the Institute for Quantitative Social Science (IQSS) at Harvard University. Together with the Director of IQSS, she leads the vision and strategic direction of all data sharing and analysis projects at IQSS, including the Dataverse project for publishing and archiving research data, the Zelig project for statistical analysis, and the Consilience project for text analysis. Her team includes research data scientists and information scientists. More information about Dr. Crosas is available at

Michael Bar-Sinai is a PhD candidate in Computer Science at Ben-Gurion University of the Negev, Israel, and a fellow at the Institute for Quantitative Social Science at Harvard University. His research interests include programming languages, software engineering, and issues lying at the intersection of society and software systems, such as privacy.

Latanya Sweeney is Professor of Government and Technology in Residence at Harvard University, Director of the Data Privacy Lab at Harvard, Editor-in-Chief of Technology Science, and was formerly the Chief Technology Officer of the U.S. Federal Trade Commission. More information about Dr. Sweeney is available at