I’m a data technologist and researcher, currently holding two roles at Harvard University: University Research Data Management Officer with Harvard University Information Technology (HUIT), and Chief Data Science and Technology Officer at Harvard's Institute for Quantitative Social Science. My career has included research in astrophysics, the design and implementation of software for astronomical observations, the development of learning and data management systems for education and biotechnology, and now the leadership of software platforms and tools for research data sharing and analysis across all research fields.

What am I interested in? Open science that facilitates access to and reuse of research data and code while preserving privacy, building software that enhances the quality and productivity of scientific outcomes, improving research data management, and establishing data-centric multidisciplinary collaborations with the aid of technology and a human touch.

Recent Publications

Qualitative data sharing and synthesis for sustainability science
Alexander S, Jones K, Bennet N, Buden A, Cox M, Crosas M, Game E, Geary J, Hardy D, Johnson J, et al. Qualitative data sharing and synthesis for sustainability science. Nature Sustainability [Internet]. 2020;(3):81-88.
Socio–environmental synthesis as a research approach contributes to broader sustainability policy and practice by reusing data from disparate disciplines in innovative ways. Synthesizing diverse data sources and types of evidence can help to better conceptualize, investigate and address increasingly complex socio–environmental problems. However, sharing qualitative data for re-use remains uncommon when compared to sharing quantitative data. We argue that qualitative data present untapped opportunities for sustainability science, and discuss practical pathways to facilitate and realize the benefits from sharing and reusing qualitative data. However, these opportunities and benefits are also hindered by practical, ethical and epistemological challenges. To address these challenges and accelerate qualitative data sharing, we outline enabling conditions and suggest actions for researchers, institutions, funders, data repository managers and publishers.
FAIR principles: Interpretations and implementation considerations
Jacobsen A, de Azevedo RM, Juty N, Batista D, Coles S, Cornet R, Courtot M, Crosas M, Dumontier M, et al. FAIR principles: Interpretations and implementation considerations. Data Intelligence [Internet]. 2020;2(1-2):10-29.
The FAIR principles have been widely cited, endorsed and adopted by a broad range of stakeholders since their publication in 2016. By intention, the 15 FAIR guiding principles do not dictate specific technological implementations, but provide guidance for improving Findability, Accessibility, Interoperability and Reusability of digital resources. This has likely contributed to the broad adoption of the FAIR principles, because individual stakeholder communities can implement their own FAIR solutions. However, it has also resulted in inconsistent interpretations that carry the risk of leading to incompatible implementations. Thus, while the FAIR principles are formulated on a high level and may be interpreted and implemented in different ways, for true interoperability we need to support convergence in implementation choices that are widely accessible and (re)-usable. We introduce the concept of FAIR implementation considerations to assist accelerated global participation and convergence towards accessible, robust, widespread and consistent FAIR implementations. Any self-identified stakeholder community may either choose to reuse solutions from existing implementations, or when they spot a gap, accept the challenge to create the needed solution, which, ideally, can be used again by other communities in the future. Here, we provide interpretations and implementation considerations (choices and challenges) for each FAIR principle.
Evaluating FAIR maturity through a scalable, automated, community-governed framework
Wilkinson MD, Dumontier M, Sansone S-A, Olavo L, Prieto M, Batista D, McQuilton P, Kuhn T, Rocca-Serra P, Crosas M, et al. Evaluating FAIR maturity through a scalable, automated, community-governed framework. Nature-Springer Scientific Data [Internet]. 2019;6(174).
Transparent evaluations of FAIRness are increasingly required by a wide range of stakeholders, from scientists to publishers, funding agencies and policy makers. We propose a scalable, automatable framework to evaluate digital resources that encompasses measurable indicators, open source tools, and participation guidelines, which come together to accommodate domain relevant community-defined FAIR assessments. The components of the framework are: (1) Maturity Indicators – community-authored specifications that delimit a specific automatically-measurable FAIR behavior; (2) Compliance Tests – small Web apps that test digital resources against individual Maturity Indicators; and (3) the Evaluator, a Web application that registers, assembles, and applies community-relevant sets of Compliance Tests against a digital resource, and provides a detailed report about what a machine “sees” when it visits that resource. We discuss the technical and social considerations of FAIR assessments, and how this translates to our community-driven infrastructure. We then illustrate how the output of the Evaluator tool can serve as a roadmap to assist data stewards to incrementally and realistically improve the FAIRness of their resources.

Recent Presentations

OpenDP, an open-source suite of tools for deploying differential privacy, at MLSE 2020, Monday, December 14, 2020:

Since it was introduced in 2006, differential privacy (DP) has become accepted as a gold standard for ensuring that individual-level information is not leaked through statistical analyses or machine learning on sensitive datasets. OpenDP comes at a time when computation and...

Serveis de dades i computació per al suport del cicle de recerca a una universitat, at CSUC, Tuesday, December 1, 2020:

This talk, entitled "Research Data and Computing Services to support the research lifecycle in a University", is part of a conference series on Research Data Management organized by the Consorci de Serveis Universitaris de Catalunya (CSUC). (The talk is given in Catalan, but the slides are in English.) The other two talks in this series are on the Data Commons and on FAIR Principles with Dataverse.
