I’m a data technologist and researcher, currently holding two roles at Harvard University: University Research Data Management Officer with Harvard University Information Technology (HUIT), and Chief Data Science and Technology Officer at Harvard's Institute for Quantitative Social Science. My career has included research in astrophysics, the design and implementation of software for astronomical observations, the development of learning and data management systems for education and biotechnology, and now the leadership of software platforms and tools for research data sharing and analysis across all research fields.

What am I interested in? Advancing open science to facilitate access to and reuse of research data and code while preserving privacy, building software to enhance the quality and productivity of scientific outcomes, improving research data management, and establishing data-centric multidisciplinary collaborations with the aid of technology and a human touch.

Recent Publications

Repository Approaches to Improving the Quality of Shared Data and Code. MDPI Data [Internet]. 2021.
Sharing data and code for reuse has become increasingly important in scientific work over the past decade. However, in practice, shared data and code may be unusable, or published results obtained from them may be irreproducible. Data repository features and services contribute significantly to the quality, longevity, and reusability of datasets. This paper presents a combination of original and secondary data analysis studies focusing on computational reproducibility, data curation, and gamified design elements that can be employed to indicate and improve the quality of shared data and code. The findings of these studies are sorted into three approaches that can be valuable to data repositories, archives, and other research dissemination platforms.
Alexander S, Jones K, Bennett N, Budden A, Cox M, Crosas M, Game E, Geary J, Hardy D, Johnson J, et al. Qualitative data sharing and synthesis for sustainability science. Nature Sustainability [Internet]. 2020;3:81-88.
Socio–environmental synthesis as a research approach contributes to broader sustainability policy and practice by reusing data from disparate disciplines in innovative ways. Synthesizing diverse data sources and types of evidence can help to better conceptualize, investigate and address increasingly complex socio–environmental problems. However, sharing qualitative data for re-use remains uncommon when compared to sharing quantitative data. We argue that qualitative data present untapped opportunities for sustainability science, and discuss practical pathways to facilitate and realize the benefits from sharing and reusing qualitative data. However, these opportunities and benefits are also hindered by practical, ethical and epistemological challenges. To address these challenges and accelerate qualitative data sharing, we outline enabling conditions and suggest actions for researchers, institutions, funders, data repository managers and publishers.
Jacobsen A, de Azevedo RM, Juty N, Batista D, Coles S, Cornet R, Courtot M, Crosas M, Dumontier M, et al. FAIR principles: Interpretations and implementation considerations. Data Intelligence [Internet]. 2020;2(1-2):10-29.
The FAIR principles have been widely cited, endorsed and adopted by a broad range of stakeholders since their publication in 2016. By intention, the 15 FAIR guiding principles do not dictate specific technological implementations, but provide guidance for improving Findability, Accessibility, Interoperability and Reusability of digital resources. This has likely contributed to the broad adoption of the FAIR principles, because individual stakeholder communities can implement their own FAIR solutions. However, it has also resulted in inconsistent interpretations that carry the risk of leading to incompatible implementations. Thus, while the FAIR principles are formulated on a high level and may be interpreted and implemented in different ways, for true interoperability we need to support convergence in implementation choices that are widely accessible and (re)-usable. We introduce the concept of FAIR implementation considerations to assist accelerated global participation and convergence towards accessible, robust, widespread and consistent FAIR implementations. Any self-identified stakeholder community may either choose to reuse solutions from existing implementations, or when they spot a gap, accept the challenge to create the needed solution, which, ideally, can be used again by other communities in the future. Here, we provide interpretations and implementation considerations (choices and challenges) for each FAIR principle.

Recent Presentations

OpenDP, an open-source suite of tools for deploying differential privacy, at MLSE 2020, Monday, December 14, 2020:

Since it was introduced in 2006, differential privacy (DP) has become accepted as a gold standard for ensuring that individual-level information is not leaked through statistical analyses or machine learning on sensitive datasets. OpenDP comes at a time when computation and...
