Crosas M, Honaker J, King G, Sweeney L. Automating Open Science for Big Data. ANNALS of the American Academy of Political and Social Science [Internet]. 2015;659 (1) :260-273.

The vast majority of social science research uses small (megabyte- or gigabyte-scale) datasets. These fixed-scale datasets are commonly downloaded to the researcher’s computer where the analysis is performed. The data can be shared, archived, and cited with well-established technologies, such as the Dataverse Project, to support the published results. The trend toward big data—including large-scale streaming data—is starting to transform research and has the potential to impact policymaking as well as our understanding of the social, economic, and political problems that affect human societies. However, big data research poses new challenges to the execution of the analysis, archiving and reuse of the data, and reproduction of the results. Downloading these datasets to a researcher’s computer is impractical, leading to analyses taking place in the cloud, and requiring unusual expertise, collaboration, and tool development. The increased amount of information in these large datasets is an advantage, but at the same time it poses an increased risk of revealing personally identifiable sensitive information. In this article, we discuss solutions to these new challenges so that the social sciences can realize the potential of big data.

Pepe A, Goodman A, Muench A, Crosas M, Erdmann C. How Do Astronomers Share Data? Reliability and Persistence of Datasets Linked in AAS Publications and a Qualitative Study of Data Practices among US Astronomers. PLoS ONE. 2014;9.
Goodman A, Pepe A, Blocker AW, Borgman CL, Cranmer K, Crosas M, Di Stefano R, Gil Y, Groth P, Hedstrom M, et al. Ten simple rules for the care and feeding of scientific data. PLoS Computational Biology. 2014;10.
Crosas M. A data sharing story. Journal of eScience Librarianship [Internet]. 2013;1 :7.
Altman M, Crosas M. The evolution of data citation: From principles to implementation. IASSIST Quarterly [Internet]. 2013;37.
Rajasekar A, Sankaran S, Lander H, Carsey T, Crabtree J, Crosas M, King G, Kum H-C, Zhan J. Sociometric Methods for Relevancy Analysis of Long Tail Science Data, in Social Computing (SocialCom), 2013 International Conference on. IEEE ; 2013 :1–6.
Knapp GR, Crosas M, Young K, Ivezić Ž. Atomic carbon in the envelopes of carbon-rich post-asymptotic giant branch stars. The Astrophysical Journal. 2000;534 :324.
Knapp GR, Young K, Crosas M. The Circumstellar Envelope of pi Gru. arXiv preprint astro-ph/9903338. 1999.
Sakamoto K, Scoville NZ, Yun MS, Crosas M, Genzel R, Tacconi LJ. Counterrotating nuclear disks in Arp 220. The Astrophysical Journal. 1999;514 :68.
Wood K, Crosas M, Ghez A. GG Tauri's Circumbinary Disk: Models for Near-Infrared Scattered-Light Images and 13CO (J = 1→0) Line Profiles. The Astrophysical Journal. 1999;516 :335.
Knapp GR, Dobrovolsky SI, Ivezic Z, Young K, Crosas M, Mattei JA, Rupen MP. The light curve and evolutionary status of the carbon star V Hya. arXiv preprint astro-ph/9907234. 1999.
Crosas M, Menten KM, Young K, Phillips TG. Radiative Transfer in a Turbulent Expanding Molecular Envelope: Application to Mira. In: Dust and Molecules in Evolved Stars. Springer ; 1998. pp. 189–192.
Crosas M, Weisheit J. Spallation in Active Galactic Nuclei. The Astrophysical Journal. 1996;465 :659.
Crosas M, Weisheit J. Cosmic Rays in AGNs. Revista Mexicana de Astronomia y Astrofisica. 1993;27 :107.
Crosas M, Weisheit JC. Hydrogen molecules in quasar broad-line regions. Monthly Notices of the Royal Astronomical Society. 1993;262 :359–368.