Reproducibility in Computational Science with Victoria Stodden (UIUC)
Event Recap
Reproducibility is Not a Crisis. Now What? Next Steps for Advancing Computational and Data-enabled Science
“There is no crisis, but also no time for complacency,” said the chair of the National Academies of Sciences, Engineering, and Medicine (NASEM) committee on “Reproducibility and Replicability in Science” in May 2019 (https://vimeo.com/335923468). Questions about reproducibility have arisen around the transparency of computational methods and discovery, in part because the use of data and compute resources for scientific and engineering advances is now pervasive across a staggeringly broad range of academic disciplines and activities. In this talk, I present the reproducibility definitions that emerged from our NASEM committee deliberations and discuss an abstract framework for conceptualizing and advancing data science as a discipline, called the Lifecycle of Data Science (forthcoming in CACM). This framework integrates the disparate components of data-enabled discovery, from hardware provisioning, to applications, to dissemination standards for verification and re-use, to ethics, and thereby brings into contextual focus salient issues such as computational reproducibility, standards and policy, and curricular development. I then present the “Knowledge Integrator,” an effort to conceptualize and enable the dissemination of reproducible research results based on the Lifecycle of Data Science and community engagement.