Reproducibility Challenge

2021 Reproducibility Challenge

The MIDAS Reproducibility Challenge II is open to researchers from any field that makes use of data, broadly construed. We seek entries in four categories:

Training materials for reproducible research. These should be well-developed materials for courses or workshops, with a clear theme that can be easily adopted by other instructors.

Guides, processes and tools that can be readily adopted by researchers in one or multiple research areas. Other researchers, even if only within a narrowly defined research field, should be able to follow such guides, processes and tools to make their work reproducible.

A template for a reproducible project. The goal is to allow others to emulate and develop similar approaches and lower the burden of making their projects reproducible. We especially welcome submissions that seek to make analyses not only reproducible, but transparent. This might include especially well-organized or well-documented code (for example, using notebooks); saving intermediate datasets / results so that they can be easily inspected; and code that draws attention to and allows easy modification of analytical choices such as statistical tuning parameters, data cleaning choices, etc.

An analysis of reproducibility in a research field, including meta-analysis or reproducing published works, to identify good practices for and challenges to replication and confirmation of published works, and proposals for new practices and tools to overcome barriers. A research field needs to be well defined, but can be broad or narrow, e.g. “single-cell RNAseq”, or “child development research with video data”.

A significant challenge across scientific fields is the reproducibility of research results, in both the narrow sense of repeating the calculations using the same data, and in the broader sense of deriving generalized insight from one or more investigations that bear on a common question. Ensuring that data science research results can be reliably reproduced is particularly challenging, especially as data science projects involve increasingly complex data and pipelines. In addition, when data science methodology is applied to vastly different research domains, the reproducibility of results hinges on domain-specific factors as well as components of the data science methodology.  

The 2021 MIDAS Reproducibility Challenge winners present principled yet creative work. Krüger and Nordås provide a template for a principled approach to organizing, documenting and sharing a computational project. Teplitskiy and Evans use an automated and rigorous method to assess the reproducibility and replicability of survey data research and delineate causes for the lack of replicability. Zhao and Wen propose a Bayesian model criticism approach to assess the replicability of a body of research studies that address the same scientific question with variations in data sets and analytical methods.

Through the 2020 and 2021 Reproducibility Challenges, MIDAS seeks exemplary work from our researchers to make data science projects more reproducible, and research outcomes more robust. We now have a wonderful collection of our researchers’ work to improve reproducibility in data science, and we are building collaboration with these researchers to enable the wide adoption of best practices. 

Each of the winning teams will receive a cash prize. Later in the semester, they will present their work through a series of video showcases. Their projects will also be added to the MIDAS Reproducibility Hub. 

Congratulations again to the winners.

MIDAS Reproducibility Challenge Winners

March 3rd, 2022 – MIDAS is excited to announce the 2021 Reproducibility Challenge winners and congratulates them for their outstanding work to make data science research more reproducible.

Statistical Methods for Replicability Assessment

Yi Zhao (Biostatistics)
Xiaoquan Wen (Biostatistics)

Assessment of replicability is critical to ensure the quality and rigor of scientific research. In this work, we discuss inference and modeling principles for replicability assessment. Targeting distinct application scenarios, we propose two types of Bayesian model criticism approaches to identify potentially irreplicable results in scientific experiments. They are motivated by established Bayesian prior and posterior predictive model-checking procedures and generalize many existing replicability assessment methods. Finally, we discuss the statistical properties of the proposed replicability assessment approaches and illustrate their use with a real-data example: a systematic reanalysis of data from the Reproducibility Project: Psychology.
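
As a rough illustration of the posterior predictive idea mentioned above, the sketch below asks how surprising a replication estimate is given the original estimate. It is not the authors' method. It assumes a textbook normal model with a flat prior on a shared true effect, under which the replication estimate is predicted to follow a normal distribution centered on the original estimate with variance equal to the sum of the two squared standard errors. The function name and its inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def predictive_replication_pvalue(est_orig, se_orig, est_rep, se_rep):
    """Two-sided posterior predictive p-value for a replication estimate.

    Assumption (not from the source): both studies estimate the same true
    effect with normal errors, and the prior on that effect is flat. Then,
    given the original study, the replication estimate is distributed as
    Normal(est_orig, se_orig**2 + se_rep**2).
    """
    predictive_sd = sqrt(se_orig ** 2 + se_rep ** 2)
    z = abs(est_rep - est_orig) / predictive_sd
    # Probability of a replication estimate at least this far from the original.
    return 2.0 * (1.0 - NormalDist().cdf(z))
```

A small value suggests the two results are hard to reconcile under the shared-effect assumption and may deserve a closer look; a value near 1 means the replication is about what the original study would predict.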

Data Processing Principles for Auditable and Reproducible Research

Jule Krüger (Center for Political Studies)
Ragnhild Nordås (Political Science)

In this project, we demonstrate a principled approach to data processing that allows researchers to automate their empirical research and make it scalable, auditable and reproducible. The approach comprises several principles: conceptualizing an empirical research project as tasks; organizing the project directory along these tasks; separating tasks into input files, source code, manual work and output files; using makefiles to automate and document the data workflow with regard to targets and dependencies; manually separating constants from code; adopting naming conventions that are self-explanatory; version controlling one’s work; and publishing a public repository. Combined, these principles offer several benefits. They provide transparency about how the research data was manipulated, whether manually or computationally. They also quickly convey the role of each project file based on its name and location in the project directory, and they make sensitivity analyses of modeling parameters and coding decisions easy to run. A practical example of this principled approach to data processing is available at https://github.com/juleka/JPR-LVM-SVAC.
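
Two of the principles above, the per-task directory layout and the makefile rule of targets and dependencies, can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the layout used in the linked repository: the sub-directory names in `TASK_SUBDIRS` and both function names are hypothetical.

```python
from pathlib import Path
from typing import Iterable

# Hypothetical sub-directory names for the split into input files,
# source code, manual work and output files.
TASK_SUBDIRS = ("input", "src", "hand", "output")


def create_task(project_root: Path, task_name: str) -> Path:
    """Create one task directory containing the four standard sub-directories."""
    task_dir = project_root / task_name
    for sub in TASK_SUBDIRS:
        (task_dir / sub).mkdir(parents=True, exist_ok=True)
    return task_dir


def needs_rebuild(target: Path, dependencies: Iterable[Path]) -> bool:
    """Apply the rule make uses: rebuild when the target is missing
    or older than at least one of its dependencies."""
    if not target.exists():
        return True
    target_mtime = target.stat().st_mtime
    return any(dep.stat().st_mtime > target_mtime for dep in dependencies)
```

In a real project a makefile would express the same rule declaratively; the function only makes the underlying timestamp comparison explicit, which is what lets a workflow skip tasks whose outputs are already up to date.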

How Firm is Sociological Knowledge: Reanalysis of GSS findings with alternative models and out-of-sample data, 1972-2012

Misha Teplitskiy (School of Information)
James Evans (Sociology, University of Chicago)

This project proposes a method to empirically measure the replicability of social science literatures that rely on survey data. Published findings in such literatures may be fragile because hypotheses were tailored to fit the data (“data mining” or “p-hacking”) or because the world has changed and once robust relationships no longer hold (“social change”). The proposed method measures failure to replicate due to both causes. To exemplify the method, we apply it to findings from hundreds of publications that use the General Social Survey, 1972-2012. To measure the consequences of data mining, we estimate the published regression models on the original data and compare them to alternative specifications on the same data. To measure the consequences of social change, we estimate the published models on the original data and compare them to the same models estimated on subsequent waves of the GSS, which came out after publication. There is evidence that both mechanisms decrease replicability: the number of significant coefficients, standardized coefficient sizes, and R² are significantly reduced in both analyses. However, the reduction in significance levels and effect sizes is larger for the social change mechanism. Our findings suggest that social scientists are engaged in only moderate data mining; a bigger concern is the decreased relevance of older published knowledge to the contemporary world.
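
The social change comparison described above, in which the same model is estimated on the original data and on a later wave, can be illustrated with a toy single-predictor case. This is a simplified sketch, not the authors' pipeline: the published models have many covariates, and the function names and the numbers below are made up for illustration only.

```python
from statistics import mean, pstdev


def standardized_slope(x, y):
    """OLS slope of y on x after standardizing both variables.

    With a single predictor this equals the Pearson correlation.
    Both x and y must vary, otherwise the standard deviation is zero.
    """
    mx, my = mean(x), mean(y)
    covariance = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return covariance / (pstdev(x) * pstdev(y))


def relative_shrinkage(coef_original, coef_later):
    """Share of the original effect size lost in the later sample."""
    return (coef_original - coef_later) / coef_original


# Made-up toy data standing in for an original wave and a later wave.
predictor = [1, 2, 3, 4, 5]
outcome_original_wave = [2, 4, 6, 8, 10]
outcome_later_wave = [2, 1, 4, 3, 5]

coef_original = standardized_slope(predictor, outcome_original_wave)
coef_later = standardized_slope(predictor, outcome_later_wave)
print(relative_shrinkage(coef_original, coef_later))
```

The data mining comparison would follow the same pattern, except that the second estimate comes from an alternative model specification fitted to the original data rather than from a later wave.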

Planning Committee

George Alter

Research Professor, ICPSR
Professor of History

Jacob Carlson

Director of Deep Blue Repository and Research Data Services, U-M

Johann Gagnon-Bartsch

Assistant Professor, Statistics

Bryan Goldsmith

Assistant Professor, Chemical Engineering

H.V. Jagadish

MIDAS Director
Professor, Electrical Engineering and Computer Science

Jing Liu

Managing Director, MIDAS

Thomas Valley

Assistant Professor, U-M Department of Internal Medicine

2020 Reproducibility Challenge

A significant challenge across scientific fields is the reproducibility of research results, and third-party assessment of such reproducibility. Ensuring that results can be reliably reproduced is no small task: computational environments may vary drastically and can change over time, rendering code unable to run; specialized workflows might require specialized infrastructure not easily available; sensitive projects might involve data that cannot be directly shared; the robustness of algorithmic decisions and parameter selections varies widely; data collection methods may include crucial steps (e.g. wrangling, cleaning, missingness mitigation strategies, preprocessing) where choices are made but not well-documented. Yet a cornerstone of science remains the ability to verify and validate research findings, so it is important to find ways to overcome these challenges.

The first MIDAS Reproducibility Challenge was held in the first 8 months of 2020. Our goal was to highlight high-quality, reproducible work at the University of Michigan by collecting examples of best practices across diverse fields. Besides incentivizing reproducible workflows and enabling a deeper understanding of issues of reproducibility, the results of the challenge also provide templates that others can follow.

Judges

  • Jake Carlson (Manager, Deep Blue Repositories and Research Data Services, U-M Libraries)
  • H.V. Jagadish (Director, MIDAS, and Professor, Computer Science and Engineering, CoE)
  • Matthew Kay (Assistant Professor, School of Information)
  • Jing Liu (Managing Director, MIDAS)
  • Josh Pasek (Assistant Professor, Communication and Media, LSA)
  • Brian Puchala (Assistant Research Scientist, Materials Science and Engineering, CoE)
  • Arvind Rao (Associate Professor, Computational Medicine and Bioinformatics, and Radiation Oncology, Med. School)

MIDAS Reproducibility Challenge Winners

Category B

Exact reproducibility

Everyday Reproducibility: A multi-pronged approach to ensure analyses are fully reproducible, easy to access, and easy to use

Johann A. Gagnon-Bartsch (Statistics, University of Michigan)
Yotam Shem-Tov (Economics, UCLA)
Gregory J. Hunt (Mathematics, College of William & Mary)
Mark A. Dane (Biomedical Engineering, Oregon Health & Science University)
James E. Korkola (Biomedical Engineering, Oregon Health & Science University)
Laura M. Heiser (Biomedical Engineering, Oregon Health & Science University)
Saskia Freytag (Medical Biology, University of Melbourne)
Melanie Bahlo (Medical Biology, University of Melbourne)

Category B

Exact reproducibility

Statistical code sharing: a guide for clinical researchers

Thomas S. Valley (Internal Medicine, University of Michigan)
Neil Kamdar (Institute for Healthcare Policy and Innovation, University of Michigan)
Wyndy L. Wiitala (VA Center for Clinical Management Research)
Andrew M. Ryan (Institute for Healthcare Policy and Innovation, University of Michigan)
Sarah M. Seelye (VA Center for Clinical Management Research)
Akbar K. Waljee (Biomedical Engineering, Oregon Health & Science University)
Saskia Freytag (Internal Medicine, University of Michigan)
Brahmajee K. Nallamothu (Internal Medicine, University of Michigan)

Category C

Generalizable tools

Reproducible Materials Simulation and Analysis Workflows

Sharon C. Glotzer (Chemical Engineering, Biointerfaces Institute, University of Michigan)
Karen Coulter (Chemical Engineering, Glotzer Group Lab, University of Michigan)
Joshua Anderson (Chemical Engineering, Biointerfaces Institute, University of Michigan)
Timothy Moore (Chemical Engineering, Biointerfaces Institute, University of Michigan)
Allen LaCour (Chemical Engineering, Biointerfaces Institute, University of Michigan)
Kelly Wang (Macromolecular Science and Engineering, Biointerfaces Institute, University of Michigan)

Category D

Robustness

Translating Strategies for Promoting Engagement in Mobile Health: A Micro-randomized Feasibility Trial

Inbal Nahum-Shani (ISR, University of Michigan)
Mashfiqui Rabbi (Statistics, Harvard University)
Jamie Yap (ISR, University of Michigan)
Meredith L. Philyaw-Kotov (Psychiatry and Addiction Center, University of Michigan)
Predrag Klasnja (School of Information, University of Michigan)
Erin E. Bonar (Psychiatry and Addiction Center, University of Michigan)
Rebecca M. Cunningham (Vice President of Research, University of Michigan)
Susan A. Murphy (Statistics & Computer Science, Harvard University)
Maureen A. Walton (Psychiatry and Addiction Center, University of Michigan)

MIDAS Reproducibility Challenge Honorable Mentions

Category A

Theory

INTRIGUE: Quantify and Control Reproducibility in High-throughput Experiments

Johann A. Gagnon-Bartsch (Statistics, University of Michigan)
Yotam Shem-Tov (Economics, UCLA)
Gregory J. Hunt (Mathematics, College of William & Mary)
Mark A. Dane (Biomedical Engineering, Oregon Health & Science University)
James E. Korkola (Biomedical Engineering, Oregon Health & Science University)
Laura M. Heiser (Biomedical Engineering, Oregon Health & Science University)
Saskia Freytag (Medical Biology, University of Melbourne)
Melanie Bahlo (Medical Biology, University of Melbourne)
