Chris Brooks
Assistant Professor of Information – School of Information, University of Michigan
In collaboration with Josh Gardner (University of Washington) and Ryan Baker (University of Pennsylvania)
replicate.education: lessons learned building a platform for educational data science replications
Abstract: This presentation will outline a program of reproducibility research in the domain of learning analytics, with a specific focus on predictive models of student success. In it we will detail three contributions made to the field over the past two years: (a) reproductions of work from other scholars in the field, leading to an understanding of the (lack of) generalizability of reported findings and contributing a clearer understanding of the nuance behind said activities, (b) dissemination and engagement with the Educational Data Mining (EDM) and Learning Analytics and Knowledge (LAK) communities through the replicate.education workshops, which surfaced both problems and solutions around reproducing student success models, and (c) an open-source software infrastructure, the MOOC Replication Framework (MORF), which can be used to reproduce results on larger-scale datasets.
The Reproducibility Showcase features a series of online presentations and tutorials from May to August, 2020. Presenters are selected from the MIDAS Reproducibility Challenge 2020.
A significant challenge across scientific fields is the reproducibility of research results, and third-party assessment of such reproducibility. The goal of the MIDAS Reproducibility Challenge is to highlight high-quality, reproducible work at the University of Michigan by collecting examples of best practices across diverse fields. We received a large number of entries that illustrate wonderful work in the following areas:
- Theory – A definition of reproducibility and what aspects of reproducibility are critical in a particular domain or in general.
- Reproducing a Particular Study – A comprehensive record of parameters and code that allows others to reproduce the results of a particular project.
- Generalizable Tools – A general platform for coding or running analyses that standardizes the methods for reproducible results across studies.
- Robustness – Metadata, tools and processes to improve the robustness of results to variations in data, computational hardware and software, and human decisions.
- Assessments of Reproducibility – Methods to test the consistency of results from multiple projects, such as meta-analysis or the provision of parameters that can be compared across studies.
- Reproducibility under Constraints – Sharing code and/or data to reproduce results without violating privacy or other restrictions.
On Sept. 14, 2020, MIDAS will also host a Reproducibility Day, which is a workshop on concepts and best practices of research reproducibility. Please save the date on your calendar.