Reproducibility Resources

This page features resources submitted by U-M data science researchers. Ensuring reproducible data science is no small task: computational environments may vary drastically and can change over time; specialized workflows might require infrastructure that is not readily available; sensitive projects might involve restricted data; the robustness of algorithmic decisions and parameter selections varies widely; and crucial steps where choices are made (e.g., wrangling, cleaning, handling missing data, preprocessing) might not be well documented. This resource collection is intended to help researchers tackle some of these challenges. If you would like to submit tools, publications, or other resources to be included on this page, please email