Data and AI Intensive Research with Rigor and Reproducibility
The rigor of scientific research and the reproducibility of research results are essential for the validity of research findings and the trustworthiness of science. However, research rigor and reproducibility remain a significant challenge across scientific fields, especially for research with complex data types from heterogeneous sources and long data manipulation pipelines. This is particularly critical as data science and Artificial Intelligence (AI) methods emerge at lightning speed and researchers scramble to seize the opportunities that the new methods bring.
While researchers recognize the importance of rigor and reproducibility, they often lack the resources and the technical know-how to achieve this consistently in practice. With funding from the National Institutes of Health, a multi-university team will develop a nationwide program to equip biomedical researchers with the skills needed to improve the rigor and reproducibility of their research, and help them transfer such skills to their trainees.
The new Data and AI Intensive Research with Rigor and Reproducibility (DAIR3) program will include annual bootcamps that focus on ethical issues in biomedical data science; data management, representation, and sharing; rigorous analytical design; the design and reporting of AI models; reproducible workflows; and assessing findings across studies. Trainees will then be guided over a one-year period to incorporate the newly acquired mindset, skills, and tools into their research and to develop training for their own institutions. Eventually, the program team will develop online courses based on the bootcamp training materials and will offer online instruction in both English and Spanish.
The DAIR3 team and instructors include faculty and staff research leaders from the University of Michigan, the College of William and Mary, Jackson State University (a Historically Black University), and the University of Texas at San Antonio (a Hispanic-Serving Institution). This highly diverse team will model the culture of diversity that we promote, and will support trainees who are demographically, professionally, and scientifically diverse and who come from a wide range of institutions, including those with limited resources.
The first round of bootcamps will be offered in the summer of 2024 for 100 attendees, with 50 full scholarships to support trainees from Minority-Serving Institutions, underrepresented demographic groups, and resource-constrained institutions. Applications will be accepted starting in December 2023.
Program Team
Victoria Bigelow
Evaluation Coordinator and Adjunct Lecturer in Educational Studies, Marsal School of Education, University of Michigan
Johann Gagnon-Bartsch
Associate Professor of Statistics, College of Literature, Science, and the Arts, University of Michigan
Juan Gutierrez
co-Principal Investigator
Professor, Chair of Mathematics, University of Texas at San Antonio
H. V. Jagadish
Edgar F. Codd Distinguished University Professor and Bernard A. Galler Collegiate Professor of Electrical Engineering and Computer Science; MIDAS Director, University of Michigan
Arvind Rao
Associate Professor of Computational Medicine and Bioinformatics; Associate Professor of Radiation Oncology, Medical School and Associate Professor of Biostatistics, School of Public Health, University of Michigan
Kerby Shedden
Professor of Statistics, College of Literature, Science, and the Arts; Professor of Biostatistics, School of Public Health and Center Director, Statistical Consultation and Research, University of Michigan
Matthew VanEseltine
Assistant Research Scientist, Survey Research Center, Institute for Social Research, University of Michigan