MIDAS helps U-M researchers develop new datasets and make datasets machine-learning-ready through training and funding opportunities. Highlighted below are examples of datasets that have been supported through MIDAS grants (e.g., the Propelling Original Data Science (PODS) program) or developed by MIDAS-affiliated faculty, postdocs, and other researchers.

If you have questions about how to make your dataset machine-learning-ready, or if you have datasets that you would like to share with the data science and AI community, please email midas-research@umich.edu.

Sharing and Preserving Your Data

MIDAS recommends looking into existing data repositories on campus, such as Deep Blue Data (DBD) or the Inter-university Consortium for Political and Social Research (ICPSR), to share and preserve your research data. Deep Blue Data is a repository for data developed at the University of Michigan and is offered by the University of Michigan Library, which provides various services and consultation to depositors. ICPSR is an international consortium of more than 750 academic institutions and research organizations, housed at the Institute for Social Research at U-M, that offers deposit services for both public-use and restricted-use datasets for the social science community. ICPSR also has a variety of services and resources available for depositors.

Social Media Data

The availability of social media data has undergone major changes recently. If you use social media data for research, please see the Social Media Archives (SOMAR) and additional resources available through ICPSR, including the Meta Content Library and API.

University of Michigan Library Data Resources and Services

The U-M Library purchases and licenses access to data in a variety of formats to support the academic research of U-M faculty, students, and staff. Start here if you are seeking a dataset in a particular topical area, if you need support accessing a dataset (including commercially available datasets not yet licensed by the library), or if you need other resources and support to access, organize, visualize, and preserve research data.


Dataset: Index to Loans on Veterans Administration Guaranteed Mortgages, [United States], 1946-1954

Project: Images to Integrated Data: Piloting new methods to digitize, parse, and link historical records

Description: The Images to Integrated Data project (I2I) team, funded through the MIDAS PODS program, developed a workflow for digitizing, parsing, and linking physical historical records. They used a collection of 25,744 index card records from the administration of G.I. Bill mortgages from 1946 to 1954, housed at the National Archives in College Park, Maryland. This work enabled a follow-up project that uses these records to analyze racial equity in the mortgage guarantee program’s implementation and the impact of unequal implementation on Black veterans and communities.

Project team: J. Trent Alexander, Sara Lafia, and David Bleckley (ICPSR, University of Michigan), and Katie Genadek (U.S. Census Bureau)

Link: https://doi.org/10.3886/ICPSR38906.v1

Code: https://github.com/ICPSR/gi-bill

Related publications: https://doi.org/10.1108/jd-03-2023-0055

“The PODS grant allowed us to digitize and improve a physical resource that has sat unused at the National Archives for over 70 years.”