CoreLogic aggregates data from individual, parcel-level real estate transactions and financial records. We have licensed access to Tax, Deed, and Foreclosure data at the parcel level for every county in the United States.
These records are publicly available and gathered from county record offices across the country. Coverage dates vary by county; some county records go back 50 years. Coverage is more comprehensive from the 1990s to the present. The Tax data file contains only one year of data (2016 for most counties).
The dataset consists of multiple pipe-delimited text files organized into Tax, Deed and Foreclosure. Each file covers the whole US.
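Because each file covers the whole US, it may be easier to stream a file row by row than to load it all at once. The following is a minimal sketch of reading pipe-delimited text with Python's standard `csv` module. The column names (`fips_code`, `parcel_id`, `sale_amount`) and the helper `count_rows_by_field` are hypothetical and used only for illustration; the actual files have their own layout.

```python
import csv
import io
from collections import Counter


def count_rows_by_field(lines, field, delimiter="|"):
    """Stream pipe-delimited rows and count how many share each value of `field`."""
    # Assumption: the raw export does not use quoting, so quote characters
    # are treated as ordinary text (csv.QUOTE_NONE).
    reader = csv.DictReader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    counts = Counter()
    for row in reader:  # one row in memory at a time
        counts[row[field]] += 1
    return counts


# Hypothetical sample standing in for a real Tax, Deed, or Foreclosure file.
sample = io.StringIO(
    "fips_code|parcel_id|sale_amount\n"
    "26161|A-001|250000\n"
    "26161|A-002|310000\n"
    "26163|B-001|120000\n"
)

counts = count_rows_by_field(sample, "fips_code")
print(counts.most_common())
```

For a real file, an open file handle can be passed in place of the in-memory sample, since `csv.DictReader` accepts any iterable of lines.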
To access this data, please visit the link below:
If you have any questions about the datasets, please contact a librarian.
The Institute for Healthcare Policy and Innovation (IHPI) has more than 20 terabytes of data, from more than 113 million Americans, for researchers to study how healthcare works and how to make it better. IHPI’s data is provided primarily by large insurance companies in the form of administrative claims. These are proprietary datasets that cover both the commercial and private payer insurance sectors and give researchers a longitudinal accounting of millions of US patients’ healthcare utilization patterns.
For questions, please email email@example.com.
Lyft offers for download a comprehensive, large-scale dataset featuring raw camera and LiDAR sensor inputs as perceived by a fleet of multiple high-end autonomous vehicles in a bounded geographic area.
This dataset also includes high-quality, human-labelled 3D bounding boxes of traffic agents and an underlying HD spatial semantic map.
Michigan Genomics Initiative (MGI)
The Michigan Genomics Initiative is a collaborative research effort among physicians and researchers at the University of Michigan with the goal of harmonizing patient electronic medical records with genetic data to gain novel biomedical insights.
Twitter Decahose Dataset
University of Michigan researchers can access a compilation of tweets known as the “Decahose” (a 10% sample of all tweets) without charge. MIDAS, CSCAR and ARC-TS together manage and support the use of this data repository, including the historical archive of Decahose tweets and ongoing collection from the Decahose.
U-M researchers can use this data in five areas of research: information diffusion, natural language processing, network analysis, behavior analysis, and sociolinguistics.
For questions, please email Kristin Burgard, MIDAS Outreach and Partnership Manager, firstname.lastname@example.org.
Waymo Open Dataset
The Waymo Open Dataset comprises high-resolution sensor data collected by Waymo self-driving cars in a wide variety of conditions. The company is releasing this dataset publicly to aid the research community in making advancements in machine perception and self-driving technology.