(734) 764-9418
Applications: Economics, Human Trafficking
Methodologies: Big Data, Feature Engineering, Information Extraction

Michael Cafarella

Assistant Professor, Electrical Engineering and Computer Science

Affiliation(s):

Institute for Social Research

My research focuses on data management problems that arise from extreme diversity in large data collections. Big data is big not just in terms of bytes, but also in type (e.g., a single hard disk likely contains relations, text, images, and spreadsheets) and structure (e.g., a large corpus of relational databases may have millions of unique schemas). As a result, certain long-held assumptions (for example, that the database schema is always known before a query is written) are no longer useful guides for building data management systems. My work therefore focuses heavily on information extraction and data mining methods that can either improve the quality of existing information or work in spite of lower-quality information.

A peek inside a Michigan data center! My students and I visit whenever I teach EECS485, which covers many modern data-intensive methods and their application to the Web.
