J.J. Prescott

Broadly, I study legal decision making, including decisions related to crime and employment. I typically use large social science databases, but I also collect my own data using technology or surveys.

Lu Wang

Lu’s research is focused on natural language processing, computational social science, and machine learning. More specifically, Lu works on algorithms for text summarization, language generation, argument mining, information extraction, and discourse analysis, as well as novel applications that apply such techniques to understand media bias and polarization and other interdisciplinary subjects.

Edgar Franco-Vivanco

Edgar Franco-Vivanco is an Assistant Professor of Political Science and a faculty associate at the Center for Political Studies. His research interests include Latin American politics, historical political economy, criminal violence, and indigenous politics.

Prof. Franco-Vivanco is interested in implementing machine learning tools to improve the analysis of historical data, in particular handwritten documents. He is also working on applying text analysis to the study of indigenous languages. In a parallel research agenda, he explores how marginalized communities interact with criminal organizations and abusive policing in Latin America. As part of this research, he is using NLP tools to identify different types of criminal behavior.

Examples of the digitization process of handwritten documents from colonial Mexico.

Elle O’Brien

My research focuses on building infrastructure for public health and health science research organizations to take advantage of cloud computing, strong software engineering practices, and MLOps (machine learning operations). By equipping biomedical research groups with tools that facilitate automation, better documentation, and portable code, we can improve the reproducibility and rigor of science while scaling up the kind of data collection and analysis possible.

Research topics include:
1. Open source software and cloud infrastructure for research,
2. Software development practices and conventions that work for academic units, like labs or research centers, and
3. The organizational factors that encourage best practices in reproducibility, data management, and transparency.

The practice of science is a tug of war between competing incentives: the drive to do a lot fast, and the need to generate reproducible work. As data grows in size, code increases in complexity, and the number of collaborators and institutions involved goes up, it becomes harder to preserve all the “artifacts” needed to understand and recreate your own work. Technical AND cultural solutions will be needed to keep data-centric research rigorous, shareable, and transparent to the broader scientific community.

Anne Fernandez

Dr. Fernandez is a clinical psychologist with extensive training in both addiction and behavioral medicine. She is the Clinical Program Director at the University of Michigan Addiction Treatment Service. Her research focuses on the intersection of addiction and health across two main themes: 1) expanding access to substance use disorder treatment and prevention services, particularly in healthcare settings; and 2) applying precision health approaches to addiction-related healthcare questions. Her current grant-funded research includes an NIH-funded randomized controlled pilot trial of a preoperative alcohol intervention; an NIH-funded precision health study that leverages electronic health records to identify high-risk alcohol use at the time of surgery using natural language processing and other machine-learning-based approaches; a University of Michigan-funded precision health award to understand and prevent new persistent opioid use after surgery using prediction modeling; and a federally funded evaluation of the state of Michigan’s substance use disorder treatment expansion.

Sara Lafia

I am a Research Fellow in the Inter-university Consortium for Political and Social Research (ICPSR) at the University of Michigan. My research is currently supported by an NSF project, Developing Evidence-based Data Sharing and Archiving Policies, where I am analyzing curation activities, automatically detecting data citations, and contributing to metrics for tracking the impact of data reuse. I hold a Ph.D. in Geography from UC Santa Barbara, and I have expertise in GIScience, spatial information science, and urban planning. My interests also include the Semantic Web, innovative GIS education, and the science of science. I have experience deploying geospatial applications, designing linked data models, and developing visualizations to support data discovery.

Xu Shi

My methodological research focuses on developing statistical methods for routinely collected healthcare databases such as electronic health records (EHR) and claims data. I aim to tackle the unique challenges that arise from the secondary use of real-world data for research purposes. Specifically, I develop novel causal inference methods and semiparametric efficiency theory that harness the full potential of EHR data to address comparative effectiveness and safety questions. I also develop scalable and automated pipelines for curation and harmonization of EHR data across healthcare systems and coding systems.

Nambi Nallasamy

Our team develops machine learning algorithms for the enhancement of outcomes in cataract surgery, the most commonly performed surgery in the world. Our work focuses on developing models for postoperative refraction after cataract surgery and analysis of surgical quality.

Frederick George Conrad

Fred Conrad’s research concerns the development of new methods and data sources for conducting social research. His work is largely focused on survey methodology, but he also explores the use of social media content as a complement to survey data and as a source of large-scale qualitative insights. His focus is on data quality and reducing measurement error. For example, live video interviews promote more thoughtful responses (e.g., less straightlining, the tendency to give the same answer to a battery of survey questions), but they also promote less candor when answering questions on sensitive topics. Measurement error in social media includes misclassification in the automated interpretation of content using methods such as sentiment analysis and topic modeling, as well as selective self-presentation (only posting flattering content). Equally challenging is not knowing the extent to which users differ from the population to which one might wish to generalize results.

Lisa Levinson

My research interests are in natural language semantics and psycholinguistics, focusing on verbs. I conduct behavioral psycholinguistic experiments with methodologies such as self-paced reading and maze tasks, as well as surveys of linguistic and semantic judgments. I also study semantic variation using corpora and datasets such as the Twitter Decahose to better understand how words have developed diverging meanings in different communities, age groups, or regions. I primarily use R and Python to collect, manage, and analyze data. I direct the UM WordLab in the linguistics department, working with students (especially undergraduates) on experimental and computational research focusing on lexical representations.