As a board-certified ophthalmologist and glaucoma specialist, I have more than 15 years of clinical experience caring for patients with different types and complexities of glaucoma. In addition, as a health services researcher, I have developed expertise in several disciplines, including analyses of large health care claims databases to study utilization and outcomes of patients with ocular diseases, racial and other disparities in eye care, and associations between systemic conditions or medication use and ocular diseases. I have learned the nuances of various data sources and ways to maximize our use of these data sources to answer important and timely questions. Leveraging my background in health services research (HSR) with new skills in bioinformatics and precision medicine, over the past 2-3 years I have been developing and growing the Sight Outcomes Research Collaborative (SOURCE) repository, a powerful tool that researchers can tap into to study patients with ocular diseases. My team and I have spent countless hours devising ways of extracting electronic health record data from Clarity, cleaning and de-identifying the data, and making it linkable to ocular diagnostic test data (OCT, HVF, biometry) and non-clinical data. Having successfully developed such a resource here at Kellogg, I am now collaborating with colleagues at more than two dozen academic ophthalmology departments across the country to assist them with extracting their data in the same format and sending it to Kellogg, so that we can pool the data and make it accessible to researchers at all of the participating centers for research and quality improvement studies.
I am also actively exploring ways to integrate data from SOURCE into deep learning and artificial intelligence algorithms, use SOURCE data for genotype-phenotype association studies and the development of polygenic risk scores for common ocular diseases, capture patient-reported outcome data for the majority of eye care recipients, enhance visualization of the data on easy-to-access dashboards to aid quality improvement initiatives, and use the data to improve quality of care, safety, efficiency of care delivery, and clinical operations.
My research focuses on issues in data collection with hard-to-reach populations. In particular, I examine 1) nontraditional sampling approaches for minority or stigmatized populations and their statistical properties and 2) measurement error and comparability issues for racial, ethnic and linguistic minorities, which also have implications for cross-cultural research and survey methodology. Most recently, my research has been dedicated to respondent-driven sampling, which uses existing social networks to recruit participants in both face-to-face and Web data collection settings. I plan to expand my research scope by examining representation issues, focusing on racial/ethnic minority groups in the U.S. in the era of big data.
Dr. Niccolò Meneghetti is an Assistant Professor of Computer and Information Science at the University of Michigan-Dearborn.
His major research interests are in the broad area of database systems, with primary focus on probabilistic databases, statistical relational learning and uncertain data management.
I am interested in the evolutionary processes that originate “mega-diverse” biotic assemblages and the role of ecology in shaping the evolution of diversity. My program studies the evolution of Neotropical freshwater fishes, the most diverse freshwater fish fauna on earth, with an estimate exceeding 7,000 species. My lab combines molecular phylogenetics and phylogeny-based comparative methods to integrate ecology, functional morphology, life histories and geography into analyses of macroevolutionary patterns of freshwater fish diversification. We are also comparing patterns of diversification across major Neotropical fish clades. Relying on fieldwork and natural history collections, we use methods that span
Andrea Thomer is an assistant professor of information at the University of Michigan School of Information. She conducts research in the areas of data curation, museum informatics, earth science and biodiversity informatics, information organization, and computer supported cooperative work. She is especially interested in how people use and create data and metadata; the impact of information organization on information use; issues of data provenance, reproducibility, and integration; and long-term data curation and infrastructure sustainability. She is studying a number of these issues through the “Migrating Research Data Collections” project – a recently awarded Laura Bush 21st Century Librarianship Early Career Research Grant from the Institute of Museum and Library Services. Dr. Thomer received her doctorate in Library and Information Science from the School of Information Sciences at the University of Illinois at Urbana‐Champaign in 2017.
The long temporal and large spatial scales of ecological systems make controlled experimentation difficult and the amassing of informative data challenging and expensive. The resulting sparsity and noise are major impediments to scientific progress in ecology, which therefore depends on efficient use of data. In this context, it has in recent years been recognized that the onetime playthings of theoretical ecologists, mathematical models of ecological processes, are no longer exclusively the stuff of thought experiments, but have great utility in the context of causal inference. Specifically, because they embody scientific questions about ecological processes in sharpest form—making precise, quantitative, testable predictions—the rigorous confrontation of process-based models with data accelerates the development of ecological understanding. This is the central premise of my research program and the common thread of the work that goes on in my laboratory.
Michael Cafarella, PhD, is Associate Professor of Electrical Engineering and Computer Science, College of Engineering and Faculty Associate, Survey Research Center, Institute for Social Research, at the University of Michigan, Ann Arbor.
Prof. Cafarella’s research focuses on data management problems that arise from extreme diversity in large data collections. Big data is not just big in terms of bytes, but also in terms of type (e.g., a single hard disk likely contains relations, text, images, and spreadsheets) and structure (e.g., a large corpus of relational databases may have millions of unique schemas). As a result, certain long-held assumptions (e.g., that the database schema is always known before writing a query) are no longer useful guides for building data management systems. His work therefore focuses heavily on information extraction and data mining methods that can either improve the quality of existing information or work in spite of lower-quality information.
The basis of my work is to make the often invisible traces created by interactions students have with learning technologies available to instructors, technology solutions, and students themselves. This often requires the creation of novel educational technologies that are designed from the beginning with detailed tracking of user activities. Coupled with machine learning and data mining techniques (e.g., classification, regression, and clustering methods), clickstream data from these technologies is used to build predictive models of student success and to better understand how technology affords benefits in teaching and learning. I’m interested in broadly scaled teaching and learning through Massive Open Online Courses (MOOCs), how predictive models can be used to understand student success, and the analysis of educational discourse and student writing.
Jerome P. Lynch, PhD, is Professor and Donald Malloure Department Chair of the Civil and Environmental Engineering Department in the College of Engineering at the University of Michigan, Ann Arbor.
Prof. Lynch’s group works at the forefront of deploying large-scale sensor networks to the built environment for monitoring and control of civil infrastructure systems including bridges, roads, rail networks, and pipelines; this research portfolio falls within the broader class of cyber-physical systems (CPS). To maximize the benefit of the massive data sets they collect from operational infrastructure systems, they undertake research in the areas of relational and NoSQL database systems, cloud-based analytics, and data visualization technologies. In addition, their algorithmic work is focused on the use of statistical signal processing, pattern classification, machine learning, and model inversion/updating techniques to automate the interrogation of the sensor data collected. The ultimate aim of Prof. Lynch’s work is to harness the full potential of data science to provide system users with real-time, actionable information obtained from the raw sensor data collected.
His research focuses on building data-intensive systems that are more scalable, more robust, and more predictable. He draws from advanced statistical models to deliver practical database solutions to real-world problems. In particular, he adapts concepts and tools from applied statistics, optimization theory, and machine learning.