Aditi Misra

Transportation is the backbone of the urban mobility system and one of the largest sources of emissions and pollution. Making urban transportation efficient, equitable and sustainable is the main focus of my research. My students and I analyze small-scale survey data as well as large-scale spatiotemporal data to identify travel behavior trends and patterns at a disaggregate level using econometric methods, which we then scale up to the population level through predictive and statistical modeling. We also design our own data collection methods and instruments, be it a network of smart devices or stated preference experiments. Our expertise lies in identifying latent constructs that influence decisions and choices, which in turn dictate demands on the systems and subsystems. We use this expertise to design incentives and policy suggestions that can help promote sustainable and equitable multimodal transportation systems. Our team also uses data analytics, particularly classification and pattern recognition algorithms, to analyze crash context data and develop safety-critical scenarios for connected and automated vehicle (CAV) deployment. We have developed an online game based on such scenarios to promote safe shared mobility among teenagers and young adults, and we plan to expand research in that area. We are also currently expanding our research to explore the use of neural networks (NN) in context information synthesis.

This is a project where we used classification and Bayesian models to identify scenarios that are risky for pedestrians and bicyclists. We then developed an online game based on those scenarios for middle schoolers so that they are better prepared for shared road conflicts.

Lili Zhao

I have broad interests and expertise in developing statistical methodology and applying it in biomedical research. I have adapted methodologies, including Bayesian data analysis, categorical data analysis, generalized linear models, longitudinal data analysis, multivariate analysis, RNA-Seq data analysis, survival data analysis and machine learning methods, in response to the unique needs of individual studies and objectives without compromising the integrity of the research and results. Two main methods I have recently developed:
1) A risk prediction model for a survival outcome using high-dimensional predictors
I have developed a simple, fast yet sufficiently flexible statistical method to estimate the updated risk of renal disease over time using high-dimensional longitudinal biomarkers. The goal is to use all available high-dimensional data sources (e.g., routine clinical features, urine and serum markers measured at baseline and all follow-up time points) to efficiently and accurately estimate the updated risk of end-stage renal disease (ESRD).
2) A safety mining tool for vaccine safety study
I developed an algorithm for vaccine safety surveillance while incorporating adverse event ontology. Multiple adverse events may individually be rare enough to go undetected, but if they are related, they can borrow strength from each other to increase the chance of being flagged. Furthermore, borrowing strength induces shrinkage of related AEs, thereby also reducing headline-grabbing false positives.

Andrea Thomer

Andrea Thomer is an assistant professor of information at the University of Michigan School of Information. She conducts research in the areas of data curation, museum informatics, earth science and biodiversity informatics, information organization, and computer supported cooperative work. She is especially interested in how people use and create data and metadata; the impact of information organization on information use; issues of data provenance, reproducibility, and integration; and long-term data curation and infrastructure sustainability. She is studying a number of these issues through the “Migrating Research Data Collections” project – a recently awarded Laura Bush 21st Century Librarianship Early Career Research Grant from the Institute of Museum and Library Services. Dr. Thomer received her doctorate in Library and Information Science from the School of Information Sciences at the University of Illinois at Urbana-Champaign in 2017.

Yongqun (Oliver) He

My laboratory's data science research includes: (1) Ontology development. We have initiated and led the development of several community-based ontologies, including the Vaccine Ontology (VO), Ontology of Adverse Events (OAE), Cell Line Ontology (CLO), Ontology of Genes and Genomes (OGG), and Interaction Network Ontology (INO). (2) Ontology tool development. We have developed many ontology software programs, such as OntoFox and Ontobee, which are widely used for ontology reuse, ontology development, and ontology applications. (3) Literature mining, with a focus on ontology-based literature mining approaches. (4) Bayesian network (BN) modeling for analysis of gene interaction networks. We have also applied these ontologies, ontology-related approaches, and BN modeling in different data science domains including vaccinology, microbiology, immunology, and pharmacovigilance.

With ever-increasing quantities of big data, integrating, sharing, and analyzing these data has become a major challenge. Hundreds of biological interaction pathway resources are publicly available. While each of these resources is widely used, the data they contain typically overlap but are not integrated. This lack of integration results in redundant work and inefficient data usage. An ontology is a human- and computer-interpretable set of terms and relations that represent entities in a specific domain and how these entities relate to each other. As part of a funded MCubed Diamond project, we aim to ontologically and non-redundantly represent and integrate various molecular interactions, pathways, and networks. The integrated ontology of interaction pathways and networks will then be used by novel statistical and computational methods to efficiently address various scientific problems.