I am a Research Fellow in the Inter-university Consortium for Political and Social Research (ICPSR) at the University of Michigan. My research is currently supported by an NSF project, Developing Evidence-based Data Sharing and Archiving Policies, in which I analyze curation activities, automatically detect data citations, and contribute to metrics for tracking the impact of data reuse. I hold a Ph.D. in Geography from UC Santa Barbara and have expertise in GIScience, spatial information science, and urban planning. My interests also include the Semantic Web, innovative GIS education, and the science of science. I have experience deploying geospatial applications, designing linked data models, and developing visualizations to support data discovery.
Transportation is the backbone of the urban mobility system and one of the greatest sources of environmental emissions and pollution. Making urban transportation efficient, equitable, and sustainable is the main focus of my research. My students and I analyze small-scale survey data as well as large-scale spatiotemporal data to identify travel behavior trends and patterns at a disaggregate level using econometric methods, which we then scale up to the population level through predictive and statistical modeling. We also design our own data collection methods and instruments, ranging from networks of smart devices to stated preference experiments. Our expertise lies in identifying latent constructs that influence decisions and choices, which in turn dictate demands on the systems and subsystems. We use this expertise to design incentives and policy suggestions that can help promote sustainable and equitable multimodal transportation systems. Our team also uses data analytics, particularly classification and pattern recognition algorithms, to analyze crash context data and develop safety-critical scenarios for connected and automated vehicle (CAV) deployment. We have developed an online game based on such scenarios to promote safe shared mobility among teenagers and young adults, and we plan to expand research in that area. We are also expanding our research to explore the use of neural networks (NN) in context information synthesis.
This is a project where we used classification and Bayesian models to identify scenarios that are risky for pedestrians and bicyclists. We then developed an online game based on those scenarios for middle schoolers so that they are better prepared for shared road conflicts.
I have broad interests and expertise in developing statistical methodology and applying it in biomedical research. I have adapted methodologies, including Bayesian data analysis, categorical data analysis, generalized linear models, longitudinal data analysis, multivariate analysis, RNA-Seq data analysis, survival data analysis, and machine learning methods, in response to the unique needs of individual studies and objectives without compromising the integrity of the research and results. I have recently developed two main methods:
1) A risk prediction model for a survival outcome using high-dimensional predictors
I have developed a simple, fast, yet sufficiently flexible statistical method to estimate the updated risk of renal disease over time using high-dimensional longitudinal biomarkers. The goal is to use all available high-dimensional data sources (e.g., routine clinical features, and urine and serum markers measured at baseline and at all follow-up time points) to efficiently and accurately estimate the updated ESRD risk.
2) A safety mining tool for vaccine safety studies
I developed an algorithm for vaccine safety surveillance while incorporating adverse event ontology. Multiple adverse events may individually be rare enough to go undetected, but if they are related, they can borrow strength from each other to increase the chance of being flagged. Furthermore, borrowing strength induces shrinkage of related AEs, thereby also reducing headline-grabbing false positives.
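The idea of borrowing strength across related adverse events can be illustrated with a generic partial-pooling sketch. This is not the algorithm described above: it is a textbook normal-normal shrinkage step, and the function name `shrink_within_groups`, the between-event variance `tau2`, the adverse event names, the group labels, and all numbers are hypothetical values chosen for illustration. Each adverse event's raw signal (for example, a log relative risk) is pulled toward the precision-weighted mean of its ontology group, and noisier estimates are pulled harder.

```python
from collections import defaultdict


def shrink_within_groups(estimates, variances, groups, tau2=0.25):
    """Shrink each raw estimate toward the mean of its group.

    estimates: raw signal per adverse event (e.g., log relative risk)
    variances: sampling variance of each raw estimate
    groups:    group label per adverse event (e.g., an ontology parent term)
    tau2:      assumed between-event variance within a group (hypothetical)
    """
    members = defaultdict(list)
    for ae, group in groups.items():
        members[group].append(ae)

    shrunk = {}
    for aes in members.values():
        # Precision-weighted mean of all events in the same group.
        weights = {ae: 1.0 / (variances[ae] + tau2) for ae in aes}
        total = sum(weights.values())
        group_mean = sum(weights[ae] * estimates[ae] for ae in aes) / total
        for ae in aes:
            # Fraction of the raw estimate that is kept; small when noisy.
            keep = tau2 / (tau2 + variances[ae])
            shrunk[ae] = keep * estimates[ae] + (1.0 - keep) * group_mean
    return shrunk


# Hypothetical inputs: three related systemic events and one skin event.
raw = {"fever": 0.6, "chills": 0.5, "myalgia": 1.8, "rash": 0.1}
var = {"fever": 0.05, "chills": 0.05, "myalgia": 1.0, "rash": 0.2}
group = {"fever": "systemic", "chills": "systemic",
         "myalgia": "systemic", "rash": "skin"}

shrunk = shrink_within_groups(raw, var, group)
for ae in raw:
    print(f"{ae:8s} raw={raw[ae]:.2f} shrunk={shrunk[ae]:.2f}")
```

In this toy input, the large but imprecise myalgia signal moves substantially toward its group, while the precisely estimated fever and chills signals barely change. That is the sense in which shrinkage tempers isolated, noisy signals while related events inform one another.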
Andrea Thomer is an assistant professor of information at the University of Michigan School of Information. She conducts research in the areas of data curation, museum informatics, earth science and biodiversity informatics, information organization, and computer supported cooperative work. She is especially interested in how people use and create data and metadata; the impact of information organization on information use; issues of data provenance, reproducibility, and integration; and long-term data curation and infrastructure sustainability. She is studying a number of these issues through the “Migrating Research Data Collections” project – a recently awarded Laura Bush 21st Century Librarianship Early Career Research Grant from the Institute of Museum and Library Services. Dr. Thomer received her doctorate in Library and Information Science from the School of Information Sciences at the University of Illinois at Urbana‐Champaign in 2017.
My laboratory's data science research includes: (1) Ontology development. We have initiated and led the development of several community-based ontologies, including the Vaccine Ontology (VO), Ontology of Adverse Events (OAE), Cell Line Ontology (CLO), Ontology of Genes and Genomes (OGG), and Interaction Network Ontology (INO). (2) Ontology tool development. We have developed many ontology software programs, such as OntoFox and Ontobee, which are widely used for ontology reuse, ontology development, and ontology applications. (3) Literature mining, with a focus on ontology-based literature mining approaches. (4) Bayesian network (BN) modeling for analysis of gene interaction networks. We have also applied these ontologies, ontology-related approaches, and BN modeling in different data science domains, including vaccinology, microbiology, immunology, and pharmacovigilance.