Explore ARC

Yongsheng Bai

Dr. Bai’s research interests lie in the development and refinement of bioinformatics algorithms, software, and databases for next-generation sequencing (NGS) data; the development of statistical models for solving biological problems; and the bioinformatics analysis of clinical data. Other topics include, but are not limited to, uncovering disease genes and variants using informatics approaches, computational analysis of cis-regulation and comparative motif finding, large-scale genome annotation, comparative “omics”, and evolutionary genomics.

Ginger Shultz

The Shultz group uses data science methods in two primary ways: 1) to investigate student placement in introductory chemistry courses and 2) to analyze student texts to provide instructors with actionable intelligence about student learning. Using regression discontinuity, we investigated the impact of taking general chemistry prior to organic chemistry on student performance and persistence in later chemistry courses. We found that students who took general chemistry first benefited by 1/4 of a letter grade but were no more likely to persist. A continued investigation using survey and interview methods indicated that this was related to academic skills rather than content preparation. Through the MWrite project we have collected a large corpus of student texts and are developing automated text analysis methods to glean information about student learning across disciplines, with a specific focus on scientific reasoning.

Network representation of writing moves made by students in argumentative writing, with relevant transition probabilities. The size of each node represents the relative frequency with which that operation is used, and the edge labels give the transition probabilities, with key transitions highlighted in orange.
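As a rough illustration of the quantities in such a network, the sketch below estimates transition probabilities (edge labels) and relative frequencies (node sizes) from sequences of coded writing moves. The move labels and the three example sequences are hypothetical, and this is not the group's actual analysis pipeline.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate P(next move | current move) from adjacent pairs in each sequence."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


def node_frequencies(sequences):
    """Relative frequency of each move across all sequences (the node sizes)."""
    totals = Counter(move for seq in sequences for move in seq)
    n = sum(totals.values())
    return {move: c / n for move, c in totals.items()}


# Hypothetical coded essays: each list is the ordered moves in one student text.
essays = [
    ["claim", "evidence", "reasoning"],
    ["claim", "evidence", "evidence", "reasoning"],
    ["claim", "reasoning"],
]

probs = transition_probabilities(essays)
freqs = node_frequencies(essays)
print(probs["claim"])  # outgoing edge weights from the "claim" node
print(freqs)           # relative node sizes
```

Each row of outgoing probabilities sums to one, so an edge label can be read as the share of times a given move is followed by another.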

Hyun Min Kang

Hyun Min Kang is an Associate Professor in the Department of Biostatistics. He received his Ph.D. in Computer Science from the University of California, San Diego in 2009 and joined the University of Michigan faculty in the same year. Prior to his doctoral studies, he worked as a research fellow at the Genome Research Center for Diabetes and Endocrine Disease at Seoul National University Hospital for a year and a half, after completing his bachelor’s and master’s degrees in Electrical Engineering at Seoul National University. His research interests lie in big data genome science. Methodologically, his primary focus is on developing statistical methods and computational tools for large-scale genetic studies. Scientifically, his research aims to understand the etiology of complex disease traits, including type 2 diabetes, bipolar disorder, cardiovascular diseases, and glomerular diseases.

Veera Baladandayuthapani

Dr. Veera Baladandayuthapani is currently a Professor in the Department of Biostatistics at the University of Michigan (UM), where he is also the Associate Director of the Center for Cancer Biostatistics. He joined UM in Fall 2018 after spending 13 years in the Department of Biostatistics at the University of Texas MD Anderson Cancer Center, Houston, Texas, where he was a Professor and Institute Faculty Scholar and held adjunct appointments at Rice University, Texas A&M University and UT School of Public Health. His research interests are mainly in high-dimensional data modeling and Bayesian inference. This includes functional data analyses, Bayesian graphical models, Bayesian semi-/non-parametric models and Bayesian machine learning. These methods are motivated by large and complex datasets (a.k.a. Big Data) such as high-throughput genomics, epigenomics, transcriptomics and proteomics, as well as high-resolution neuro- and cancer imaging. His work has been published in top statistical/biostatistical/bioinformatics and biomedical/oncology journals. He has also co-authored a book on Bayesian analysis of gene expression data. He currently holds multiple PI-level grants from NIH and NSF to develop innovative and advanced biostatistical and bioinformatics methods for big datasets in oncology. He has also served as the Director of the Biostatistics and Bioinformatics Cores for the Specialized Programs of Research Excellence (SPOREs) in Multiple Myeloma and Lung Cancer and as the Biostatistics & Bioinformatics platform leader for the Myeloma and Melanoma Moonshot Programs at MD Anderson. He is a fellow of the American Statistical Association and an elected member of the International Statistical Institute. He currently serves as an Associate Editor for the Journal of the American Statistical Association, Biometrics and Sankhya.


An example of horizontal (across cancers) and vertical (across multiple molecular platforms) data integration. Image from Ha et al. (Nature Scientific Reports, 2018; https://www.nature.com/articles/s41598-018-32682-x).

Oleg Gnedin

I am a theoretical astrophysicist studying the origins and structure of galaxies in the universe. My research focuses on developing more realistic gasdynamics simulations, starting from initial conditions that are well constrained by observations and advancing them in time with high spatial resolution using adaptive mesh refinement. I use machine-learning techniques to compare simulation predictions with observational data. Such comparisons lead to insights about the underlying physics that governs the formation of stars and galaxies. I have developed a Computational Astrophysics course that teaches the practical application of modern techniques for big-data analysis and model fitting.

Emergence of galaxies and star clusters in cosmological gasdynamics simulations. The left panel shows large-scale cosmic structure (density of dark matter particles), which formed by gravitational instability. In the middle panel we can resolve this structure into disk galaxies with complex morphology (density of molecular/red and atomic/blue gas). These galaxies should create massive star clusters, such as the one shown in the right panel (a real image, to be reproduced by our future simulations!).

Xun Huan

Prof. Huan’s research broadly revolves around uncertainty quantification, data-driven modeling, and numerical optimization. He focuses on methods that bridge models and data, e.g., optimal experimental design, Bayesian statistical inference, uncertainty propagation in high-dimensional settings, and algorithms that are robust to model misspecification. He seeks to develop efficient numerical methods that integrate computationally intensive models with big data, and to combine uncertainty quantification with machine learning to enable robust and reliable prediction, design, and decision-making.

Optimal experimental design seeks to identify experiments that produce the most valuable data. For example, when designing a combustion experiment to learn chemical kinetic parameters, design condition A maximizes the expected information gain. When Bayesian inference is performed on data from this experiment, we indeed obtain “tighter” posteriors (with less uncertainty) compared to those obtained from suboptimal design conditions B and C.
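The idea can be sketched on a toy problem where the expected information gain has a closed form. The example below assumes a scalar linear-Gaussian model, y = d·θ + noise with θ ~ N(0, prior_sd²) and noise ~ N(0, noise_sd²); the model, the design values for A, B and C, and the function names are all illustrative assumptions and are unrelated to the actual combustion experiment.

```python
import math


def expected_information_gain(d, prior_sd, noise_sd):
    """Closed-form expected KL divergence from prior to posterior (in nats)
    for the assumed model y = d * theta + noise with Gaussian prior and noise."""
    return 0.5 * math.log(1.0 + (d * prior_sd / noise_sd) ** 2)


def posterior_sd(d, prior_sd, noise_sd):
    """Posterior standard deviation of theta after one observation at design d."""
    precision = 1.0 / prior_sd**2 + d**2 / noise_sd**2
    return math.sqrt(1.0 / precision)


# Hypothetical candidate designs; a larger |d| makes y more sensitive to theta.
designs = {"A": 1.0, "B": 0.5, "C": 0.1}
prior_sd, noise_sd = 1.0, 0.5

eig = {name: expected_information_gain(d, prior_sd, noise_sd)
       for name, d in designs.items()}
best = max(eig, key=eig.get)

for name, d in designs.items():
    print(name, round(eig[name], 3), round(posterior_sd(d, prior_sd, noise_sd), 3))
print("design with largest expected information gain:", best)
```

In this toy setting the design with the largest expected information gain is also the one with the smallest posterior standard deviation, mirroring the "tighter" posterior described above. For nonlinear models no such closed form is generally available and the expected information gain has to be estimated numerically.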

James R. Hines Jr.

Professor Hines’ research focuses on the donative behavior of Americans and how it affects the intergenerational and interpersonal transmission of economic well-being. To what extent do parents leave property to their children and others, and how is this behavior affected by legal institutions, taxes, social norms, and other considerations? While there are no comprehensive sources of data on wills, trusts, lifetime gifts, and other forms of property transmission, there is ample information available in legal documents that, with the help of natural language processing, can hopefully be coded and analyzed in a systematic way.

Fred Feng

Dr. Feng’s research involves conducting and using naturalistic observational studies to better understand the interactions between motorists and other road users, including bicyclists and pedestrians. The goal is to use an evidence-based, data-driven approach that improves bicycling and walking safety and ultimately makes them viable mobility options. A naturalistic study is a valuable and unique research method that provides continuous, high-time-resolution, rich, and objective data about how people drive, ride, and walk on their everyday trips in the real world. It also faces challenges from the sheer volume of the data, and, as with all observational studies, it is subject to potential confounding factors that a randomized laboratory experiment would control. Data analytic methods can be developed to interpret the behavioral data, make meaningful inferences, and derive actionable insights.

Using naturalistic driving data to examine the interactions between motorists and bicyclists

Nicholson Price

I study how law shapes innovation in the life sciences, with a substantial focus on big data and artificial intelligence in medicine. I write about the intellectual property incentives and protections for data and AI algorithms, the privacy issues raised by wide-scale collection of health and health-related data, the medical malpractice implications of AI in medicine, and how the FDA should regulate the use of medical AI.

Neda Masoud

The future of transportation lies at the intersection of two emerging trends, namely, the sharing economy and connected and automated vehicle technology. Our research group investigates the impact of these two major trends on the future of mobility, quantifying the benefits and identifying the challenges of integrating these technologies into our current systems.

Our research on shared-use mobility systems focuses on peer-to-peer (P2P) ridesharing and multi-modal transportation. We provide: (i) operational tools and decision support systems for shared-use mobility in legacy as well as connected and automated transportation systems, with a focus on system design as well as routing, scheduling, and pricing mechanisms to serve on-demand transportation requests; (ii) insights for regulators and policy makers on the mobility benefits of multi-modal transportation; and (iii) planning tools that would allow for informed regulation of the sharing economy.

In another line of research, we investigate the challenges that connected automated vehicle technology must overcome before mass adoption can occur. Our research mainly focuses on (i) the transition of control authority between the human driver and the autonomous entity in semi-autonomous (SAE Level 3) vehicles; (ii) incorporating network-level information supplied by connected vehicle technology into traditional trajectory planning; (iii) improving vehicle localization by taking advantage of opportunities provided by connected vehicles; and (iv) cybersecurity challenges in connected and automated systems. We seek to quantify the mobility and safety implications of this disruptive technology and to provide insights that can allow for informed regulation.