Samuel K Handelman, Ph.D., is Research Assistant Professor in the department of Internal Medicine, Gastroenterology, of Michigan Medicine at the University of Michigan, Ann Arbor. Prof. Handelman is focused on multiomics approaches to drive precision/personalized therapy and to predict population-level differences in the effectiveness of interventions. He tends to favor regression-style and hierarchical-clustering approaches, partially because he has a background in both statistics and in cladistics. His scientific monomania is for compensatory mechanisms and tradeoffs in evolution, but he has a principled reason to focus on translational medicine: real understanding of these mechanisms goes all the way into the clinic. Anything less than clinical translation indicates that we don’t understand what drove the genetics of human populations.
Prof. Titiunik’s research interests lie primarily in quantitative methodology for the social sciences, with emphasis on quasi-experimental methods for causal inference and political methodology. She is particularly interested in the application and development of non-experimental methods for the study of political institutions, a methodological agenda that is motivated by her substantive interests in democratic accountability and the role of party systems in developing democracies. Some of her current projects include the application of web scraping and text analysis tools to measure political phenomena.
Zhenke Wu is an Assistant Professor of Biostatistics, and a core faculty member in the Michigan Institute for Data Science (MIDAS). He received his Ph.D. in Biostatistics from the Johns Hopkins University in 2014 and then stayed at Hopkins for his postdoctoral training before joining the University of Michigan. Dr. Wu’s research focuses on the design and application of statistical methods that inform health decisions made by individuals, or precision medicine. The original methods and software developed by Dr. Wu are now used by investigators from research institutes such as CDC and Johns Hopkins, as well as site investigators from developing countries, e.g., Kenya, South Africa, Gambia, Mali, Zambia, Thailand and Bangladesh.
Profile: At a “sweet spot” of data science
By Dan Meisler
Communications Manager, ARC
If you had to name two of the more exciting, emerging fields of data science, electronic health records (EHR) and mobile health might be near the top of the list.
Zhenke Wu, one of the newest MIDAS core faculty members, has one foot firmly in each field.
“These two fields share the common goal of learning from the experience of the population in the past to advance health and clinical decisions for those to follow. I am looking forward to more work that will bring the two fields closer to continuously generate insights about human health,” Wu said. “I’m in a sweet spot.”
Wu joined U-M in Fall 2016, after earning a PhD in Biostatistics from Johns Hopkins University, and a bachelor’s in Mathematics from Fudan University. He said the multitude of large-scale studies going on at U-M and access to EHR databases were factors in his coming to Michigan.
“The University of Michigan is an exciting place that has a diversity of large-scale databases and supportive research groups in the fields I’m interested in,” he said.
Wu is collaborating with the Michigan Genomics Initiative, which is a biorepository effort at Michigan Medicine to integrate genome-wide information with EHR from approximately 40,000 patients undergoing anesthesia prior to surgery or diagnostic procedures. He’s also collaborating with Dr. Srijan Sen, Associate Professor, Department of Psychiatry and Molecular and Behavioral Neuroscience Institute, on the MIDAS-supported project “Identifying Real-Time Data Predictors of Stress and Depression Using Mobile Technology,” the preliminary results of which recently matured into an NIH-funded R01 project “Mobile Technology to Identify Mechanisms Linking Genetic Variation and Depression” that will draw broad expertise from a multidisciplinary team of medical and data science researchers.
“One of my goals is to use an integrated and rigorous approach to predict how a person’s health status will be in the near future,” Wu said.
Wu applies hierarchical Bayesian models to these problems, which he hopes will shed light on phenomena he describes as latent constructs that are “well-known, but less quantitatively understood, e.g., intelligence quotient (IQ) in psychology.”
As another example, he cites the current challenge in active surveillance of prostate cancer patients for aggressive tumors requiring removal and/or radiation, or indolent tumors permitting continued surveillance.
“The underlying status of aggressive versus indolent cancer is not observed, which needs to be learned from the results of biopsy and other clinical measurements,” he said. “The decisions and experience of urologists and their patients will greatly benefit from more accurate understanding of the tumor status… There are lots of scientific problems in clinical, biomedical, behavioral and social sciences where you have well-known but less quantitatively understood latent constructs. These are problems that Bayesian latent variable methods can formulate and address.”
Just as Wu has a hand in two hot-button big data areas, he also sees himself as straddling the line between application and methodology.
He says the large number of data sources — sensors, mobile apps, test results, and questionnaires, to name just a few — results in richness as well as some “messiness” that needs new methodologies to adjust, integrate and translate to new scientific insights. At the same time, a valid new methodology for dealing with, for example, electronic health data, will likely find numerous different applications.
Wu says his approach was heavily influenced by his work in the Pneumonia Etiology Research for Child Health (PERCH) study, funded by the Gates Foundation, while he was at Johns Hopkins. Pneumonia is a clinical syndrome due to lung infection that can be caused by more than 30 different species of pathogens, including bacteria, viruses and fungi. The goal of the seven-country study, which enrolled more than 5,000 cases and 5,000 controls from Africa and Southeast Asia, is to estimate the frequency with which each pathogen caused pneumonia in the population and the probability of each individual being infected by the list of pathogens in the lung.
“In most settings, it is extremely difficult to identify the pathogen by directly sampling from the site of infection – the child’s lung. PERCH therefore looked for other sources of evidence by standardizing and comprehensively testing biofluids collected from sites peripheral to the lung. Using hierarchical Bayesian models to infer disease etiology by integrating such a large trove of data was extremely fun and exciting,” he said.
Wu’s initial interest in math, leading to biostatistics and now data science, stems from what he called a “greedy” desire to learn the guiding principles of how the world works by rigorous data science.
“If you have new problems, you can wait for other people to ask a clean math question, or you can go work with these messy problems and figure out interesting questions and their answers,” he said.
For more on Dr. Wu, see his profile on Michigan Experts.
Recent publications
From experts.umich.edu.

Nested partially latent class models for dependent binary data; Estimating disease etiology
Wu, Z., Deloria-Knoll, M. & Zeger, S. L. Apr 1 2017. In: Biostatistics. 18, 2, p. 200-213 (14 p.).

Bayesian estimation of pneumonia etiology: Epidemiologic considerations and applications to the pneumonia etiology research for child health study
Knoll, M. D., Fu, W., Shi, Q., Prosperi, C., Wu, Z., Hammitt, L. L., Feikin, D. R., Baggett, H. C., Howie, S. R. C., Scott, J. A. G., Murdoch, D. R., Madhi, S. A., Thea, D. M., Brooks, W. A., Kotloff, K. L., Li, M., Park, D. E., Lin, W., Levine, O. S., O'Brien, K. L. & Zeger, S. L. Jan 1 2017. In: Clinical Infectious Diseases. 64, p. S213-S227.

Partially latent class models for case-control studies of childhood pneumonia aetiology
Wu, Z., Deloria-Knoll, M., Hammitt, L. L. & Zeger, S. L. Jan 1 2016. In: Journal of the Royal Statistical Society. Series C: Applied Statistics. 65, 1, p. 97-114 (18 p.).
Yang Chen received her Ph.D. (2017) in Statistics from Harvard University and then joined the University of Michigan as an Assistant Professor of Statistics and Research Assistant Professor at the Michigan Institute for Data Science (MIDAS). She received her B.A. in Mathematics and Applied Mathematics from the University of Science and Technology of China. Her research interests include computational algorithms in statistical inference and applied statistics in the fields of biology and astronomy.
Matias D. Cattaneo, Ph.D., is Professor of Economics and Statistics in the College of Literature, Science, and the Arts at the University of Michigan, Ann Arbor.
Prof. Cattaneo’s research interests include econometric theory, mathematical statistics, and applied econometrics, with focus on causal inference, program evaluation, high-dimensional problems and applied microeconomics. Most of his recent research relates to the development of new, improved semiparametric, nonparametric and high-dimensional inference procedures exhibiting demonstrably superior robustness properties with respect to tuning parameter and other implementation choices. His work is motivated by concrete empirical problems in social, biomedical and statistical sciences, covering a wide array of topics in settings related to treatment effects and policy evaluation, high-dimensional models, average derivatives and structural response functions, applied finance and applied decision theory, among others.
Sriram Chandrasekaran, PhD, is Assistant Professor of Biomedical Engineering in the College of Engineering at the University of Michigan, Ann Arbor.
Dr. Chandrasekaran’s Systems Biology lab develops computer models of biological processes to understand them holistically. Sriram is interested in deciphering how thousands of proteins work together at the microscopic level to orchestrate complex processes like embryonic development or cognition, and how this complex network breaks down in diseases like cancer. Systems biology software and algorithms developed by his lab are highlighted below and are available at http://www.sriramlab.org/software/.
– INDIGO (INferring Drug Interactions using chemoGenomics and Orthology) algorithm predicts how antibiotics prescribed in combinations will inhibit bacterial growth. INDIGO leverages genomics and drug-interaction data in the model organism E. coli to facilitate the discovery of effective combination therapies in less-studied pathogens, such as M. tuberculosis. (Ref: Chandrasekaran et al. Molecular Systems Biology 2016)
– GEMINI (Gene Expression and Metabolism Integrated for Network Inference) is a network curation tool. It allows rapid assessment of regulatory interactions predicted by high-throughput approaches by integrating them with a metabolic network. (Ref: Chandrasekaran and Price, PLoS Computational Biology 2013)
– ASTRIX (Analyzing Subsets of Transcriptional Regulators Influencing eXpression) uses gene expression data to identify regulatory interactions between transcription factors and their target genes. (Ref: Chandrasekaran et al. PNAS 2011)
– PROM (Probabilistic Regulation of Metabolism) enables the quantitative integration of regulatory and metabolic networks to build genome-scale integrated metabolic–regulatory models. (Ref: Chandrasekaran and Price, PNAS 2010)
Yuekai Sun, PhD, is Assistant Professor in the department of Statistics at the University of Michigan, Ann Arbor.
Dr. Sun’s research is motivated by the challenges of analyzing massive data sets in data-driven science and engineering. I focus on statistical methodology for high-dimensional problems; i.e., problems where the number of unknown parameters is comparable to or exceeds the sample size. My recent work focuses on two problems that arise in learning from high-dimensional data (versus black-box approaches that do not yield insights into the underlying data-generation process). They are:
1. model selection and post-selection inference: discover the latent low-dimensional structure in high-dimensional data and perform inference on the learned structure;
2. distributed statistical computing: design scalable estimators and algorithms that avoid communication and minimize “passes” over the data.
A recurring theme in my work is exploiting the geometry of latent low-dimensional structure for statistical and computational gains. More broadly, I am interested in the geometric aspects of high-dimensional data analysis.
The goal of my research is to leverage network analysis techniques to uncover how the brain mediates sex hormone influences on gendered behavior across the lifespan. Specifically, my data science research concerns the creation and application of person-specific connectivity analyses, such as unified structural equation models, to time series data; these are intensive longitudinal data, including functional neuroimages, daily diaries, and observations. I then use these data science methods to investigate the links between androgens (e.g., testosterone) and estradiol at key developmental periods, such as puberty, and behaviors that typically show sex differences, including aspects of cognition and psychopathology.
Kai S. Cortina, PhD, is Professor of Psychology in the College of Literature, Science, and the Arts at the University of Michigan, Ann Arbor.
Prof. Cortina’s major research revolves around the understanding of children’s and adolescents’ pathways into adulthood and the role of the educational system in this process. Academic and psychosocial development is analyzed from a lifespan perspective, exclusively using longitudinal data over longer periods of time (e.g., from middle school to young adulthood). The hierarchical structure of the school system (student/classroom/school/district/state/nation) requires the use of statistical tools that can handle this kind of nested data.