Ho-Joon Lee

Dr. Lee’s research in data science concerns biological questions in systems biology and network medicine by developing algorithms and models through a combination of statistical/machine learning, information theory, and network theory applied to multi-dimensional large-scale data. His projects have covered genomics, transcriptomics, proteomics, and metabolomics from yeast to mouse to human for integrative analysis of regulatory networks on multiple molecular levels, which also incorporates large-scale public databases such as GO for functional annotation, PDB for molecular structures, and PubChem and LINCS for drugs or small compounds. He previously carried out proteomics and metabolomics along with a computational derivation of dynamic protein complexes for IL-3 activation and cell cycle in murine pro-B cells (Lee et al., Cell Reports 2017), for which he developed integrative analytical tools using diverse approaches from machine learning and network theory. His ongoing interests in methodology include machine/deep learning and topological Kolmogorov-Sinai entropy-based network theory, which are applied to (1) multi-level dynamic regulatory networks in immune response, cell cycle, and cancer metabolism and (2) mass spectrometry-based omics data analysis.

Figure 1. Proteomics and metabolomics analysis of IL-3 activation and cell cycle (Lee et al., Cell Reports 2017). (A) Multi-omics abundance profiles of proteins, modules/complexes, intracellular metabolites, and extracellular metabolites over one cell cycle (from left to right columns) in response to IL-3 activation. Red indicates up-regulation of proteins/modules/intracellular metabolites or release of extracellular metabolites; green indicates down-regulation of proteins/modules/intracellular metabolites or uptake of extracellular metabolites. (B) Functional module network identified from integrative analysis. Red nodes are proteins and white nodes are functional modules. Expression profile plots are shown for literature-validated functional modules. (C) Overall pathway map of IL-3 activation and cell cycle phenotypes. (D) IL-3 activation and cell cycle as a cancer model along with candidate protein and metabolite biomarkers. (E) Protein co-expression scale-free network. (F) Power-law degree distribution of the network in (E). (G) Protein entropy distribution by topological Kolmogorov-Sinai entropy calculated for the network in (E).

 

Samuel K Handelman

Samuel K Handelman, Ph.D., is Research Assistant Professor in the department of Internal Medicine, Gastroenterology, of Michigan Medicine at the University of Michigan, Ann Arbor. Prof. Handelman is focused on multi-omics approaches to drive precision/personalized therapy and to predict population-level differences in the effectiveness of interventions. He tends to favor regression-style and hierarchical-clustering approaches, partially because he has a background in both statistics and cladistics. His scientific monomania is for compensatory mechanisms and trade-offs in evolution, but he has a principled reason to focus on translational medicine: real understanding of these mechanisms goes all the way into the clinic. Anything less than clinical translation indicates that we don’t understand what drove the genetics of human populations.

Zhenke Wu

Zhenke Wu is an Assistant Professor of Biostatistics and a core faculty member in the Michigan Institute for Data Science (MIDAS). He received his Ph.D. in Biostatistics from Johns Hopkins University in 2014 and then stayed at Hopkins for his postdoctoral training before joining the University of Michigan. Dr. Wu’s research focuses on the design and application of statistical methods that inform health decisions made by individuals, that is, precision medicine. The original methods and software developed by Dr. Wu are now used by investigators from research institutes such as the CDC and Johns Hopkins, as well as site investigators from developing countries, e.g., Kenya, South Africa, Gambia, Mali, Zambia, Thailand and Bangladesh.

 

Profile: At a “sweet spot” of data science

By Dan Meisler
Communications Manager, ARC

If you had to name two of the more exciting, emerging fields of data science, electronic health records (EHR) and mobile health might be near the top of the list.

Zhenke Wu, one of the newest MIDAS core faculty members, has one foot firmly in each field.

“These two fields share the common goal of learning from the experience of the population in the past to advance health and clinical decisions for those to follow. I am looking forward to more work that will bring the two fields closer to continuously generate insights about human health,” Wu said. “I’m in a sweet spot.”

Wu joined U-M in Fall 2016, after earning a PhD in Biostatistics from Johns Hopkins University, and a bachelor’s in Mathematics from Fudan University. He said the multitude of large-scale studies going on at U-M and access to EHR databases were factors in his coming to Michigan.

“The University of Michigan is an exciting place that has a diversity of large-scale databases and supportive research groups in the fields I’m interested in,” he said.

Wu is collaborating with the Michigan Genomics Initiative, which is a biorepository effort at Michigan Medicine to integrate genome-wide information with EHR from approximately 40,000 patients undergoing anesthesia prior to surgery or diagnostic procedures. He’s also collaborating with Dr. Srijan Sen, Associate Professor, Department of Psychiatry and Molecular and Behavioral Neuroscience Institute, on the MIDAS-supported project “Identifying Real-Time Data Predictors of Stress and Depression Using Mobile Technology,” the preliminary results of which recently matured into an NIH-funded R01 project “Mobile Technology to Identify Mechanisms Linking Genetic Variation and Depression” that will draw broad expertise from a multi-disciplinary team of medical and data science researchers.

A visualization of data from the Michigan Genomics Initiative

“One of my goals is to use an integrated and rigorous approach to predict how a person’s health status will be in the near future,” Wu said.

Wu applies hierarchical Bayesian models to these problems, which he hopes will shed light on phenomena he describes as latent constructs that are “well-known, but less quantitatively understood, e.g., intelligence quotient (IQ) in psychology.”

As another example, he cites the current challenge in active surveillance of prostate cancer patients for aggressive tumors requiring removal and/or radiation, or indolent tumors permitting continued surveillance.

“The underlying status of aggressive versus indolent cancer is not observed, which needs to be learned from the results of biopsy and other clinical measurements,” he said. “The decisions and experience of urologists and their patients will greatly benefit from more accurate understanding of the tumor status… There are lots of scientific problems in clinical, biomedical, behavioral and social sciences where you have well-known but less quantitatively understood latent constructs. These are problems that Bayesian latent variable methods can formulate and address.”
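
As a minimal illustration of the latent-variable idea Wu describes, the sketch below treats tumor status as an unobserved binary state and updates its probability from a single biopsy result using Bayes' rule. The function name and the prior, sensitivity, and specificity values are hypothetical placeholders, not figures from Wu's work, and a real hierarchical Bayesian model would combine many measurements and patients rather than one test.

```python
def posterior_aggressive(prior, sensitivity, specificity, biopsy_positive):
    """Posterior probability that the latent tumor state is 'aggressive'.

    All inputs are hypothetical illustration values, not clinical estimates.
    prior        -- P(aggressive) before seeing the biopsy
    sensitivity  -- P(biopsy positive | aggressive)
    specificity  -- P(biopsy negative | indolent)
    """
    if biopsy_positive:
        like_aggressive = sensitivity
        like_indolent = 1.0 - specificity
    else:
        like_aggressive = 1.0 - sensitivity
        like_indolent = specificity
    numerator = prior * like_aggressive
    denominator = numerator + (1.0 - prior) * like_indolent
    return numerator / denominator


# Hypothetical inputs: 20% prior, 80% sensitivity, 90% specificity.
p_after_positive = posterior_aggressive(0.2, 0.8, 0.9, biopsy_positive=True)
p_after_negative = posterior_aggressive(0.2, 0.8, 0.9, biopsy_positive=False)
print(p_after_positive, p_after_negative)
```

Even this toy version shows why the observed biopsy result and the underlying tumor status are different things: the result shifts the probability of the aggressive state without ever revealing it directly.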

Just as Wu has a hand in two hot-button big data areas, he also sees himself as straddling the line between application and methodology.

He says the large number of data sources — sensors, mobile apps, test results, and questionnaires, to name just a few — results in richness as well as some “messiness” that needs new methodologies to adjust, integrate and translate to new scientific insights. At the same time, a valid new methodology for dealing with, for example, electronic health data, will likely find numerous different applications.

Wu says his approach was heavily influenced by his work on the Pneumonia Etiology Research for Child Health (PERCH) study, funded by the Gates Foundation, while he was at Johns Hopkins. Pneumonia is a clinical syndrome due to lung infection that can be caused by more than 30 different species of pathogens, including bacteria, viruses and fungi. The seven-country study, which enrolled more than 5,000 cases and 5,000 controls from Africa and Southeast Asia, aims to estimate how frequently each pathogen caused pneumonia in the population and the probability that each individual’s lung was infected by each pathogen on the list.

“In most settings, it is extremely difficult to identify the pathogen by directly sampling from the site of infection – the child’s lung. PERCH therefore looked for other sources of evidence by standardizing and comprehensively testing biofluids collected from sites peripheral to the lung. Using hierarchical Bayesian models to infer disease etiology by integrating such a large trove of data was extremely fun and exciting,” he said.

Wu’s initial interest in math, leading to biostatistics and now data science, stems from what he called a “greedy” desire to learn the guiding principles of how the world works by rigorous data science.

“If you have new problems, you can wait for other people to ask a clean math question, or you can go work with these messy problems and figure out interesting questions and their answers,” he said.

For more on Dr. Wu, see his profile on Michigan Experts.

    Danny Forger

    Daniel Forger is a Professor in the Department of Mathematics. He is devoted to understanding biological clocks. He uses techniques from many fields, including computer simulation, detailed mathematical modeling and mathematical analysis, to understand biological timekeeping. His research aims to generate predictions that can be experimentally verified.

    Bhramar Mukherjee

    Bhramar Mukherjee is a Professor in the Department of Biostatistics, having joined the department in Fall 2006. Bhramar is also a Professor in the Department of Epidemiology. Bhramar completed her Ph.D. in 2001 at Purdue University. Bhramar’s principal research interests lie in Bayesian methods in epidemiology and studies of gene-environment interaction. She is also interested in modeling missingness in exposure, categorical data models, Bayesian nonparametrics, and the general area of statistical inference under outcome/exposure dependent sampling schemes. Bhramar’s methodological research is funded by NSF and NIH. Bhramar is involved as a co-investigator in several R01s led by faculty in Internal Medicine, Epidemiology and Environmental Health Sciences at UM. Her collaborative interests focus on genetic and environmental epidemiology, ranging from investigating the genetic architecture of colorectal cancer in relation to environmental exposures to studies of the effect of air pollution on pediatric asthma events in Detroit. She is actively engaged in Global Health Research.

    Sriram Chandrasekaran

    Sriram Chandrasekaran, PhD, is Assistant Professor of Biomedical Engineering in the College of Engineering at the University of Michigan, Ann Arbor.

    Dr. Chandrasekaran’s Systems Biology lab develops computer models of biological processes to understand them holistically. Sriram is interested in deciphering how thousands of proteins work together at the microscopic level to orchestrate complex processes like embryonic development or cognition, and how this complex network breaks down in diseases like cancer. Systems biology software and algorithms developed by his lab are highlighted below and are available at http://www.sriramlab.org/software/.

    – INDIGO (INferring Drug Interactions using chemoGenomics and Orthology) predicts how antibiotics prescribed in combination will inhibit bacterial growth. INDIGO leverages genomics and drug-interaction data in the model organism E. coli to facilitate the discovery of effective combination therapies in less-studied pathogens, such as M. tuberculosis. (Ref: Chandrasekaran et al., Molecular Systems Biology 2016)

    – GEMINI (Gene Expression and Metabolism Integrated for Network Inference) is a network curation tool. It allows rapid assessment of regulatory interactions predicted by high-throughput approaches by integrating them with a metabolic network. (Ref: Chandrasekaran and Price, PLoS Computational Biology 2013)

    – ASTRIX (Analyzing Subsets of Transcriptional Regulators Influencing eXpression) uses gene expression data to identify regulatory interactions between transcription factors and their target genes. (Ref: Chandrasekaran et al., PNAS 2011)

    – PROM (Probabilistic Regulation of Metabolism) enables the quantitative integration of regulatory and metabolic networks to build genome-scale integrated metabolic–regulatory models. (Ref: Chandrasekaran and Price, PNAS 2010)

     

    Research Overview: We develop computational algorithms that integrate omics measurements to create detailed genome-scale models of cellular networks. Clinical applications of our algorithms include finding metabolic vulnerabilities in pathogens (M. tuberculosis) using PROM, and designing multi-drug combination therapeutics for reducing antibiotic resistance using INDIGO.

    Jeffrey S. McCullough

    Jeffrey S. McCullough, PhD, is Associate Professor in the department of Health Management and Policy in the School of Public Health at the University of Michigan, Ann Arbor.

    Prof. McCullough’s research focuses on technology and innovation in health care with an emphasis on information technology (IT), pharmaceuticals, and empirical methods. Many of his studies have explored the effect of electronic health record (EHR) systems on health care quality and productivity. While the short-run gains from health IT adoption may be modest, these technologies form the foundation for a health information infrastructure. Scientists are just beginning to understand how to harness and apply medical information, and the problem is complicated by the sheer complexity of medical care, the heterogeneity across patients, and the importance of treatment selection. His current work draws on methods from both machine learning and econometrics to address these issues. Current pharmaceutical studies examine the roles of consumer heterogeneity and learning about the value of products as well as the effect of direct-to-consumer advertising on health.

    The marginal effects of health IT on mortality by diagnosis and deciles of severity. We study the effect of hospitals’ electronic health record (EHR) systems on patient outcomes. While we observe no benefits for the average patient, mortality falls significantly for high-risk patients in all EHR-sensitive conditions. These patterns, combined with findings from other analyses, suggest that EHR systems may be more effective at supporting care coordination and information management than at rules-based clinical decision support. McCullough, Parente, and Town, “Health information technology and patient outcomes: the role of information and labor coordination.” RAND Journal of Economics, Vol. 47, no. 1 (Spring 2016).

    Steven J. Katz

    Dr. Katz’s research addresses cancer treatment communication, decision-making, and quality of care. His work aims to examine the dynamics of how precision medicine presents itself in the exam room via provider and patient communication and shared decision-making. Dr. Katz leads the Cancer Surveillance and Outcomes Research Team (CanSORT), an interdisciplinary research program centered at the University of Michigan and focused on population and intervention studies of the quality of care and outcomes of cancer detection and treatment in diverse populations. Dr. Katz and CanSORT have been collaborating with Surveillance, Epidemiology, and End Results (SEER) cancer registries since 2002 to study breast cancer treatment decision-making at the population level. We obtain patient clinical and demographic information from SEER and combine this with surveys of patients and physicians to create comprehensive data sets that enable us to study testing and treatment trends and the challenges of individualizing treatments for breast cancer patients. In 2015 we added a new dimension to our research by partnering with evaluative testing firms to obtain tumor genomic and germline genetic test results for over 30,000 breast and ovarian cancer patients in the states of California and Georgia. We are also pursuing insurance claims data to assist with our analysis of physician network effects.

    Steven Katz, MD discusses BRCA and multigene sequence testing at the labs of Ambry Genetics.

    Matthew Schipper

    Matthew Schipper, PhD, is Assistant Professor in the Departments of Radiation Oncology and Biostatistics. He received his Ph.D. in Biostatistics from the University of Michigan in 2006. Prior to joining the Radiation Oncology department he was a Research Investigator in the Department of Radiology at the University of Michigan and a consulting statistician at Innovative Analytics.

    Prof. Schipper’s research interests include:

    • Use of Biomarkers to Individualize Treatment – Selection of dose for cancer patients treated with Radiation Therapy (RT) must balance the increased efficacy with the increased toxicity associated with higher dose. Historically, a single dose has been selected for a population of patients (e.g. all stage III NSC lung cancer). However, the availability of new biologic markers for toxicity and efficacy allows the possibility of selecting a more personalized dose. I am interested in using statistical models for toxicity and efficacy as a function of RT dose and biomarkers to select an optimal dose for an individual patient. We are studying quantitative methods based on utilities to make this efficacy/toxicity tradeoff explicit and quantitative when biomarkers for one or multiple outcomes are available. We have proposed a simulation-based method for studying the likely effects of any model- or marker-based dose selection on both toxicity and efficacy outcomes for a population of patients. In related projects, we are studying the role of correlation between the sensitivity of a patient’s tumor and normal tissues to radiation. We are also studying how to utilize these techniques in combination with baseline and/or mid-treatment adaptive image-guided RT.
    • Early Phase Oncology Study Design – An increasingly common feature of phase I designs is the inclusion of one or more dose expansion cohorts (DECs), in which the maximum tolerated dose (MTD) is first estimated using a 3+3 or other Phase I design and then a fixed number (often 10-20 in 1-10 cohorts) of patients are treated at the dose initially estimated to be the MTD. Such an approach has not been studied statistically or compared to alternative designs. We have shown that a continual reassessment method (CRM) design, in which the dose-assignment mechanism is kept active for all patients, identifies the MTD more accurately and protects the safety of trial patients better than a similarly sized DEC trial. It also meets the objective of treating 15 or more patients at the final estimated MTD. A follow-up paper evaluating the role of DECs with a focus on efficacy estimation is in press at Annals of Oncology.
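
For readers unfamiliar with the 3+3 design mentioned above, the following is a minimal simulation sketch of one common textbook variant of its escalation rule. It is not Prof. Schipper's code and does not implement the CRM or dose-expansion designs his work compares. The function name, the per-dose toxicity probabilities, and the convention of returning -1 when even the lowest dose is too toxic are all assumptions made for illustration.

```python
import random


def three_plus_three(true_dlt_probs, rng):
    """Simulate one 3+3 trial and return the index of the estimated MTD.

    true_dlt_probs -- hypothetical probability of a dose-limiting toxicity
                      (DLT) at each dose level, lowest dose first
    rng            -- a random.Random instance, so runs can be reproduced
    Returns -1 if even the lowest dose is judged too toxic.
    """
    dose = 0
    while True:
        # Treat a first cohort of three patients at the current dose.
        dlts = sum(rng.random() < true_dlt_probs[dose] for _ in range(3))
        if dlts == 1:
            # Exactly one DLT: treat three more patients at the same dose.
            dlts += sum(rng.random() < true_dlt_probs[dose] for _ in range(3))
            tolerated = dlts <= 1  # at most 1 DLT among 6 patients
        else:
            tolerated = dlts == 0  # 0 of 3 is tolerated; 2 or 3 of 3 is not
        if not tolerated:
            return dose - 1  # the MTD estimate is the previous dose level
        if dose == len(true_dlt_probs) - 1:
            return dose  # highest available dose was tolerated
        dose += 1


# Example run with made-up toxicity probabilities and a fixed seed.
estimated_mtd = three_plus_three([0.05, 0.10, 0.20, 0.35, 0.50], random.Random(0))
print("Estimated MTD dose index:", estimated_mtd)
```

Repeating such a simulation many times for a given set of toxicity probabilities is one simple way to see how often a rule-based design lands on the intended dose.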

    Gilbert S. Omenn

    Gilbert Omenn, MD, PhD, is Professor of Computational Medicine & Bioinformatics with appointments in Human Genetics, Molecular Medicine & Genetics in the Medical School and Professor of Public Health in the School of Public Health and the Harold T. Shapiro Distinguished University Professor at the University of Michigan, Ann Arbor.

    Dr. Omenn’s current research interests are focused on cancer proteomics, splice isoforms as potential biomarkers and therapeutic targets, and isoform-level and single-cell functional networks of transcripts and proteins. He chairs the global Human Proteome Project of the Human Proteome Organization.