Xiaoquan (William) Wen is an Associate Professor of Biostatistics. He received his PhD in Statistics from the University of Chicago in 2011 and joined the faculty at the University of Michigan in the same year. His research centers on developing Bayesian and computational statistical methods to answer interesting scientific questions arising from genetics and genomics.
Dr. Teasley’s research has focused on issues of collaboration and learning, looking specifically at how sociotechnical systems can be used to support effective collaborative processes and successful learning outcomes. As Director of the LED lab, she leads learning analytics-based research to investigate how instructional technologies and digital media are used to innovate teaching, learning, and collaboration. The LED Lab is committed to providing a significant contribution to scholarship about learning at Michigan and in the broader field as well, by building an empirical evidentiary base for the design and support of technology rich learning environments.
Jeremy Taylor, PhD, is the Pharmacia Research Professor of Biostatistics in the School of Public Health and Professor in the Department of Radiation Oncology in the School of Medicine at the University of Michigan, Ann Arbor. He is the director of the University of Michigan Cancer Center Biostatistics Unit and director of the Cancer/Biostatistics training program. He received his B.A. in Mathematics from Cambridge University and his Ph.D. in Statistics from UC Berkeley. He was on the faculty at UCLA from 1983 to 1998, when he moved to the University of Michigan. He has had visiting positions at the Medical Research Council, Cambridge, England; the University of Adelaide; INSERM, Bordeaux; and CSIRO, Sydney, Australia. He is a previous winner of the Mortimer Spiegelman Award from the American Public Health Association and the Michael Fry Award from the Radiation Research Society. He has worked in various areas of Statistics and Biostatistics, including Box-Cox transformations, longitudinal and survival analysis, cure models, missing data, smoothing methods, clinical trial design, and surrogate and auxiliary variables. He has been heavily involved in collaborations in the areas of radiation oncology, cancer research and bioinformatics.
I have broad interests and expertise in developing statistical methodology and applying it in biomedical research, particularly in cancer research. I have undertaken research in power transformations, longitudinal modeling, survival analysis (particularly cure models), missing data methods, causal inference, and modeling of radiation oncology data. Recent interests, specifically related to cancer, are in statistical methods for genomic data, statistical methods for evaluating cancer biomarkers, surrogate endpoints, phase I trial design, statistical methods for personalized medicine, and prognostic and predictive model validation. I strive to develop principled methods that will lead to valid interpretations of the complex data collected in biomedical research.
As a faculty member within the University of Michigan Transportation Research Institute, Dr. Flannagan currently serves as Director of the Center for Management of Information for Safe and Sustainable Transportation (CMISST) and Head of the Statistics and Methods Group for the CDC-funded UM Injury Center. Dr. Flannagan has over 20 years of experience conducting data analysis and research on injury risk related to motor vehicle crashes and was responsible for the development of a model of injury outcome that allows side-by-side comparison of public health, vehicle, roadway and post-crash interventions (utmost.umtri.umich.edu). She has also applied statistical methods to understanding and evaluating benefits of crash-avoidance technologies, including evaluating safety in automated vehicles, and works to develop novel applications of statistics to analysis of driving data. Her current work with CMISST involves the fusion and analysis of large state-level crash databases, which are useful in analyzing the effect of a variety of countermeasures on crash involvement and injury risk. In addition, her group is working to make data available to researchers to expand the community of experts in transportation data analysis.
Q & A with Dr. Carol Flannagan
Q: When in your career did you realize that data (broadly speaking) could open a lot of doors, in terms of research?
I have loved data analysis since my undergraduate Experimental Design course sophomore year when I learned about analysis of 2X2 tables. Beyond that, I didn’t necessarily see it as a career move or a research program. It was much later at UMTRI, when I started thinking about how much data we had scattered around the building and how great it would be to round up those data and share them, that I thought explicitly about data per se as opening research doors.
In 2010, I was asked to head up the Transportation Data Center at UMTRI. In spite of its name, the group was doing very limited things with crash data at the time. After a few years, the group got a new name and some support to grow from UMOR. That, along with a number of research projects over the years, has led to our current incarnation with substantially upgraded data hosting for the state of Michigan’s crash data and strong capabilities in data analysis of all kinds of newer transportation data. For example, we have collaborated with the Engineering Systems group, GM and OnStar to conduct a series of studies funded by NHTSA to analyze driver response to crash avoidance systems using OnStar data on a large scale.
With the MIDAS transportation project, we are moving forward on several fronts. Most importantly, we are developing a high-performance data access and processing system to handle the large driving datasets that UMTRI has collected (and continues to collect). This system can integrate automated video processing with analysis and/or querying of time series data (e.g., speed, acceleration, etc.). For my group, this new capability opens new doors in both data sharing and data analysis.
Q: How has the field, and specifically the use of “Big Data,” changed during your career?
Wow… My first analyses as an undergrad were computed by hand as a matter of course. In grad school, since we needed to access terminals to use MTS (the Michigan Terminal System), it was often faster to just do it by hand (using a calculator and paper and pencil). What that meant is that computation was a true limitation on how much data could be analyzed. Even after personal computers were a regular fixture in research (and at home), computation of any size (e.g., more subjects or more variables) was still slow. My dissertation included a simulation component and I used to set simulations running every night before I went to bed. The simulation would finish around 2-3 a.m., and the computer would suddenly turn on, sending bright light into the room and waking my husband and me up. Those simulations can all be done in minutes or seconds now. Imagine how much more I could have done with current systems!
In the last 5 years, I’ve gotten much more involved in analysis of driving data to understand benefits of crash avoidance systems. We are often searching these data for extracts that contain certain kinds of situations that are relevant to such systems (e.g., hard braking events related to forward collision warnings). This is one use of Big Data—to observe enough that you can be sure of finding examples of what you’re interested in. However, in the last year or two, I have been more involved in full-dataset analyses, large-scale triggered data collection, and large-scale kinematic simulations. These are all enabled by faster computing and because of that, we can find richer answers to more questions.
Q: What role does your association with MIDAS play in this continuing development?
One of the advantages of being associated with MIDAS is that it gives me access to faculty who are interested in data as opposed to transportation. It’s not that I need other topic areas, but I often find that when I listen to data and methods talks in totally different content areas, I can see how those methods could apply to problems I care about. For example, the challenge grant in the social sciences entitled “Computational Approaches for the Construction of Novel Macroeconomic Data,” is tackling a problem for economists that transportation researchers share. That is, how do you help researchers who may not have deep technical skill in data querying get at extracts from very large, complex datasets easily? Advances that they make in that project could be transferred to transportation datasets, which share many of the characteristics of social-media datasets.
Another advantage is the access to students who are in data-oriented programs (e.g., statistics, data science, computer science, information, etc.) who want real-world experiences using their skills. I have had a number of data science students reach out to me and have worked closely with two of them on independent studies addressing Big Data analysis problems in transportation. One was related to measuring and tracking vehicle safety in automated vehicles and the other was in text mining of a complaints dataset to try to find failure patterns that might indicate the presence of a vehicle defect.
Q: What are the next research challenges for transportation generally and your work specifically?
Transportation is in the midst of a once-in-a-lifetime transformation. In my lifetime, I expect to be driven to the senior center and my gerontologist appointments by what amounts to a robot (on wheels). Right now, I’m working on research problems that will help bring that to fruition. Data science is absolutely at the core of that transformation and the problems are wide open. I’m particularly concerned that our research datasets need to be managed in a way that opens them to a broad audience where we support both less technical interactions (e.g., data extracts with a very simple querying system) and much more technical interactions (e.g., full-dataset automatic video processing to identify drivers using cell phones in face video). I’m also interested in new and efficient large-scale data collection methods, particularly those based on the idea of “smart” or triggered sampling rather than just acquisition of huge datasets “because the information I want is somewhere in there…if only I can find it…”
My own work has mostly been about safety, and although we envision a future without traffic fatalities (or ideally without crashes), the current fatality count has gone up over the past couple of years. Thus, I spend time analyzing crash data for new insights into changes in safety over time and even how the economy influences fatality counts. Much of my work is for the State of Michigan around predicting how many fatalities there will be in the next few years and identifying the benefits and potential benefits of various interventions. My web tool, UTMOST (the Unified Theory Mapping Opportunities for Safety Technologies), allows visualization of the benefits, or potential benefits, of many different kinds of interventions, including public health policy, vehicle technology, and infrastructure improvements. I think this kind of integrated approach will be necessary to achieve the goal of zero fatalities over the next 20 years. Finally, a significant part of my research program will continue to be development and use of statistical methods to measure, predict, and understand safety in transportation. How can we tell if AVs are safer than humans? What should future traffic safety data systems look like? How can we integrate data systems (e.g., crash and injury outcome) to better understand safety and figure out how to prioritize countermeasure development? How can machine learning and other big data statistical tools help us make the most of driving data?
Kevin Ward, MD, is Professor of Emergency Medicine in the Department of Emergency Medicine at the University of Michigan Medical School.
Dr. Ward is the director of the Weil Institute for Critical Care Research and Innovation. Dr. Ward’s research interests span the field of critical illness and injury ranging from combat casualty care to the intensive care unit. His approach is to develop and leverage broad platform technologies capable of use throughout all echelons of care of the critically ill and injured as well as in all age groups. Dr. Ward’s work has been funded by the NIH, Department of Defense, NSF, and industry.
Kerby Shedden has broad interests involving applied statistics, data science and computing with data. Through his work directing the data science consulting service he has worked in a wide variety of application domains including numerous areas within health science, social science, and transportation research. A current major focus is development of software tools that exploit high performance computing infrastructure for statistical analysis of health records, and sensor data from vehicles and road networks.
Margaret C. Levenstein, PhD, is the Director of ICPSR, Co-Director, Michigan Federal Statistical Research Data Center, Research Professor, School of Information, Research Professor, Survey Research Center, Institute for Social Research, and Adjunct Professor of Business Economics and Public Policy, Ross School of Business at the University of Michigan, Ann Arbor.
The Language and Information Technologies (LIT) lab, directed by Rada Mihalcea, conducts research in natural language processing, information retrieval, and applied machine learning. The group specifically focuses on projects concerned with text semantics (word/text similarity, large semantic networks), behavior analysis (multilingual opinion analysis, multimodal models for deception detection, emotion recognition, alertness detection, stress/anxiety detection, analysis of counseling speech), big data for cross-cultural analysis (geotagging, understanding cross-cultural differences and worldview), and educational applications (pedagogical search engines, automatic short answer grading, conversational technologies for student advising).
Several of the projects in the LIT lab are interdisciplinary, acknowledging the fact that language can be used to deepen our understanding in many different fields, such as psychology, sociology, history, and others. Some of the ongoing projects in the lab are collaborations with psychologists and sociologists, and target a rich modeling of human behavior through language analysis, seeking answers to questions such as “what are the core values of a culture?” and “are there differences in how different groups of people perceive the surrounding world?” The lab is also actively working on multimodal projects to track and understand human behavior, where language analysis is complemented with other channels such as facial expressions, gestures, and physiological signals.
Of interest, Prof. Mihalcea was quoted in a Refinery29 story about sexism and today’s virtual assistants such as Amazon’s Alexa, Apple’s Siri, and Microsoft’s Cortana.
I am broadly interested in statistical inference, which is informally defined as the process of turning data into prediction and understanding. I like to work with richly structured data, such as those extracted from texts, images and other spatiotemporal signals. In recent years I have gravitated toward a field in statistics known as Bayesian nonparametrics, which provides a fertile and powerful mathematical framework for the development of many computational and statistical modeling ideas. My motivation for all this came originally from an early interest in machine learning, which continues to be a major source of research interest. A primary focus of my group’s research in machine learning is to develop more effective inference algorithms using stochastic, variational and geometric viewpoints.
The Athey Lab in the Department of Computational Medicine and Bioinformatics (DCM&B), University of Michigan Medical School, is led by Dr. Brian Athey (see Atheylab.ccmb.med.umich.edu).
The lab is working on two complementary domains of research and development.
1. The Athey Lab’s recent research interests are in the creation and use of bioinformatics pipelines and machine learning methods to radically improve the efficacy of psychiatric pharmacogenomics, allowing patients to take the most effective drug for their illness and suffer the fewest side effects. This area of research centers on the exploration of the ‘pharmacoepigenome’ in psychiatry, neurology, anesthesia and addiction medicine. This research employs high-throughput 4D microscopic imaging of enhancers, promoters and chromatin features, using fluorescence in situ hybridization (FISH). These methods are coupled with Hi-C chromatin conformation capture, chromatin state annotation, localization in postmortem human brain tissue and induced neuronal pluripotent stem cells, and machine learning for identification of regulatory variants, to provide insight into the genetic and epigenetic mechanisms of inter-individual and inter-cohort differences in psychotropic drug response.
2. The Athey Lab is also developing new high-throughput methods to analyze images of genes in the context of the cellular nucleus to better understand the machinery of bioinformatics in context. One main area of research is the application of high resolution fluorescence optical microscopy coupled with high-throughput analysis, 3D imaging and machine learning to explore the chromatin structure and nuclear architecture of cells. This research emphasizes the convergence between 3D structural predictions and 3D structural measurements with microscopy, to provide insight into the transcriptional architecture of the interphase nucleus.
Collaborations: The lab works very closely with Assurex Health, Inc. (Mason, Ohio) on project 1. This work is governed by a Regents-approved Master Agreement between U-M and Assurex Health, Inc. Similarly, the lab collaborates closely with the tranSMART Foundation (tF), and this is also governed by a Master Agreement between U-M and tF.
The lab collaborates with the Brady Urological Institute at Johns Hopkins Medical School, led by Dr. Ken Pienta, to build on their extensive 2D characterization of prostate tumors by introducing simple chromatin dyes, advanced biomarkers, and 3D imaging systems.
The lab works closely with Dr. John Wiley of the University of Michigan Health System, studying the effect of glucocorticoids on the neuroblastoma-based cell line Sy5y before and after treatment with retinoic acid and BDNF, particularly in their terminally differentiated condition.
The lab also collaborates with Dr. Christoph Cremer from the Institute of Molecular Biology in Mainz, Germany, investigating super-resolution microscopy techniques.