Kathleen M Bergen, PhD, is Associate Research Scientist in the School for Environment and Sustainability at the University of Michigan, Ann Arbor. Dr. Bergen currently has interim administrative oversight of the SEAS Environmental Spatial Analysis Laboratory (ESALab) and is interim Director of the campus-wide Graduate Certificate Program in Spatial Analysis.
Prof. Bergen works in the areas of human dimensions of environmental change; remote sensing, GIS and biodiversity informatics; and environmental health and informatics. Her focus is on combining field and geospatial data and methods to study the pattern and process of ecological systems, biodiversity and health. She also strives to build bridges between science and social science to understand the implications of human actions on the social and natural systems of which we are a part. She teaches courses in Remote Sensing and Geographic Information Systems. Formerly she served as a founding member of the UM LIbrary’s MIRLYN implementation team, directed the University Map Collection, and set up the M-Link reference information network.
Brenda Gillespie, PhD, is Associate Director in Consulting for Statistics, Computing and Analytics Research (CSCAR) with a secondary appointment as Associate Research Professor in the department of Biostatistics in the School of Public Health at the University of Michigan, Ann Arbor. She provides statistical collaboration and support for numerous research projects at the University of Michigan. She teaches Biostatistics courses as well as CSCAR short courses in survival analysis, regression analysis, sample size calculation, generalized linear models, meta-analysis, and statistical ethics. Her major areas of expertise are clinical trials and survival analysis.
Prof. Gillespie’s research interests are in the area of censored data and clinical trials. One research interest concerns the application of categorical regression models to the case of censored survival data. This technique is useful in modeling the hazard function (instead of treating it as a nuisance parameter, as in Cox proportional hazards regression), or in the situation where time-related interactions (i.e., non-proportional hazards) are present. An investigation comparing various categorical modeling strategies is currently in progress.
Another area of interest is the analysis of cross-over trials with censored data. Brenda has developed (with M. Feingold) a set of nonparametric methods for testing and estimation in this setting. Our methods out-perform previous methods in most cases.
Brian Min, PhD, is Associate Professor of Political Science in the College of Literature, Science, and the Arts at the University of Michigan, Ann Arbor. Prof. Min holds secondary appointments as Research Associate Professor in the Center for Political Studies and the Institute for Social Research.
Prof. Min studies the political economy of development with an emphasis on distributive politics, public goods provision, and energy politics. His research uses high-resolution satellite imagery to study the distribution of electricity across and within the developing world. He has collaborated closely with the World Bank using satellite technologies and statistical algorithms to monitor electricity access in India and Africa, including the creation of a web platform to visualize twenty years of change in light output for every village in India (http://nightlights.io).
Kai S. Cortina, PhD, is Professor of Psychology in the College of Literature, Science, and the Arts at the University of Michigan, Ann Arbor.
Prof. Cortina’s major research revolves around the understanding of children’s and adolescents’ pathways into adulthood and the role of the educational system in this process. The academic and psycho-social development is analyzed from a life-span perspective exclusively analyzing longitudinal data over longer periods of time (e.g., from middle school to young adulthood). The hierarchical structure of the school system (student/classroom/school/district/state/nations) requires the use of statistical tools that can handle these kind of nested data.
The Corso group’s main research thrust is high-level computer vision and its relationship to human language, robotics and data science. They primarily focus on problems in video understanding such as video segmentation, activity recognition, and video-to-text; methodology, models leveraging cross-model cues to learn structured embeddings from large-scale data sources as well as graphical models emphasizing structured prediction over large-scale data sources are their emphasis. From biomedicine to recreational video, imaging data is ubiquitous. Yet, imaging scientists and intelligence analysts are without an adequate language and set of tools to fully tap the information-rich image and video. His group works to provide such a language. His long-term goal is a comprehensive and robust methodology of automatically mining, quantifying, and generalizing information in large sets of projective and volumetric images and video to facilitate intelligent computational and robotic agents that can natural interact with humans and within the natural world.
My current research focus is on modeling and simulating the value and benefits of various data sharing and policy trade offs. Typically these utilize system dynamics methodologies and tools.
I also have considerable experience across multiple industries with developing processes to enable industry and faculty to identify and solve data science problems using SAS tools.
Omid Dehzangi, PhD, is Assistant Professor of Computer and Information Science, College of Engineering and Computer Science, at the University of Michigan, Dearborn.
Wearable health technology is drawing significant attention for good reasons. The pervasive nature of such systems providing ubiquitous access to the continuous personalized data will transform the way people interact with each other and their environment. The resulting information extracted from these systems will enable emerging applications in healthcare, wellness, emergency response, fitness monitoring, elderly care support, long-term preventive chronic care, assistive care, smart environments, sports, gaming, and entertainment which create many new research opportunities and transform researches from various disciplines into data science which is the methodological terminology for data collection, data management, data analysis, and data visualization. Despite the ground-breaking potentials, there are a number of interesting challenges in order to design and develop wearable medical embedded systems. Due to limited available resources in wearable processing architectures, power-efficiency is demanded to allow unobtrusive and long-term operation of the hardware. Also, the data-intensive nature of continuous health monitoring requires efficient signal processing and data analytic algorithms for real-time, scalable, reliable, accurate, and secure extraction of relevant information from an overwhelmingly large amount of data. Therefore, extensive research in their design, development, and assessment is necessary. Embedded Processing Platform Design The majority of my work concentrates on designing wearable embedded processing platforms in order to shift the conventional paradigms from hospital-centric healthcare with episodic and reactive focus on diseases to patient-centric and home-based healthcare as an alternative segment which demands outstanding specialized design in terms of hardware design, software development, signal processing and uncertainty reduction, data analysis, predictive modeling and information extraction. The objective is to reduce the costs and improve the effectiveness of healthcare by proactive early monitoring, diagnosis, and treatment of diseases (i.e. preventive) as shown in Figure 1.
Our lab’s research interests are in the areas of oncology bioinformatics, multimodality image analysis, and treatment outcome modeling. We operate at the interface of physics, biology, and engineering with the primary motivation to design and develop novel approaches to unravel cancer patients’ response to chemoradiotherapy treatment by integrating physical, biological, and imaging information into advanced mathematical models using combined top-bottom and bottom-top approaches that apply techniques of machine learning and complex systems analysis to first principles and evaluating their performance in clinical and preclinical data. These models could be then used to personalize cancer patients’ chemoradiotherapy treatment based on predicted benefit/risk and help understand the underlying biological response to disease. These research interests are divided into the following themes:
- Bioinformatics: design and develop large-scale datamining methods and software tools to identify robust biomarkers (-omics) of chemoradiotherapy treatment outcomes from clinical and preclinical data.
- Multimodality image-guided targeting and adaptive radiotherapy: design and develop hardware tools and software algorithms for multimodality image analysis and understanding, feature extraction for outcome prediction (radiomics), real-time treatment optimization and targeting.
- Radiobiology: design and develop predictive models of tumor and normal tissue response to radiotherapy. Investigate the application of these methods to develop therapeutic interventions for protection of normal tissue toxicities.
As faculty member within the University of Michigan Transportation Research Institute, Dr. Flannagan currently serves as Director of the Center for Management of Information for Safe and Sustainable Transportation (CMISST) and Head of the Statistics and Methods Group for the CDC-funded UM Injury Center. Dr. Flannagan has over 20 years of experience conducting data analysis and research on injury risk related to motor vehicle crashes and was responsible for the development of a model of injury outcome that allows side-by-side comparison of public health, vehicle, roadway and post-crash interventions (utmost.umtri.umich.edu). She has also applied statistical methods to understanding and evaluating benefits of crash-avoidance technologies, including evaluating safety in automated vehicles, and works to develop novel applications of statistics to analysis of driving data. Her current work with CMISST involves the fusion and analysis of large state-level crash databases, which are useful in analyzing the effect of a variety of countermeasures on crash involvement and injury risk. In addition, her group is working to make data available to researchers to expand the community of experts in transportation data analysis.
Q & A with Dr. Carol Flannagan
Q: When in your career did you realize that data (broadly speaking) could open a lot of doors, in terms of research?
I have loved data analysis since my undergraduate Experimental Design course sophomore year when I learned about analysis of 2X2 tables. Beyond that, I didn’t necessarily see it as a career move or a research program. It was much later at UMTRI, when I started thinking about how much data we had scattered around the building and how great it would be to round up those data and share them, that I though explicitly about data per se as opening research doors.
In 2010, I was asked to head up the Transportation Data Center at UMTRI. In spite of its name, the group was doing very limited things with crash data at the time. After a few years, the group got a new name and some support to grow from UMOR. That, along with a number of research projects over the years, has led to our current incarnation with substantially upgraded data hosting for the state of Michigan’s crash data and strong capabilities in data analysis of all kinds of newer transportation data. For example, we have collaborated with the Engineering Systems group, GM and OnStar to conduct a series of studies funded by NHTSA to analyze driver response to crash avoidance systems using OnStar data on a large scale.
With the MIDAS transportation project, we are moving forward on several fronts. Most importantly, we are developing a high-performance data access and processing system to handle the large driving datasets that UMTRI has collected (and continues to collect). This system can integrate automated video processing with analysis and/or querying of time series data (e.g., speed, acceleration, etc.). For my group, this new capability opens new doors in both data sharing and data analysis.
Q: How has the field, and specifically the use of “Big Data,” changed during your career?
Wow… My first analyses as an undergrad were computed by hand as a matter of course. In grad school, since we needed to access terminals to use MTS (the Michigan Terminal System), it was often faster to just do it by hand (using a calculator and paper and pencil). What that meant is that computation was a true limitation on how much data could be analyzed. Even after personal computers were a regular fixture in research (and at home), computation of any size (e.g., more subjects or more variables) was still slow. My dissertation included a simulation component and I used to set simulations running every night before I went to bed. The simulation would finish around 2-3 a.m., and the computer would suddenly turn on, sending bright light into the room and waking my husband and me up. Those simulations can all be done in minutes or seconds now. Imagine how much more I could have done with current systems!
In the last 5 years, I’ve gotten much more involved in analysis of driving data to understand benefits of crash avoidance systems. We are often searching these data for extracts that contain certain kinds of situations that are relevant to such systems (e.g., hard braking events related to forward collision warnings). This is one use of Big Data—to observe enough that you can be sure of finding examples of what you’re interested in. However, in the last year or two, I have been more involved in full-dataset analyses, large-scale triggered data collection, and large-scale kinematic simulations. These are all enabled by faster computing and because of that, we can find richer answers to more questions.
Q: What role does your association with MIDAS play in this continuing development?
One of the advantages of being associated with MIDAS is that it gives me access to faculty who are interested in data as opposed to transportation. It’s not that I need other topic areas, but I often find that when I listen to data and methods talks in totally different content areas, I can see how those methods could apply to problems I care about. For example, the challenge grant in the social sciences entitled “Computational Approaches for the Construction of Novel Macroeconomic Data,” is tackling a problem for economists that transportation researchers share. That is, how do you help researchers who may not have deep technical skill in data querying get at extracts from very large, complex datasets easily? Advances that they make in that project could be transferred to transportation datasets, which share many of the characteristics of social-media datasets.
Another advantage is the access to students who are in data-oriented programs (e.g., statistics, data science, computer science, information, etc.) who want real-world experiences using their skills. I have had a number of data science students reach out to me and worked closely with two of them on independent studies where we worked together on Big Data analysis problems in transportation. One was related to measuring and tracking vehicle safety in automated vehicles and the other was in text mining of a complaints dataset to try to find failure patterns that might indicate the presence of a vehicle defect.
Q: What are the next research challenges for transportation generally and your work specifically?
Transportation is in the midst of a once-in-a-lifetime transformation. In my lifetime, I expect to be driven to the senior center and my gerontologist appointments by what amounts to a robot (on wheels). Right now, I’m working on research problems that will help bring that to fruition. Data science is absolutely at the core of that transformation and the problems are wide open. I’m particularly concerned that our research datasets need to be managed in a way that opens them to a broad audience where we support both less technical interactions (e.g., data extracts with a very simple querying system) and much more technical interactions (e.g., full-dataset automatic video processing to identify drivers using cell phones in face video). I’m also interested in new and efficient large-scale data collection methods, particularly those based on the idea of “smart” or triggered sampling rather than just acquisition of huge datasets “because the information I want is somewhere in there…if only I can find it…”
My own work has mostly been about safety, and although we envision a future without traffic fatalities (or ideally without crashes), the current fatality count has gone up over the past couple of years. Thus, I spend time analyzing crash data for new insights into changes in safety over time and even how the economy influences fatality counts. Much of my work is for the State of Michigan around predicting how many fatalities there will be in the next few years and identifying the benefits and potential benefits of various interventions. My web tool, UTMOST, (the Unified Theory Mapping Opportunities for Safety Technologies) allows visualization of the benefits, or potential benefits, of many different kinds of interventions, including public health policy, vehicle technology, and infrastructure improvements. I think this kind of integrated approach will be necessary to achieve the goal of zero fatalities over the next 20 years. Finally, a significant part of my research program will continue to be development and use of statistical methods to measure, predict, and understand safety in transportation. How can we tell if AVs are safer than humans? What should future traffic safety data systems look like? How can we integrate data systems (e.g., crash and injury outcome) to better understand safety and figure out how to prioritize countermeasure development? How can machine learning and other big data statistical tools help us make the most of driving data?