The goal of my research is to leverage network analysis techniques to uncover how the brain mediates sex hormone influences on gendered behavior across the lifespan. Specifically, my data science research concerns the creation and application of person-specific connectivity analyses, such as unified structural equation models, to time series data; these are intensive longitudinal data, including functional neuroimages, daily diaries, and observations. I then use these data science methods to investigate the links between androgens (e.g., testosterone) and estradiol at key developmental periods, such as puberty, and behaviors that typically show sex differences, including aspects of cognition and psychopathology.

My research spans security, privacy, and optimization of data collection, particularly as applied to the Smart Grid, an augmented and enhanced paradigm for the conventional power grid. I am particularly interested in optimization approaches that explicitly incorporate a notion of security and/or privacy into the model. At the intersection of Intelligent Transportation Systems, the Smart Grid, and Smart Cities, I am interested in data privacy and energy usage in smart parking lots. Protecting data and availability, especially under Denial-of-Service attacks, is another dimension of my research interests. I am working on developing data privacy-aware bidding applications for Smart Grid Demand Response systems that do not rely on trusted third parties. Finally, I am interested in educational and pedagogical research about teaching computer science, Smart Grid, cyber security, and data privacy.

My research focuses on developing and applying computational and data-enabled methodology in the broader area of sustainability. Main thrusts are as follows:

- Human mobility dynamics. I am interested in mining large-scale real-world travel trajectory data to understand human mobility dynamics. This involves processing and analyzing travel trajectory data, characterizing individual mobility patterns, and evaluating the environmental impacts of transportation systems/technologies (e.g., electric vehicles, ride-sharing) based on individual mobility dynamics.
- Global supply chains. Increasingly intensified international trade has created a connected global supply chain network. I am interested in understanding the structure of the global supply chain network and economic/environmental performance of nations.
- Networked infrastructure systems. Many infrastructure systems (e.g., power grid, water supply infrastructure) are networked systems. I am interested in understanding the basic structural features of these systems and how they relate to the system-level properties (e.g., stability, resilience, sustainability).

A network visualization (force-directed graph) of the 2012 US economy using the industry-by-industry Input-Output Table (15 sectors) provided by BEA. Each node represents a sector. The size of the node represents the economic output of the sector. The size and darkness of links represent the value of exchanges of goods/services between sectors. An interactive version and other data visualizations are available at http://mingxugroup.org/

The GEMS (Graph Exploration and Mining at Scale) Lab develops new, fast, and principled methods for mining and making sense of large-scale data. Within data mining, we focus particularly on interconnected or graph data, which are ubiquitous. Examples include social networks, brain graphs or connectomes, traffic networks, computer networks, phone call and email communication networks, and more. We leverage ideas from a diverse set of fields, including matrix algebra, graph theory, information theory, machine learning, optimization, statistics, databases, and social science.

At a high level, we enable single-source and multi-source data analysis by providing scalable methods for fusing data sources, relating and comparing them, and summarizing patterns in them. Our work has applications to exploration of scientific data (e.g., connectomics or brain graph analysis), anomaly detection, re-identification, and more. Some of our current research directions include:

*Scalable Network Discovery from non-Network Data*: Although graphs are ubiquitous, they are not always directly observed. Discovering and analyzing networks from non-network data is a task with applications in fields as diverse as neuroscience, genomics, energy, economics, and more. However, traditional network discovery approaches are computationally expensive. We are currently investigating network discovery methods (especially from time series) that are both fast and accurate.
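To make the cost of traditional network discovery concrete, here is a minimal sketch of a naive baseline: compute the Pearson correlation between every pair of time series and keep an edge when its absolute value meets a threshold. This is an illustrative assumption, not the lab's method; the function names, the toy series, and the `threshold` value are hypothetical. The all-pairs loop is what makes this approach quadratic in the number of series.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def discover_network(series, threshold=0.8):
    """Return the set of edges (u, v) whose |correlation| >= threshold.

    `series` maps a node name to its time series. Every pair is compared,
    so the cost grows quadratically with the number of series.
    """
    edges = set()
    for u, v in combinations(sorted(series), 2):
        if abs(pearson(series[u], series[v])) >= threshold:
            edges.add((u, v))
    return edges


# Hypothetical toy data: "b" closely tracks "a", while "c" does not.
series = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.1, 5.9, 8.0, 10.1],
    "c": [3.0, -1.0, 2.0, -2.0, 1.0],
}
edges = discover_network(series)
```

On this toy input only the pair `("a", "b")` clears the threshold. Faster methods would aim to avoid the exhaustive pairwise comparison.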

*Graph Similarity and Alignment with Representation Learning*: Graph similarity and alignment (or fusion) are core components of various data mining tasks, such as anomaly detection, classification, clustering, transfer learning, sense-making, de-identification, and more. We are exploring representation learning methods that can generalize across networks and can be used in such multi-source network settings.

*Scalable Graph Summarization and Interactive Analytics*: Recent advances in computing resources have made processing enormous amounts of data possible, but the human ability to quickly identify patterns in such data has not scaled accordingly. Thus, computational methods for condensing and simplifying data are becoming an important part of the data-driven decision making process. We are investigating ways of summarizing data in a domain-specific way, as well as leveraging such methods to support interactive visual analytics.

*Distributed Graph Methods*: Many mining tasks for large-scale graphs involve solving iterative equations efficiently. For example, classifying entities in a network setting with limited supervision, finding similar nodes, and evaluating the importance of a node in a graph can all be expressed as linear systems that are solved iteratively. As the volume of generated data grows, the need for faster methods has permeated all of these applications and many more. Our focus is on speeding up such methods for large-scale graphs in both sequential and distributed environments.
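As one familiar instance of node importance expressed as an iteratively solved linear system, the sketch below computes PageRank by fixed-point iteration on x = (1 - d)/n + d * P^T x, where P is the row-normalized adjacency matrix. PageRank is chosen here purely for illustration and is not claimed to be the lab's method; the function name, the toy graph, and the default damping factor `d = 0.85` are assumptions.

```python
def pagerank(adj, damping=0.85, tol=1e-10, max_iter=1000):
    """Iteratively solve x = (1 - d)/n + d * P^T x for a directed graph.

    `adj` maps each node to the list of nodes it links to. Every node
    must appear as a key, even if its list of out-links is empty.
    """
    nodes = sorted(adj)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Start each node with the uniform teleportation term.
        new = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            out = adj[u]
            if out:
                share = damping * rank[u] / len(out)
                for v in out:
                    new[v] += share
            else:
                # Dangling node: spread its rank uniformly over all nodes.
                share = damping * rank[u] / n
                for v in nodes:
                    new[v] += share
        delta = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical toy graph: "c" receives links from "a", "b", and "d".
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
```

Each sweep touches every edge once, which is why this kind of computation is a natural target for both sequential speedups and distribution across machines.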

*User Modeling*: The large amounts of online user information (e.g., in social networks, online market places, streaming music and video services) have made possible the analysis of user behavior over time at a very large scale. Analyzing the user behavior can lead to better understanding of the user needs, better recommendations by service providers that lead to customer retention and user satisfaction, as well as detection of outlying behaviors and events (e.g., malicious actions or significant life events). Our current focus is on understanding career changes and predicting job transitions.

Pascal Van Hentenryck’s research is focused on artificial intelligence, data science, and optimization, with applications in mobility and transportation, energy systems, and computational social choice. He is currently leading the RITMO project, partly funded by MIDAS, which focuses on designing novel models of mobility, mathematical and algorithmic approaches to operate them optimally, and software architectures and data-privacy mechanisms to deploy them. The RITMO project is also in the process of deploying its technology in a number of significant case studies, with a particular focus on social equity.

Our lab’s research interests are in the areas of oncology bioinformatics, multimodality image analysis, and treatment outcome modeling. We operate at the interface of physics, biology, and engineering. Our primary motivation is to design and develop novel approaches to unravel cancer patients’ response to chemoradiotherapy treatment by integrating physical, biological, and imaging information into advanced mathematical models. We build these models with combined top-down and bottom-up approaches that apply techniques of machine learning and complex systems analysis to first principles, and we evaluate their performance on clinical and preclinical data. The models could then be used to personalize cancer patients’ chemoradiotherapy treatment based on predicted benefit/risk and to help understand the underlying biological response to disease. These research interests are divided into the following themes:

- Bioinformatics: design and develop large-scale datamining methods and software tools to identify robust biomarkers (-omics) of chemoradiotherapy treatment outcomes from clinical and preclinical data.
- Multimodality image-guided targeting and adaptive radiotherapy: design and develop hardware tools and software algorithms for multimodality image analysis and understanding, feature extraction for outcome prediction (radiomics), real-time treatment optimization and targeting.
- Radiobiology: design and develop predictive models of tumor and normal tissue response to radiotherapy. Investigate the application of these methods to develop therapeutic interventions that protect against normal tissue toxicities.

Elizaveta (Liza) Levina and her group work on various questions arising in the statistical analysis of large and complex data, especially networks and graphs. Our current focus is on developing rigorous and computationally efficient statistical inference on realistic models for networks. Current directions include community detection problems in networks (overlapping communities, networks with additional information about the nodes and edges, estimating the number of communities), link prediction (networks with missing or noisy links, networks evolving over time), prediction with data connected by a network (e.g., the role of friendship networks in the spread of risky behaviors among teenagers), and statistical analysis of samples of networks with applications to brain imaging, especially fMRI data from studies of mental health.

Siqian Shen is an Associate Professor of Industrial and Operations Engineering at the University of Michigan and also serves as an Associate Director in the Michigan Institute for Computational Discovery & Engineering (MICDE). Her theoretical research interests are in integer programming, stochastic/robust optimization, and network optimization. Applications include optimization and risk analysis of energy, healthcare, cloud-computing, and transportation systems. Her work has been supported by the National Science Foundation, Army Research Office, Department of Energy, and industrial funds. Her work has appeared in journals such as *Management Science*, *Operations Research*, *Mathematical Programming*, *Manufacturing and Service Operations Management*, *INFORMS Journal on Computing*, *Transportation Research Part B*, *IEEE Transactions on Power Systems*, and others. She is the recipient of the INFORMS Computing Society Best Student Paper award (runner-up), IIE Pritsker Doctoral Dissertation Award (1st Place), IBM Smarter Planet Innovation Faculty Award, and Department of Energy (DoE) Early Career Award.

My research group is engaged in fundamental research in the following areas:

- Statistical learning theory: We are developing theory and algorithms for prediction problems (e.g., learning to rank and multilabel learning) with complex label spaces and where the available human supervision is often weak.
- Sequential prediction in a game theoretic framework: We are trying to understand the power and limitations of sequential prediction algorithms when no probabilistic assumptions are placed on the data generating mechanism.
- High dimensional and network data analysis: We are developing scalable algorithms with provable performance guarantees for learning from high dimensional and network data.
- Optimization algorithms: We are creating incremental, distributed, and parallel algorithms for machine learning problems arising in today’s data rich world.
- Reinforcement learning: We are synthesizing concepts and techniques from artificial intelligence, control theory, and operations research for pushing the frontier in sequential decision making, with a focus on delivering personalized health interventions via mobile devices.

My research group pursues, and continues to actively search for, challenging machine learning problems that arise across disciplines including behavioral sciences, computational biology, computational chemistry, learning sciences, and network science.

Jon’s research focus is on nonlinear discrete optimization (NDO). Many practical engineering problems have physical aspects that are naturally modeled through smooth nonlinear functions, as well as design aspects that are often modeled with discrete variables. Research in NDO seeks to marry diverse techniques from classical areas of optimization, for example methods for smooth nonlinear optimization and methods for integer linear programming, with the idea of successfully attacking natural NDO models for practical engineering problems. One particular area of applied interest is environmental monitoring and the framework of maximum-entropy sampling.
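For readers unfamiliar with maximum-entropy sampling, the sketch below illustrates its usual Gaussian formulation: given a covariance matrix over n candidate monitoring sites, choose s sites whose covariance submatrix has the largest log-determinant. The smooth objective (log-determinant) combined with the discrete choice of sites is what makes this an NDO problem. The brute-force enumeration, the function names, and the toy covariance matrix are illustrative assumptions; exact methods for realistic sizes rely on much more sophisticated bounding techniques.

```python
import math
from itertools import combinations


def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(a[r][i]))
        if abs(a[pivot][i]) < 1e-12:
            return 0.0
        if pivot != i:
            a[i], a[pivot] = a[pivot], a[i]
            result = -result  # a row swap flips the sign
        result *= a[i][i]
        for r in range(i + 1, n):
            factor = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= factor * a[i][c]
    return result


def max_entropy_subset(cov, s):
    """Enumerate all size-s subsets and return (best subset, best log-det)."""
    best, best_val = None, -math.inf
    for subset in combinations(range(len(cov)), s):
        sub = [[cov[i][j] for j in subset] for i in subset]
        d = det(sub)
        if d <= 0.0:
            continue  # singular submatrix: log-determinant is undefined
        val = math.log(d)
        if val > best_val:
            best, best_val = subset, val
    return best, best_val


# Hypothetical covariance: sites 0 and 1 are highly correlated (redundant).
cov = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
subset, value = max_entropy_subset(cov, 2)
```

Here the selected pair is sites 0 and 2, the least correlated pair, since monitoring two nearly redundant sites adds little information. Enumeration scales combinatorially, which is why the discrete structure of the problem matters.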