Veera Baladandayuthapani

Dr. Veera Baladandayuthapani is currently a Professor in the Department of Biostatistics at the University of Michigan (UM), where he is also the Associate Director of the Center for Cancer Biostatistics. He joined UM in Fall 2018 after spending 13 years in the Department of Biostatistics at the University of Texas MD Anderson Cancer Center, Houston, Texas, where he was a Professor and Institute Faculty Scholar and held adjunct appointments at Rice University, Texas A&M University and UT School of Public Health. His research interests are mainly in high-dimensional data modeling and Bayesian inference, including functional data analysis, Bayesian graphical models, Bayesian semi-/non-parametric models and Bayesian machine learning. These methods are motivated by large and complex datasets (a.k.a. Big Data) such as high-throughput genomics, epigenomics, transcriptomics and proteomics, as well as high-resolution neuroimaging and cancer imaging. His work has been published in top statistical/biostatistical/bioinformatics and biomedical/oncology journals. He has also co-authored a book on Bayesian analysis of gene expression data. He currently holds multiple PI-level grants from NIH and NSF to develop innovative and advanced biostatistical and bioinformatics methods for big datasets in oncology. He has also served as the Director of the Biostatistics and Bioinformatics Cores for the Specialized Programs of Research Excellence (SPOREs) in Multiple Myeloma and Lung Cancer, and as the Biostatistics & Bioinformatics platform leader for the Myeloma and Melanoma Moonshot Programs at MD Anderson. He is a fellow of the American Statistical Association and an elected member of the International Statistical Institute. He currently serves as an Associate Editor for the Journal of the American Statistical Association, Biometrics and Sankhya.

 

An example of horizontal (across cancers) and vertical (across multiple molecular platforms) data integration. Image from Ha et al (Nature Scientific Reports, 2018; https://www.nature.com/articles/s41598-018-32682-x)

Neda Masoud

The future of transportation lies at the intersection of two emerging trends, namely, the sharing economy and connected and automated vehicle technology. Our research group investigates the impact of these two major trends on the future of mobility, quantifying the benefits and identifying the challenges of integrating these technologies into our current systems.

Our research on shared-use mobility systems focuses on peer-to-peer (P2P) ridesharing and multi-modal transportation. We provide: (i) operational tools and decision support systems for shared-use mobility in legacy as well as connected and automated transportation systems, with a focus on system design and on routing, scheduling, and pricing mechanisms to serve on-demand transportation requests; (ii) insights for regulators and policy makers on the mobility benefits of multi-modal transportation; and (iii) planning tools that allow for informed regulation of the sharing economy.

In another line of research, we investigate the challenges that connected and automated vehicle technology must overcome before mass adoption can occur. Our research mainly focuses on (i) the transition of control authority between the human driver and the autonomous entity in semi-autonomous (SAE Level 3) vehicles; (ii) incorporating network-level information supplied by connected vehicle technology into traditional trajectory planning; (iii) improving vehicle localization by taking advantage of opportunities provided by connected vehicles; and (iv) cybersecurity challenges in connected and automated systems. We seek to quantify the mobility and safety implications of this disruptive technology and to provide insights that allow for informed regulation.

Xiang Zhou

My research is focused on developing efficient and effective statistical and computational methods for genetic and genomic studies. These studies often involve large-scale and high-dimensional data; examples include genome-wide association studies, epigenome-wide association studies, and various functional genomic sequencing studies such as bulk and single-cell RNAseq, bisulfite sequencing, ChIPseq, and ATACseq. Our method development is application oriented and targets the practical needs of these large-scale genetic and genomic studies, so it is not restricted to a particular methodological area. Our previous and current methods include, but are not limited to, Bayesian methods, mixed effects models, factor analysis models, sparse regression models, deep learning algorithms, clustering algorithms, integrative methods, spatial statistics, and efficient computational algorithms. By developing novel analytic methods, I seek to extract important information from these data and to advance our understanding of the genetic basis of phenotypic variation for various human diseases and disease-related quantitative traits.

A statistical method recently developed in our group aims to identify tissues that are relevant to diseases or disease-related complex traits by integrating tissue-specific omics studies (e.g., the ROADMAP project) with genome-wide association studies (GWASs). The heatmap displays the rank of 105 tissues (y-axis) in terms of their relevance for each of the 43 GWAS traits (x-axis), as evaluated by our method. Traits are organized by hierarchical clustering. Tissues are organized into ten tissue groups.

Jinseok Kim

Jinseok Kim, Ph.D., is Research Assistant Professor in the Institute for Social Research at the University of Michigan, Ann Arbor. Prof. Kim works on resolving named entity ambiguity in large-scale scholarly data (publication, patent, and funding records) in digital libraries. In particular, his current research focuses on developing methods for disambiguating author and affiliation names at digital library scale using various supervised machine learning approaches trained on automatically labeled data. Disambiguated data from multiple sources will be integrated and analyzed for insights into research production, scientific collaboration, funding evaluation, and research policy at a national level.

Peter Adriaens

My research focuses on the development and application of machine learning tools to large-scale financial and unstructured (textual) data to extract, quantify and predict the risk profiles and investment-grade ratings of private and public companies. Example datasets include social media and financial aggregators such as Bloomberg, Pitchbook, and Privco.

9.9.2020 MIDAS Faculty Research Pitch Video.

Jon Zelner

Jon Zelner, PhD, is Assistant Professor in the Department of Epidemiology at the University of Michigan School of Public Health. Dr. Zelner holds a second appointment in the Center for Social Epidemiology and Population Health.

Dr. Zelner’s research is focused on using spatial analysis, social network analysis and dynamic modeling to prevent infectious diseases, with a focus on tuberculosis and diarrheal disease. He is also interested in understanding how the social and biological causes of illness interact to generate observable patterns of disease in space and in social networks, across outcomes ranging from infection to mental illness.

 

A large spatial cluster of multi-drug resistant tuberculosis (MDR-TB) cases in Lima, Peru is highlighted in red. A key challenge in my work is understanding why these cases cluster in space: can social, spatial, and genetic data tell us where transmission is occurring and how to interrupt it?

Adriene Beltz

The goal of my research is to leverage network analysis techniques to uncover how the brain mediates sex hormone influences on gendered behavior across the lifespan. Specifically, my data science research concerns the creation and application of person-specific connectivity analyses, such as unified structural equation models, to time series data; these are intensive longitudinal data, including functional neuroimages, daily diaries, and observations. I then use these data science methods to investigate the links between androgens (e.g., testosterone) and estradiol at key developmental periods, such as puberty, and behaviors that typically show sex differences, including aspects of cognition and psychopathology.

A network map showing the directed connections among 25 brain regions of interest in the resting state frontoparietal network for an individual; data were acquired via functional magnetic resonance imaging. Black lines depict connections common across individuals in the sample, gray lines depict connections specific to this individual, solid lines depict contemporaneous connections (occurring in the same volume), and dashed lines depict lagged connections (occurring between volumes).

Kai S. Cortina

Kai S. Cortina, PhD, is Professor of Psychology in the College of Literature, Science, and the Arts at the University of Michigan, Ann Arbor.

Prof. Cortina’s major research revolves around understanding children’s and adolescents’ pathways into adulthood and the role of the educational system in this process. Academic and psycho-social development is analyzed from a life-span perspective, relying exclusively on longitudinal data collected over longer periods of time (e.g., from middle school to young adulthood). The hierarchical structure of the school system (student/classroom/school/district/state/nation) requires the use of statistical tools that can handle this kind of nested data.

 

Mingyan Liu

Mingyan Liu, PhD, is Professor of Electrical Engineering and Computer Science, College of Engineering, at the University of Michigan, Ann Arbor.

Prof. Liu’s research interests lie in optimal resource allocation, sequential decision theory, online and machine learning, and the performance modeling, analysis, and design of large-scale, decentralized, stochastic and networked systems, using tools including stochastic control, optimization, game theory and mechanism design. Her most recent research activities involve sequential learning, modeling and mining of large-scale Internet measurement data concerning cyber security, and incentive mechanisms for inter-dependent security games. Within this context, her research group is actively working on the following directions.

1. Cyber security incident forecast. The goal is to predict an organization’s likelihood of having a cyber security incident in the near future using a variety of externally collected Internet measurement data, some of which capture active maliciousness (e.g., spam and phishing/malware activities) while others capture more latent factors (e.g., misconfiguration and mismanagement). While machine learning techniques have been extensively used for detection in the cyber security literature, they have rarely been used for prediction. This is the first study on the prediction of broad categories of security incidents at an organizational level. Our work to date shows that with the right choice of feature set, highly accurate predictions can be achieved with a forecasting window of 6-12 months. Given the increasing number of high-profile security incidents (Target, Home Depot, JP Morgan Chase, and Anthem, just to name a few) and the social and economic costs they inflict, this work will have a major impact on cyber security risk management.

2. Detecting propagation in temporal data and its application to identifying phishing activities. Phishing activities propagate from one network to another in a highly regular fashion, a phenomenon known as fast-flux, though how the malicious campaign chooses its destination networks remains unknown. An interesting challenge is whether community detection methods can automatically extract the networks involved in a single phishing campaign; the ability to do so would be critical to forensic analysis. While there have been many results on detecting communities defined as subsets of relatively strongly connected entities, phishing activity exhibits a unique propagating property that is better captured by an epidemic model. By combining epidemic modeling and regression, we can identify this type of propagating community with reasonable accuracy; we are working on alternative methods as well.

3. Data-driven modeling of organizational and end-user security posture. We are working to build models that accurately capture the cyber security postures of end users as well as organizations, using large quantities of Internet measurement data. One domain concerns how software vendors disclose security vulnerabilities in their products, how they deploy software upgrades and patches, and, in turn, how end users install these patches; together, these elements lead to a better understanding of the overall state of vulnerability of a given machine and how it relates to user behavior. Another domain concerns the interconnectedness of today’s Internet, which implies that what we see from one network is inevitably related to others. We use this connection to gain better insight into the conditions of not just a single network viewed in isolation, but multiple networks viewed together.

A predictive analytics approach to forecasting cyber security incidents. We start from Internet-scale measurement on the security postures of network entities. We also collect security incident reports to use as labels in a supervised learning framework. The collected data then goes through extensive processing and domain-specific feature extraction. Features are then used to train a classifier that generates predictions when we input new features, on the likelihood of a future incident for the entity associated with the input features. We are also actively seeking to understand the causal relationship among different features and the security interdependence among different network entities. Lastly, risk prediction helps us design better incentive mechanisms which is another facet of our research in this domain.
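The train-then-predict step described in this caption can be illustrated with a minimal sketch. It is not the group's actual system: the two features (a misconfiguration score and a malicious-activity score), the synthetic labels, and the choice of logistic regression are all assumptions made purely for illustration, and the data are randomly generated rather than real measurements.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(features, labels, lr=0.5, epochs=500):
    """Fit logistic regression by full-batch gradient descent."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            # Prediction error for this entity under the current model.
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_risk(weights, bias, x):
    """Estimated probability of a future incident for feature vector x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Synthetic stand-in for extracted features (hypothetical, both in [0, 1]):
# x[0] = misconfiguration score, x[1] = malicious-activity score.
rng = random.Random(0)
features = [[rng.random(), rng.random()] for _ in range(200)]
# Synthetic stand-in for incident-report labels (an assumed rule, not real data).
labels = [1 if 0.6 * a + 0.4 * b > 0.5 else 0 for a, b in features]

weights, bias = train_logistic(features, labels)

# Score two previously unseen (hypothetical) entities.
risk_high = predict_risk(weights, bias, [0.9, 0.9])
risk_low = predict_risk(weights, bias, [0.1, 0.1])
print(f"poorly managed network:  {risk_high:.2f}")
print(f"well managed network:    {risk_low:.2f}")
```

In this toy setting the entity with higher scores on both features receives the higher predicted risk. A real pipeline would replace the synthetic data with measured features and reported incidents, and would evaluate predictions on incidents that occur after the features were collected.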
