Dr. Gen Li is an Assistant Professor in the Department of Biostatistics. He is devoted to developing new statistical methods for analyzing complex biomedical data, including multi-way tensor array data, multi-view data, and compositional data. His methodological research interests include dimension reduction, predictive modeling, association analysis, and functional data analysis. He also has research interests in scientific domains including microbiome and genomics.
Yixin Wang works in the fields of Bayesian statistics, machine learning, and causal inference, with applications to recommender systems, text data, and genetics. She also works on algorithmic fairness and reinforcement learning, often via connections to causality. Her research centers around developing practical and trustworthy machine learning algorithms for large datasets that can enhance scientific understanding and inform daily decision-making. Her research interests lie at the intersection of theory and applications.
My research focuses on building infrastructure for public health and health science research organizations to take advantage of cloud computing, strong software engineering practices, and MLOps (machine learning operations). By equipping biomedical research groups with tools that facilitate automation, better documentation, and portable code, we can improve the reproducibility and rigor of science while scaling up the kind of data collection and analysis possible.
Research topics include:
1. Open source software and cloud infrastructure for research,
2. Software development practices and conventions that work for academic units, like labs or research centers, and
3. The organizational factors that encourage best practices in reproducibility, data management, and transparency.
The practice of science is a tug of war between competing incentives: the drive to do a lot fast, and the need to generate reproducible work. As data grows in size, code increases in complexity and the number of collaborators and institutions involved goes up, it becomes harder to preserve all the “artifacts” needed to understand and recreate your own work. Technical AND cultural solutions will be needed to keep data-centric research rigorous, shareable, and transparent to the broader scientific community.
My research concentrates on bioinformatics, proteomics, and data integration. I am particularly interested in mass spectrometry-based proteomics, software development for proteomics, cancer proteogenomics, and transcriptomics. The computational methods and tools previously developed by my colleagues and me, such as PepExplorer, MSFragger, Philosopher, and PatternLab for Proteomics, are among the most referenced proteome informatics tools and are used by hundreds of laboratories worldwide.
I am also a member of the Proteogenomics Data Analysis Center (UM-PGDAC), part of the NCI’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) initiative for processing and analyzing hundreds of cancer proteomics samples. UM-PGDAC develops advanced computational infrastructure for comprehensive and global characterization of genomics, transcriptomics, and proteomics data collected from several human tumor cohorts using NCI-provided biospecimens. Since 2019 I have been working as a bioinformatics data analyst with the University of Michigan Proteomics Resource Facility, which provides state-of-the-art proteomics capabilities to University of Michigan investigators, including Rogel Cancer Center investigators, as the Proteomics Shared Resource.
My research interest lies in applying data science for actionable transformation of human health from bench to bedside. Current research focus areas include cutting-edge single-cell sequencing informatics and genomics; precision medicine through integration of multi-omics data types; novel modeling and computational methods for biomarker research; and public health genomics. I apply my biomedical informatics and analytical expertise to study diseases such as cancers, as well as the impact of pregnancy and early-life complications on later-life diseases.
The current goal of our research is to learn enough about the physiology and ecology of microbes and microbial communities in the gut that we are able to engineer the gut microbiome to improve human health. The first target of our engineering is the production of butyrate – a common fermentation product of some gut microbes that is essential for human health. Butyrate is the preferred energy source for mitochondria in the epithelial cells lining the gut and it also regulates their gene expression.
One of the most effective ways to influence the composition and metabolism of the gut microbiota is through diet. In an interventional study, we have tracked responses in the composition and fermentative metabolism of the gut microbiota in >800 healthy individuals. Emerging patterns suggest several configurations of the microbiome that can result in increased production of butyrate. We have isolated the microbes that form an anaerobic food web to convert dietary fiber to butyrate and continue to make discoveries about their physiology and interactions. Based on these results, we have initiated a clinical trial in which we hope to prevent the development of graft-versus-host disease following bone marrow transplants by managing butyrate production by the gut microbiota.
We are also beginning to track hundreds of other metabolites from the gut microbiome that may influence human health. We use metagenomes and metabolomes to identify patterns that link the microbiota with their metabolites and then test those models in human organoids and gnotobiotic mice colonized with synthetic communities of microbes. This blend of wet-lab research in basic microbiology, data science, and ecology is moving us closer to engineering the gut microbiome to improve human health.
My lab has two main areas of focus: molecular characteristics of head and neck cancer, and the intersection of regulatory genomics and pathway analysis. With head and neck cancer, we study tumor subtypes and biomarkers of prognosis, treatment response, and recurrence. We apply integrative omics analyses, dimension reduction methods, and prediction techniques, with the ultimate goal of identifying patient subsets who would benefit from either an additional targeted treatment or de-escalated treatment to increase quality of life. For regulatory genomics and pathway analysis, we develop statistical tests that take into account important covariates and other variables for weighting observations.
The Aguilar group is focused on understanding transcriptional and epigenetic mechanisms of skeletal muscle stem cells in diverse contexts such as regeneration after injury and aging. We focus on this area because there are few to no therapies for skeletal muscle after injury or aging. We use various types of in vivo and in vitro models in combination with genomic assays and high-throughput sequencing to study these molecular mechanisms.
Our laboratory focuses on (1) the biology of cancer metastasis, especially bone metastasis, including the role of the host microenvironment; and (2) mechanisms of chemoresistance. We search for genes that regulate metastasis and the interaction between the host microenvironment and cancer cells. We are performing single-cell multiomics and spatial analysis to identify rare cell populations and promote precision medicine. Our research methodology uses a combination of molecular, cellular, and animal studies. The majority of our work is highly translational, so that our findings are clinically relevant. In terms of data science, we collaborate on applications of both established and novel methodologies to analyze high-dimensional data; deconvolution of high-dimensional data into a cellular and tissue context; spatial mapping of multiomic data; and heterogeneous data integration.
Our research aims to address fundamental problems in both biomedical research and computer science by developing new tools tailored to rapidly emerging single-cell omic technologies. Broadly, we seek to understand what genes define the complement of cell types and cell states within healthy tissue, how cells differentiate to their final fates, and how dysregulation of genes within specific cell types contributes to human disease. As computational method developers, we seek to both employ and advance the methods of machine learning, particularly for unsupervised analysis of high-dimensional data. We have particular expertise in manifold learning, matrix factorization, and deep learning approaches.