February 3, 2022
Despite the power of modern research methodology, some fundamental questions still elude biologists, including how to identify, classify, and analyze individual cells. Recent advances are starting to make the analysis of individual cells possible, opening up new possibilities in biology and in a variety of other fields of study.
The Michigan Center for Single-Cell Genomic Data Analytics was launched with a MIDAS Challenge Award. Dr. Jun Li (Professor, Human Genetics and Computational Medicine and Bioinformatics), a MIDAS Affiliate faculty member and co-principal investigator, leads a large, multidisciplinary team of faculty from mathematics, statistics, engineering, biology, and medical research. Using the seed money as a jumping-off point, the team has carried out a cluster of research projects and has gone on to secure $30 million in external grants in less than three years from the National Institutes of Health, the National Science Foundation, the Chan-Zuckerberg Initiative, and the Open Philanthropy Project.
Equally important, the team spurred cross-pollination through research events (2018 winter retreat, 2018 symposium, 2017 symposium), collaboration, and consultation, and helped build a strong presence for single-cell biology at U-M. Dozens of other U-M research teams have now adopted single-cell data science methodologies in their own research. Recently, the team members and their collaborators also started the Single Cell Spatial Analysis Program with $7 million of funding from the university’s Biosciences Initiative. This is a clear example of how data science can transform a traditional area of research.
The ability to focus on a single cell in a sample is revolutionary. Previously, even the smallest samples contained a mix of many cell types, forcing scientists to make assumptions about what each cell is doing and how it is “living its life”: Has it multiplied? Has it changed type? Is it mutating? Being able to home in on individual cells allows for a better understanding of how the human body regulates growth, as well as of the causes of irregular growths such as cancer and the efficacy of treatments for them.
However, because a single cell contains only a small amount of genetic material, fragments of the genome are missing from the analysis, which makes it difficult to assemble reliable genomic information. To address this “sparse read counts data” problem, data scientists and mathematicians on the team have worked in tandem with the biological researchers to implement new methodologies for sparse data analysis. These methodologies deal with data normalization, batch effect detection and correction, marker selection, classification, rare class identification, differential expression, and network and phylogenetic inference. With such methodologies, the team has made significant progress on four biological questions: 1) intra-tumor heterogeneity, cancer stem cells in metastasis and treatment resistance, and cancer genome evolution; 2) spermatogenesis as a model for cell fate determination during development; 3) transcriptional complexity and gene regulation at the single-cell level; and 4) molecular changes at the single-cell level as a result of environmental exposures and windows of susceptibility.