Sriram Chandrasekaran

Sriram Chandrasekaran, PhD, is Assistant Professor of Biomedical Engineering in the College of Engineering at the University of Michigan, Ann Arbor.

Dr. Chandrasekaran’s Systems Biology lab develops computer models of biological processes to understand them holistically. Sriram is interested in deciphering how thousands of proteins work together at the microscopic level to orchestrate complex processes like embryonic development or cognition, and how this complex network breaks down in diseases like cancer. Systems biology software and algorithms developed by his lab are highlighted below and are available at http://www.sriramlab.org/software/.

– The INDIGO (INferring Drug Interactions using chemoGenomics and Orthology) algorithm predicts how antibiotics prescribed in combination will inhibit bacterial growth. INDIGO leverages genomics and drug-interaction data in the model organism E. coli to facilitate the discovery of effective combination therapies in less-studied pathogens, such as M. tuberculosis. (Ref: Chandrasekaran et al., Molecular Systems Biology 2016)

– GEMINI (Gene Expression and Metabolism Integrated for Network Inference) is a network curation tool. It allows rapid assessment of regulatory interactions predicted by high-throughput approaches by integrating them with a metabolic network. (Ref: Chandrasekaran and Price, PLoS Computational Biology 2013)

– ASTRIX (Analyzing Subsets of Transcriptional Regulators Influencing eXpression) uses gene expression data to identify regulatory interactions between transcription factors and their target genes. (Ref: Chandrasekaran et al. PNAS 2011)

– PROM (Probabilistic Regulation of Metabolism) enables the quantitative integration of regulatory and metabolic networks to build genome-scale integrated metabolic–regulatory models. (Ref: Chandrasekaran and Price, PNAS 2010)

 

Research Overview: We develop computational algorithms that integrate omics measurements to create detailed genome-scale models of cellular networks. Clinical applications of our algorithms include finding metabolic vulnerabilities in pathogens (M. tuberculosis) using PROM, and designing combination therapeutics for reducing antibiotic resistance using INDIGO.

Qiang Zhu

Dr. Zhu’s group conducts research on various topics in data science, ranging from foundational methodologies to challenging applications. In particular, the group has been investigating the fundamental issues and techniques for supporting various types of queries (including range queries, box queries, k-NN queries, and hybrid queries) on large datasets in a non-ordered discrete data space (NDDS). A number of novel indexing and searching techniques that exploit the unique characteristics of an NDDS have been developed. The group has also been studying the issues and techniques for storing and searching large-scale k-mer datasets for various genome sequence analysis applications in bioinformatics. A virtual approximate store approach to supporting repetitive big data in genome sequence analyses and several new sequence analysis techniques have been proposed. In addition, the group has been researching the challenges and methods for processing and optimizing a new type of query, the so-called progressive query, which a user formulates on the fly in multiple steps. Such queries are widely used in many application domains, including e-commerce, social media, business intelligence, and decision support. Other research topics studied by the group include streaming data processing, self-managing databases, spatio-temporal data indexing, data privacy, Web information management, and vehicle drive-through wireless services.

Using a data-partitioning based index tree (the ND-tree) to find sequences that are similar (with distance 1) to a given query sequence from a large sequence database in a Non-ordered Discrete Data Space (NDDS).
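The kind of similarity query described in the caption above can be illustrated with a small sketch. It makes two assumptions that the text does not state: that "distance" means Hamming distance (the number of positions at which two equal-length sequences differ), and that a plain linear scan stands in for the ND-tree. The function names `hamming` and `similar_sequences` and the toy database are hypothetical. An index such as the ND-tree exists precisely to avoid scanning every sequence, so this shows only what the query returns, not how the index finds it.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def similar_sequences(database, query, max_distance=1):
    """Linear scan (NOT an ND-tree): keep sequences within max_distance of query."""
    return [s for s in database if hamming(s, query) <= max_distance]


# Toy database of short DNA-like sequences (illustrative data only).
database = ["ACGT", "ACGA", "TCGT", "TTTT", "ACCT"]
matches = similar_sequences(database, "ACGT")
print(matches)
```

Because letters such as A, C, G and T have no natural ordering, the comparison uses only equality between symbols, which is the defining property of a non-ordered discrete space.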

Stephen Smith

The Smith lab group is primarily interested in examining evolutionary processes using new data sources and analysis techniques. We develop new methods to address questions about the rates and modes of evolution using the large data sources that have become more common in the biological disciplines over the last ten years. In particular, we use DNA sequence data to construct phylogenetic trees and conduct additional analyses about processes of evolution on these trees. In addition to this research program, we also address how new data sources can facilitate new research in evolutionary biology. To this end, we sequence transcriptomes, primarily in plants, with the goal of better understanding where, within the genome and within the phylogeny, processes like gene duplication and loss, horizontal gene transfer, and increased rates of molecular evolution occur.

A rough draft of the first comprehensive tree of life, showing the links between all of the more than 2.3 million named species of animals, plants and microorganisms. The draft was constructed by combining more than 450 existing trees with a comprehensive taxonomy. Because the tree is large, only lineages with at least 500 species are shown. The colors correspond to the amount of publicly available DNA data for each lineage (red = high, blue = low), giving an idea of the amount of available information.

Roman Vershynin

Prof. Vershynin’s main area of expertise is high-dimensional probability and its applications. He is interested in random geometric structures that appear in various data science problems. The following is a sample of his recent projects:

1. High-dimensional inference from nonlinear data. Sometimes we are given certain observations of an unknown vector that encodes useful but hidden information, and we want to compute that vector. Examples include compressed sensing, linear and non-linear regression, as well as binary (yes-no) observations. We are developing methods that can estimate the hidden vector without even knowing the nature of the non-linearity of observations. Areas of application include survey methodologies, signal processing, and various high-dimensional classification problems.

2. Structure mining in networks. Complex data sets such as networks often have latent structures, for example clusters or communities. We are interested in developing efficient methods to discover such latent structures.

Prof. Vershynin’s methods come from various areas of mathematics and data science, including random matrix theory, geometric functional analysis, convex and discrete geometry, geometric combinatorics, high-dimensional statistics, information theory, learning theory, signal processing, theoretical computer science and numerical analysis.
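To make the first project more concrete, here is a minimal sketch of recovering the direction of a hidden vector from binary (yes-no) observations. It is an illustration only and is not presented as Prof. Vershynin's method. It assumes synthetic Gaussian measurement vectors and observations of the form y = sign(<a, x>), and uses a simple averaging estimator. The estimator never refers to the sign function, so it does not need to know the form of the non-linearity. All names (`estimate_direction`, `A`, `x_hat`) and parameter values are hypothetical.

```python
import math
import random


def estimate_direction(measurements, observations):
    """Average the observation-weighted measurement vectors, then normalize."""
    d = len(measurements[0])
    total = [0.0] * d
    for a, y in zip(measurements, observations):
        for j in range(d):
            total[j] += y * a[j]
    norm = math.sqrt(sum(t * t for t in total))
    return [t / norm for t in total]


rng = random.Random(0)  # fixed seed so the run is repeatable
d, n = 10, 2000         # dimension and number of observations (assumed values)

# Hidden unit vector (synthetic, for illustration only).
x = [rng.gauss(0, 1) for _ in range(d)]
x_norm = math.sqrt(sum(v * v for v in x))
x = [v / x_norm for v in x]

# Gaussian measurement vectors and the binary observations they produce.
A = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
y = [1 if sum(aj * xj for aj, xj in zip(a, x)) >= 0 else -1 for a in A]

x_hat = estimate_direction(A, y)

# Cosine similarity between the estimate and the hidden vector (1.0 = perfect).
cosine = sum(u * v for u, v in zip(x_hat, x))
print(round(cosine, 3))
```

Binary observations carry no information about the length of the hidden vector, so the sketch recovers only its direction, which is why the estimate is normalized to unit length.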

A method based on semidefinite programming reveals the structure of the social network of bottlenose dolphins (left) by enhancing two communities (right) [Le & Vershynin, 2015].

Long Nguyen

I am broadly interested in statistical inference, which is informally defined as the process of turning data into prediction and understanding. I like to work with richly structured data, such as those extracted from texts, images and other spatiotemporal signals. In recent years I have gravitated toward a field in statistics known as Bayesian nonparametrics, which provides a fertile and powerful mathematical framework for the development of many computational and statistical modeling ideas. My motivation for all this came originally from an early interest in machine learning, which continues to be a major source of research interest. A primary focus of my group’s research in machine learning is to develop more effective inference algorithms using stochastic, variational and geometric viewpoints.

Yves Atchade

My current research explores the possibilities and limits of Markov Chain Monte Carlo (MCMC) methods in dealing with posterior or quasi-posterior distributions that arise from high-dimensional Bayesian (or quasi-Bayesian) inference in regression and graphical models. I am also interested in optimization, in particular whether (and how) stochastic methods can help tackle large-scale optimization problems of interest in statistics. I also have interests in the use of remote sensing data to study social and environmental issues in Africa.