Akbar Waljee


I use machine-learning techniques to implement decision support systems and tools that facilitate more personalized care for disease management and healthcare utilization, with the ultimate goal of delivering efficient, effective, and equitable therapy for chronic diseases. To test and advance these general principles, I have built operational programs that are guiding, and improving, patient care in costly and low-resource settings, including emerging countries.

Walter Dempsey


My research focuses on statistical methods for digital and mobile health. My current work involves three complementary research themes: (1) experimental design and data analytic methods to inform multi-stage decision making in health; (2) statistical modeling of complex longitudinal and survival data; and (3) statistical modeling of complex relational structures such as interaction networks. Current directions include (1) integration of sequential multiple assignment randomized trials (SMARTs) and micro-randomized trials (MRTs) and associated causal inference methods; (2) recurrent event analysis in the presence of high-frequency sensor data; and (3) temporal models for, community detection of, and link prediction using complex interaction data.
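To make the micro-randomized trial (MRT) idea concrete, here is a minimal sketch, not Dr. Dempsey's actual design: at each decision point a participant may or may not be available for treatment, and, when available, treatment is randomized with a fixed probability. All names and parameters (the 0.8 availability rate, the 0.5 randomization probability) are illustrative assumptions.

```python
import random

def micro_randomize(num_points, prob=0.5, seed=0):
    """A minimal MRT sketch: at each decision point, record whether the
    participant is available and, if so, randomize treatment with
    probability `prob`. Availability rate and seed are illustrative."""
    rng = random.Random(seed)
    schedule = []
    for t in range(num_points):
        available = rng.random() < 0.8   # e.g., not driving or sleeping
        treated = available and (rng.random() < prob)
        schedule.append({"t": t, "available": available, "treated": treated})
    return schedule

schedule = micro_randomize(10)
```

In a real MRT, the per-point randomizations yield data for estimating time-varying proximal treatment effects; the key property preserved here is that treatment can only occur when the participant is available.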

Halil Bisgin


My research covers a wide range of topics, from computational social sciences to bioinformatics, where I do pattern recognition, perform data analysis, and build prediction models. At the core of my effort lie machine learning methods, with which I have been addressing problems related to social networks, opinion mining, biomarker discovery, pharmacovigilance, drug repositioning, security analytics, genomics, food contamination, and concussion recovery. I am particularly interested in, and eager to collaborate on, the cybersecurity aspects of social media analytics, including but not limited to misinformation, bots, and fake news. In addition, I am still pursuing opportunities in bioinformatics, especially next-generation sequencing analysis, which can also be leveraged for phenotype prediction using machine learning methods.

A typical pipeline for developing and evaluating prediction models to identify malicious Android mobile apps in the market
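A toy sketch of the kind of prediction model such a pipeline might produce: score apps by the permissions they request, using smoothed per-permission log-odds learned from a labeled set (a naive-Bayes-style scorer). The permission names, the tiny training set, and the scoring scheme are all illustrative assumptions, not the actual pipeline; a real system would use far richer features and held-out evaluation.

```python
from math import log

# Hypothetical labeled apps: (requested permissions, is_malicious).
train = [
    ({"SEND_SMS", "READ_CONTACTS", "INTERNET"}, 1),
    ({"SEND_SMS", "INTERNET"}, 1),
    ({"INTERNET", "CAMERA"}, 0),
    ({"INTERNET"}, 0),
]

def fit_log_odds(data, perms, alpha=1.0):
    """Per-permission log-odds of maliciousness with Laplace smoothing."""
    pos = [p for p, y in data if y == 1]
    neg = [p for p, y in data if y == 0]
    scores = {}
    for perm in perms:
        p1 = (sum(perm in s for s in pos) + alpha) / (len(pos) + 2 * alpha)
        p0 = (sum(perm in s for s in neg) + alpha) / (len(neg) + 2 * alpha)
        scores[perm] = log(p1 / p0)
    return scores

perms = set().union(*(p for p, _ in train))
weights = fit_log_odds(train, perms)

def score(app_perms):
    """Higher score = more suspicious permission profile."""
    return sum(weights.get(p, 0.0) for p in app_perms)
```

On this toy data, an app requesting SMS and contacts access scores higher than one requesting only internet access, since the latter permission appears equally in both classes.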

Sunghee Lee


My research focuses on issues in data collection with hard-to-reach populations. In particular, I examine 1) nontraditional sampling approaches for minority or stigmatized populations and their statistical properties and 2) measurement error and comparability issues for racial, ethnic, and linguistic minorities, which also have implications for cross-cultural research and survey methodology. Most recently, my research has been dedicated to respondent-driven sampling, which uses existing social networks to recruit participants in both face-to-face and Web data collection settings. I plan to expand my research scope by examining representation issues focusing on racial/ethnic minority groups in the U.S. in the era of big data.
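The recruitment mechanics of respondent-driven sampling can be sketched in a few lines: seed participants receive a limited number of coupons and pass them to peers in their social network, who in turn recruit further. The network, seed choice, and coupon limit below are all hypothetical; real RDS analysis also weights respondents by network degree, which this sketch omits.

```python
import random

def rds_sample(network, seeds, coupons=3, max_size=8, seed=0):
    """Recruit through a social network: each respondent passes up to
    `coupons` coupons to peers who have not yet been sampled."""
    rng = random.Random(seed)
    sampled, queue = list(seeds), list(seeds)
    while queue and len(sampled) < max_size:
        person = queue.pop(0)
        peers = [p for p in network.get(person, []) if p not in sampled]
        rng.shuffle(peers)                 # recruitment order is random
        for p in peers[:coupons]:
            if len(sampled) >= max_size:
                break
            sampled.append(p)
            queue.append(p)
    return sampled

# Hypothetical social network as an adjacency list.
net = {"A": ["B", "C", "D"], "B": ["A", "E"], "C": ["F"],
       "D": ["G"], "E": [], "F": [], "G": ["H"]}
wave = rds_sample(net, ["A"])
```

Because recruitment follows network ties, the sample composition depends on the seeds and the network structure, which is exactly why the statistical properties of such samples merit study.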

Christopher Fariss


My core research focuses on the politics and measurement of human rights, discrimination, violence, and repression. I use computational methods to understand why governments around the world torture, maim, and kill individuals within their jurisdiction and the processes monitors use to observe and document these abuses. Other projects cover a broad array of themes but share a focus on computationally intensive methods and research design. These methodological tools, essential for analyzing data at massive scale, open up new insights into the micro-foundations of state repression and the politics of measurement.

People rely more on strong ties for job help in countries with greater inequality. Coefficients from 55 regressions of job transmission on tie strength are compared to measures of inequality (Gini coefficient), mean income per capita, and population, all measured in 2013. Gray lines indicate 95% confidence regions from 1000 simulated regressions that incorporate uncertainty in the country-level regressions (see below for more details). In each simulated regression we draw each country point from the distribution of regression coefficients implied by the estimate and standard error for that country and measure of tie strength. P values indicate the simulated probability that there is no relationship between tie strength and the other variable. Laura K. Gee, Jason J. Jones, Christopher J. Fariss, Moira Burke, and James H. Fowler. “The Paradox of Weak Ties in 55 Countries.” Journal of Economic Behavior & Organization 133:362–372 (January 2017). DOI: 10.1016/j.jebo.2016.12.004
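The simulation procedure described in the caption can be sketched as follows: redraw each country's coefficient from the normal distribution implied by its estimate and standard error, re-run the cross-country regression, and report the fraction of simulations in which the slope flips sign. The country-level numbers below are invented placeholders, not the paper's data.

```python
import random

# Hypothetical country-level inputs: (tie-strength coefficient, SE, Gini).
countries = [(0.10, 0.03, 30.0), (0.20, 0.05, 38.0),
             (0.15, 0.04, 34.0), (0.30, 0.06, 45.0),
             (0.25, 0.05, 42.0)]

def ols_slope(xs, ys):
    """Simple-regression slope of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def simulated_pvalue(data, n_sims=1000, seed=0):
    """Propagate each country's coefficient uncertainty: redraw every
    point from N(estimate, se) and count slopes with the opposite sign
    to the observed slope."""
    rng = random.Random(seed)
    ginis = [g for _, _, g in data]
    observed = ols_slope(ginis, [b for b, _, _ in data])
    opposite = 0
    for _ in range(n_sims):
        draws = [rng.gauss(b, se) for b, se, _ in data]
        if ols_slope(ginis, draws) * observed <= 0:
            opposite += 1
    return opposite / n_sims

p = simulated_pvalue(countries)
```

With the tight toy inputs above the simulated p-value is near zero; with noisier country estimates the sign would flip more often and the p-value would rise.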

Fadhl Alakwaa


Alzheimer’s disease (AD) afflicts more than 5 million people in the United States and is gaining widespread attention. Over 400 clinical trials were run between 2002 and 2012, but only one resulted in a marketable product. One of the most common explanations for these failures is the treatment of Alzheimer’s as a homogeneous disease. In many cases, individuals within the same group respond to a drug in different ways. Given the highly complex nature of AD, the likelihood of identifying a single drug that provides meaningful benefits to every patient is minimal. There is a pressing and unmet need to develop personalized treatment plans based on each patient’s omics profile.
To solve this problem, my research focus is developing a data-driven computational approach to predict drug responses for individuals with AD. This approach is based on patients’ metabolomics and transcriptomics profiles and publicly available drug databases. Transcriptomics and metabolomics are increasingly being used to corroborate our interpretation of the pathophysiological pathways underlying AD, and integrating the two will guide the development of precision medicine for AD. In particular, I use the metabolome and transcriptome profiles of Alzheimer’s patients from the ADNI database. For each patient, I identify dysregulated pathways from the metabolome profile and a patient-specific gene regulatory network from the transcriptome profile. My preliminary data suggest that each patient with Alzheimer’s has distinct dysregulated pathways and a distinct gene regulatory network. Drug selection based on a patient’s specific metabolome and transcriptome profiles offers a tremendous opportunity for more targeted and effective disease treatment, and it represents a critical innovation toward personalized medicine for AD. My long-term goal is to become an independent investigator in computational biology with a focus on translating omics data into bedside applications. The overall objective of my research is to combine metabolomics and gene expression data with drug data using advanced machine learning algorithms to personalize medicine for AD.

Andrei Boutyline


Cultural systems are fundamentally structural phenomena, defined by patterns of relations between elements of public representations and individual behaviors and cognitions. However, because such systems are difficult to capture with traditional empirical approaches, they usually remain understudied. In my work, I draw on network analysis, statistics, and computer science to create novel approaches to such analyses, and on cognitive science to theorize the objects of these investigations. Broader questions that interest me are: how are different cultural elements interrelated with one another? What is the relationship between public cultural representations and individual cognition and behavior? And how can we capture the structure of these interrelationships across large social and time scales? Methodologically, I am currently focused on developing applications of word embeddings and other natural language processing methods to sociological questions about cultural change.

Changing gender connotations of intelligence and studiousness throughout the latter half of the 20th century, measured using word embeddings. Intelligence gained a masculine gender coding just as studiousness gained a feminine one. Scores are z-scored average cosine similarities between sets of keywords and a gender dimension. Data source: Corpus of Historical American English.
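The measurement the caption describes, cosine similarity between keyword sets and a gender dimension, can be sketched with toy vectors. Real analyses use embeddings trained on large corpora; the tiny 3-dimensional vectors and keyword choices below are stand-in assumptions purely to show the arithmetic.

```python
from math import sqrt

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def mean_vec(vecs):
    n = len(vecs)
    return [sum(v[i] for v in vecs) / n for i in range(len(vecs[0]))]

def gender_dimension(male_vecs, female_vecs):
    """Gender dimension: average male vector minus average female vector."""
    m, f = mean_vec(male_vecs), mean_vec(female_vecs)
    return [a - b for a, b in zip(m, f)]

def keyword_score(keyword_vecs, dim):
    """Average cosine similarity of a keyword set to the dimension;
    positive = masculine-leaning, negative = feminine-leaning."""
    return sum(cosine(v, dim) for v in keyword_vecs) / len(keyword_vecs)

# Toy 3-d "embeddings" standing in for vectors trained on one decade of text.
male = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]]
female = [[-1.0, 0.1, 0.0], [-0.9, 0.0, 0.1]]
dim = gender_dimension(male, female)
smart = [[0.5, 0.3, 0.2]]   # hypothetical "intelligence" keyword vector
score = keyword_score(smart, dim)
```

Repeating this per decade, then z-scoring the scores across decades, yields trajectories like those plotted in the figure.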

Lili Zhao


I have broad interests and expertise in developing statistical methodology and applying it in biomedical research. I have adapted methodologies, including Bayesian data analysis, categorical data analysis, generalized linear models, longitudinal data analysis, multivariate analysis, RNA-Seq data analysis, survival data analysis, and machine learning methods, in response to the unique needs of individual studies and objectives, without compromising the integrity of the research and results. Two recently developed methods are described below:
1) A risk prediction model for a survival outcome using predictors of a large dimension
I have developed a simple, fast, yet sufficiently flexible statistical method to estimate the updated risk of renal disease over time using high-dimensional longitudinal biomarkers. The goal is to utilize all sources of high-dimensional data (e.g., routine clinical features and urine and serum markers measured at baseline and at all follow-up time points) to efficiently and accurately estimate the updated ESRD risk.
2) A safety mining tool for vaccine safety study
I developed an algorithm for vaccine safety surveillance that incorporates an adverse event ontology. Multiple adverse events may individually be rare enough to go undetected, but if they are related, they can borrow strength from each other to increase the chance of being flagged. Furthermore, borrowing strength induces shrinkage of related AEs, thereby also reducing headline-grabbing false positives.
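The borrowing-strength idea can be illustrated with a deliberately simplified shrinkage sketch, not the actual surveillance algorithm: each adverse event's raw signal is pulled toward the mean of its ontology group, so related rare events are lifted while isolated spikes are damped. The group structure, rates, and fixed shrinkage weight below are all invented for illustration; a Bayesian hierarchical model would estimate the weight from the data.

```python
def shrink_to_group(rates, weight=0.5):
    """Shrink each AE's raw signal toward its ontology-group mean.
    `weight` is the (here fixed, illustrative) share kept from the raw value."""
    out = {}
    for group, values in rates.items():
        mean = sum(values.values()) / len(values)
        out[group] = {ae: weight * v + (1 - weight) * mean
                      for ae, v in values.items()}
    return out

# Hypothetical observed log relative-reporting rates, grouped by ontology.
raw = {"gastrointestinal": {"nausea": 0.2, "vomiting": 0.3, "diarrhea": 0.4},
       "neurological": {"headache": 1.5, "dizziness": 0.0}}
shrunk = shrink_to_group(raw)
```

Note the two effects described above: the large isolated "headache" signal is reduced (fewer headline-grabbing false positives), while "dizziness" is pulled up by its noisy but related neighbor.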

Neil Carter


Carter’s research combines quantitative, theoretical, and field approaches to address challenging local-to-global wildlife conservation issues in the Anthropocene. His work includes projects on endangered species conservation in human-dominated areas of Nepal, post-war recovery of wildlife in Mozambique, human-wildlife coexistence in the American West, and the effects of artificial light and human-made noise on wildlife habitat across the contiguous US. Research methods focus on: (1) spatializing both human and wildlife processes, (2) probabilistic methods to infer human-wildlife interactions, (3) simulation models of coupled natural-human systems, and (4) forecasting and decision-support tools.