Jonathan Terhorst, PhD Candidate
Statistics, University of California at Berkeley
“Robust and Scalable Inference of Population History and Selection
from Hundreds of Whole Genomes”
Abstract: Demographic inference refers to the problem of inferring past population events (migrations, admixture, expansions, etc.) from patterns of mutations in sampled DNA. Apart from the intrinsic appeal of understanding the origins of our species, this type of analysis is useful for forming a null model of human evolution, departures from which signal the presence of natural selection, population structure, and other interesting phenomena.
In this talk I will discuss recent statistical and computational innovations that enable us to infer demography using modern data sets consisting of hundreds of whole-genome sequences obtained from populations all over the world. These include momi, a new software package for stable and rapid computation of the expected sample frequency spectrum (SFS) under complex demographic scenarios involving numerous diverged populations, as well as SMC++, a new probabilistic framework that couples the genealogical process for a given individual with allele frequency information from a large related panel. Using these tools, I will demonstrate how we can learn about human expansion in the last 12,000 years, understand the mysterious origins of ancient DNA samples, and estimate when Europeans acquired lighter skin and the ability to digest lactose. Finally, I will discuss some of the statistical aspects of these estimators, in particular an information-theoretic lower bound on the error rate of any SFS-based demographic inference procedure.
All relevant theory will be introduced during the talk; no prior knowledge of population genetics is assumed. Portions of this work are joint with Jack Kamm, Pier Palamara, and Yun Song.
Bio: I am a PhD student in the statistics department at UC Berkeley. I’m interested in statistical and population genetics, machine learning, and, more generally, in developing mathematical models and software to help fellow scientists understand their data.
Light refreshments will be served at 3:10 p.m. in room 1690.