October 11, 2017
RACKHAM BUILDING, 915 E. WASHINGTON ST., ANN ARBOR
In the lobby outside Rackham Auditorium.
Eric Michielssen, Associate Vice President, Advanced Research Computing
Eric Michielssen is the Associate Vice President, Advanced Research Computing, the Louise Ganiard Johnson Professor of Engineering, and Professor of Electrical Engineering and Computer Science, U-M College of Engineering.
Al Hero and Brian Athey, MIDAS Co-Directors
8:45 a.m. - Daniela Witten, Associate Professor of Statistics and Biostatistics, University of Washington
Title: Statistical Methods for Problems in Biology
Abstract: As the pace and scale of data collection continue to increase across all areas of biology, there is a growing need for effective and principled statistical methods for the analysis of the resulting data. In this talk, I’ll describe two ongoing projects to help fill this gap.
First, calcium imaging data is transforming the field of neuroscience by making it possible to assay the activities of large numbers of neurons simultaneously. For each neuron, the resulting “fluorescence trace” can be seen as a noisy surrogate of its spikes over time. In order to deconvolve a fluorescence trace into the underlying spike times, we consider an auto-regressive model for calcium dynamics. This leads naturally to a seemingly intractable optimization problem. I will show that it is in fact possible to efficiently solve this optimization problem for the global optimum, leading to substantial improvements over competing approaches.
Second, across many areas of biology, it is becoming increasingly common to collect “multi-view data”: that is, data in which multiple data types (e.g. gene expression, DNA sequence, clinical measurements) have been measured on a single set of observations (e.g. patients). We will consider the following question: given a set of n observations with measurements on L data types, can a single clustering of the n observations be defined on all L data types, or does each data type have its own clustering of the observations? To answer this question, I will introduce a general framework for modeling multi-view data, as well as hypothesis tests that can be used in order to characterize the extent to which the clusterings on each of the L data types are the same or different.
Abstract: Function words include pronouns, prepositions, articles, and other common but almost-invisible words in most languages. Their use often signals speakers’ relationships with their audience, their subject matter, and how the speakers think and feel about themselves. Multiple studies find function words can reveal personality, emotional state, deception, status, and thinking styles. Implications for research in medicine, marketing, law, education, literature, and other disciplines are discussed. Oh yes, language use is also relevant to recent and historical political trends.
Team members from MIDAS-supported research projects will present:
- Anna Gilbert, Michigan Center for Single-Cell Genomic Data Analytics
- Carol Flannagan, Building a Transportation Data Ecosystem
- Rada Mihalcea, LEAP: Analytics for Learners As People
- Trivellore Raghunathan, A Social Science Collaboration for Research on Communication and Learning Based upon Big Data
- Srijan Sen, Identifying Real-Time Data Predictors of Stress and Depression Using Mobile Technology
Box lunches are available for those who made a selection during the registration process.
The poster session includes data science research presentations from students, faculty and staff of the University of Michigan.
Students participating in the poster session will compete for eleven (11) awards, which carry cash prizes. Awards will be announced at 5:30 p.m. in Rackham Auditorium. Winning posters will be displayed at Weiser Hall during the Open House and Reception (6 p.m.).
Awards will be made for:
- Most Innovative Use of Data
- Most Likely Societal Impact
- Most Interesting Methodological Advances
- Most Likely Transformative Scientific Impact
- Most Likely Health Impact
- Best Overall
1:30 p.m. - MIDAS-UMOR-Dissonance Keynote: Cathy O'Neil, data scientist and author of “Weapons of Math Destruction”
Abstract: We live in the age of the algorithm. Increasingly, the decisions that affect our lives—where we go to school, whether we get a car loan, how much we pay for health insurance—are being made not by humans, but by mathematical models. In theory, this should lead to greater fairness: Everyone is judged according to the same rules, and bias is eliminated.
But as Cathy O’Neil reveals, the opposite is true. The models being used today are opaque, unregulated, and uncontestable, even when they’re wrong. Most troubling, they reinforce discrimination: If a poor student can’t get a loan because a lending model deems him too risky (by virtue of his zip code), he’s then cut off from the kind of education that could pull him out of poverty, and a vicious spiral ensues. Models are propping up the lucky and punishing the downtrodden, creating a “toxic cocktail for democracy.” Welcome to the dark side of Big Data.
Tracing the arc of a person’s life, O’Neil exposes the black box models that shape our future, both as individuals and as a society. These “weapons of math destruction” score teachers and students, sort résumés, grant (or deny) loans, evaluate workers, target voters, set parole, and monitor our health.
O’Neil calls on modelers to take more responsibility for their algorithms and on policy makers to regulate their use. But in the end, it’s up to us to become more savvy about the models that govern our lives. This important book empowers us to ask the tough questions, uncover the truth, and demand change.
2:30 p.m. - Francesca Dominici, Professor of Biostatistics, Harvard T.H. Chan School of Public Health
Title: Can Data Science Save the Environment?
Abstract: What if I told you I had bulletproof evidence of a serious threat to American national security – a terrorist attack in which a jumbo jet will be hijacked and crashed every 12 days? Thousands will continue to die unless we act now. The accuracy of my numbers is more conclusive than any CIA or FBI intelligence on international or domestic terrorist activity. This is the question before us today – but the threat doesn’t come from terrorists. The threat comes from climate change and air pollution. We have developed an artificial neural network model that uses on-the-ground air-monitoring data and satellite-based measurements to estimate daily pollution levels across the continental U.S., breaking the country up into 1-square-kilometer zones. We have paired that information with health data contained in Medicare claims records from the last 12 years, covering 97% of the population ages 65 or older. We have developed statistical methods and computationally efficient algorithms for the analysis of over 460 million health records. Our research shows that air pollution is killing 12,000 senior citizens prematurely each year. That’s the equivalent of a jumbo jet crashing every 12 days. This data science platform is telling us that federal limits on the nation’s most widespread air pollutants are not stringent enough.
Abstract: Internet connectivity has fundamentally changed our lives. The data we collect and analyze presents new opportunities and has the potential to transform (and is transforming) various application domains. However, it is also clear that the persistence of connectivity has created a number of challenges. My claim is that many of those challenges can be attributed to the fact that technological advancement has often occurred in computer science disciplinary silos. Stronger collaboration between the computer science research community and all other disciplines will help mitigate these challenges. We have integrated the principles of interdisciplinarity at ASU’s Global Security Initiative – both from perspectives of organizational design and research approaches.
Representatives from four companies will discuss data science opportunities and challenges facing their companies and their industries.
- Didi Chuxing: Henry Liu
- Mighty AI: Sheikh Shuvo
- Northrop Grumman: Brian Letort
- TD Ameritrade: Beaumont Vance
The symposium reception will take place in the new MIDAS location in Weiser Hall at 500 Church Street, suite 600. Light refreshments will be provided.