October 11, 2017

7:30 a.m. - Check-in and Coffee

In the lobby outside Rackham Auditorium.

8:30 a.m. - Welcome

Eric Michielssen, Associate Vice President, Advanced Research Computing

Eric Michielssen is the Associate Vice President, Advanced Research Computing, the Louise Ganiard Johnson Professor of Engineering, and Professor of Electrical Engineering and Computer Science, U-M College of Engineering.

Al Hero and Brian Athey, MIDAS Co-Directors

Brian Athey is the Michael A. Savageau Collegiate Professor and Chair of the Department of Computational Medicine and Bioinformatics, and Professor of Psychiatry and Internal Medicine. Al Hero is the John H. Holland Distinguished University Professor of Electrical Engineering and Computer Science, R. Jamison and Betty Williams Professor of Engineering, Professor of Biomedical Engineering, and Professor of Statistics.

8:45 a.m. - Daniela Witten, Associate Professor of Statistics and Biostatistics, University of Washington

Title: Statistical Methods for Problems in Biology

Abstract: As the pace and scale of data collection continue to increase across all areas of biology, there is a growing need for effective and principled statistical methods for the analysis of the resulting data. In this talk, I’ll describe two ongoing projects to help fill this gap.

First, calcium imaging data is transforming the field of neuroscience by making it possible to assay the activities of large numbers of neurons simultaneously. For each neuron, the resulting “fluorescence trace” can be seen as a noisy surrogate of its spikes over time. In order to deconvolve a fluorescence trace into the underlying spike times, we consider an auto-regressive model for calcium dynamics. This leads naturally to a seemingly intractable optimization problem. I will show that it is in fact possible to efficiently solve this optimization problem for the global optimum, leading to substantial improvements over competing approaches.

Second, across many areas of biology, it is becoming increasingly common to collect “multi-view data”: that is, data in which multiple data types (e.g. gene expression, DNA sequence, clinical measurements) have been measured on a single set of observations (e.g. patients). We will consider the following question: given a set of n observations with measurements on L data types, can a single clustering of the n observations be defined on all L data types, or does each data type have its own clustering of the observations? To answer this question, I will introduce a general framework for modeling multi-view data, as well as hypothesis tests that can be used in order to characterize the extent to which the clusterings on each of the L data types are the same or different.

9:45 a.m. - James Pennebaker, Professor of Psychology, University of Texas at Austin

Title: Analyzing Words to Understand People

Abstract: Function words include pronouns, prepositions, articles, and other common but almost-invisible words in most languages. Their use often signals speakers’ relationships with their audience, their subject matter, and how the speakers think and feel about themselves. Multiple studies find function words can reveal personality, emotional state, deception, status, and thinking styles. Implications for research in medicine, marketing, law, education, literature, and other disciplines are discussed. Oh yes, language use is also relevant to recent and historical political trends.


10:45 a.m. - MIDAS Research Initiatives Panel

Team members from MIDAS-supported research projects will present:

  • Anna Gilbert, Michigan Center for Single-Cell Genomic Data Analytics
  • Carol Flannagan, Building a Transportation Data Ecosystem
  • Rada Mihalcea, LEAP: Analytics for Learners As People
  • Trivellore Raghunathan, A Social Science Collaboration for Research on Communication and Learning Based upon Big Data
  • Srijan Sen, Identifying Real-Time Data Predictors of Stress and Depression Using Mobile Technology

Noon - Lunch & Poster Session @ Michigan League

Box lunches are available for those who made a selection during the registration process.

The poster session includes data science research presentations from students, faculty and staff of the University of Michigan.

Students participating in the poster session will compete for eleven (11) awards, which carry cash prizes. Awards will be announced at 5:30 p.m. in Rackham Auditorium. Winning posters will be displayed at Weiser Hall during the Open House and Reception (6:00 p.m.).

Awards will be made for:

  • Most Innovative Use of Data
  • Most Likely Societal Impact
  • Most Interesting Methodological Advances
  • Most Likely Transformative Scientific Impact
  • Most Likely Health Impact
  • Best Overall

1:30 p.m. - MIDAS-UMOR-Dissonance Keynote: Cathy O'Neil, data scientist and author of “Weapons of Math Destruction”

Title: Weapons of Math Destruction

Abstract: We live in the age of the algorithm. Increasingly, the decisions that affect our lives—where we go to school, whether we get a car loan, how much we pay for health insurance—are being made not by humans, but by mathematical models. In theory, this should lead to greater fairness: Everyone is judged according to the same rules, and bias is eliminated.

But as Cathy O’Neil reveals, the opposite is true. The models being used today are opaque, unregulated, and uncontestable, even when they’re wrong. Most troubling, they reinforce discrimination: If a poor student can’t get a loan because a lending model deems him too risky (by virtue of his zip code), he’s then cut off from the kind of education that could pull him out of poverty, and a vicious spiral ensues. Models are propping up the lucky and punishing the downtrodden, creating a “toxic cocktail for democracy.” Welcome to the dark side of Big Data.

Tracing the arc of a person’s life, O’Neil exposes the black box models that shape our future, both as individuals and as a society. These “weapons of math destruction” score teachers and students, sort résumés, grant (or deny) loans, evaluate workers, target voters, set parole, and monitor our health.

O’Neil calls on modelers to take more responsibility for their algorithms and on policy makers to regulate their use. But in the end, it’s up to us to become more savvy about the models that govern our lives. This important book empowers us to ask the tough questions, uncover the truth, and demand change.

2:30 p.m. - Francesca Dominici, Professor of Biostatistics, Harvard T.H. Chan School of Public Health


Title: Can Data Science Save the Environment?

Abstract: What if I told you I had bulletproof evidence of a serious threat to American national security – a terrorist attack in which a jumbo jet will be hijacked and crashed every 12 days? Thousands will continue to die unless we act now. The accuracy of my numbers is more conclusive than any CIA or FBI intelligence on international or domestic terrorist activity. This is the question before us today – but the threat doesn’t come from terrorists. The threat comes from climate change and air pollution. We have developed an artificial neural network model that uses on-the-ground air-monitoring data and satellite-based measurements to estimate daily pollution levels across the continental U.S., breaking the country up into 1-square-kilometer zones. We have paired that information with health data contained in Medicare claims records from the last 12 years, covering 97% of the population ages 65 or older. We have developed statistical methods and computationally efficient algorithms for the analysis of over 460 million health records. Our research shows that air pollution is killing 12,000 senior citizens prematurely each year. That’s the equivalent of a jumbo jet crashing every 12 days. This data science platform is telling us that federal limits on the nation’s most widespread air pollutants are not stringent enough.

3:30 p.m. - Nadya T. Bliss, Director, Global Security Initiative, Arizona State University

Title: Computer Science and _____: Better Together; The Value of an Interdisciplinary Approach

Abstract: Internet connectivity has fundamentally changed our lives. The data we collect and analyze presents new opportunities and has the potential to transform (and is transforming) various application domains. However, it is also clear that the persistence of connectivity has created a number of challenges. My claim is that many of those challenges can be attributed to the fact that technological advancement has often occurred in computer science disciplinary silos. Stronger collaboration between the computer science research community and all other disciplines will help mitigate these challenges. We have integrated the principles of interdisciplinarity at ASU’s Global Security Initiative, from the perspectives of both organizational design and research approaches.

4:45 p.m. - Challenges & Opportunities for Industry

Representatives from four companies will discuss data science opportunities and challenges facing their companies and their industries.

  • Didi Chuxing: Henry Liu
  • Mighty AI: Sheikh Shuvo
  • Northrop Grumman: Brian Letort
  • TD Ameritrade: Beaumont Vance

6:00 p.m. - Reception & MIDAS Open House @ Weiser Hall

The symposium reception will take place in the new MIDAS location in Weiser Hall at 500 Church Street, Suite 600. Light refreshments will be provided.