Bloomberg Conference Accepts Both MDST Papers!

By | MDSTPosts | No Comments

Earlier this summer, MDST submitted two papers to the Bloomberg Data for Good Exchange conference: one on our work on the Flint Water Crisis and one on our work with the University Musical Society. It is my great pleasure to announce that both papers have been accepted for presentation at the conference in New York on September 25th!

Needless to say, we’re all very excited. 🎉

MDST Faculty Advisor Jacob Abernethy Interviewed for Machine Learning Podcast!

Our very own Jacob Abernethy was recently interviewed on the popular machine learning podcast, Talking Machines. Among other things, Jake was asked about his experiences working with the trove of municipal data available in Flint, his path to research at the University of Michigan, and our work with Google and UM-Flint.

You can find a link to the interview here. Fun fact: Talking Machines is produced by Katherine Gorman, a UM alumna!

MDST Submits Two Papers to Bloomberg Conference

While we are known for our participation in structured prediction challenges, MDST has also picked up two community projects in the last year. MDST members of all experience levels got to participate in both our efforts in Flint and our work with UMS’s ticket purchase data. Around the time that we hit milestones in both projects, news of the Bloomberg Data for Good Exchange call for papers reached some members of MDST, and we decided to take a shot.

The results of our foray into volunteer, remote, academic paper collaboration can be found below in the form of two successfully written MDST papers! We’re incredibly proud of the results and even prouder of our membership, who worked so hard to produce such quality work.

MDST Partners with UM-Flint & Google.org to Aid Locals in Flint Water Crisis

The Michigan Data Science Team is excited to have partnered with Google and the University of Michigan-Flint to engineer a data platform and accompanying app as a part of our continued efforts to help the community of Flint. This app will provide users with information regarding key public services, such as the locations of water bottle distribution centers and instructions to request new water testing kits. Users will also be able to report concerns about the water quality at their location, and access our predictive model, which flags homes that are potentially at high risk of lead contamination.

Google.org is providing the University of Michigan-Flint a grant of $150,000 to build the platform and accompanying app. They are also providing access to several Google engineering consultants, who will aid in producing interactive visualizations and oversee the app’s user interface design. MDST has created a multidisciplinary engineering team to oversee and manage the creation of our predictive model and data platform.

We will continue our efforts to ask and answer the data-related questions surrounding this crisis in order to provide as much value as we can to the people of Flint. We are incredibly grateful for the support from Google and for the chance to collaborate with our friends and fellow researchers at the University of Michigan-Flint campus.

FARS Visualization Challenge

Last week, we held the FARS Dataset Visualization Challenge, where teams were tasked with visualizing more than a decade of fatal traffic accident records to address the question – “What causes drunk driving accidents?”

First prize went to Team Bidiu (Chengyu Dai, Cyrus Anderson, Cupjin Huang, and Wenbo Shen), whose presentation addressed three questions: who is driving drunk, where are they driving, and when do fatal accidents occur? For their first-place finish, each member of Team Bidiu will receive a $25 gift card to Amazon.com! You can view Team Bidiu’s presentation and source code at the team’s GitHub page.

FARS Dataset Challenge Kickoff

It’s my great pleasure to announce the next MDST competition! We are very fortunate to be partnering with the Michigan Institute for Data Science (or ‘MIDAS’) for this event. We will be holding the kickoff meeting this Thursday at 5:00pm in 3150 DOW. Through this partnership, we’ve been able to obtain a particularly interesting dataset: records of every fatal car accident reported in the United States between 2003 and 2014, drawn from the Fatality Analysis Reporting System, or FARS. The challenge will be to predict whether or not a drunk driver was involved in each accident.

More information about logistics, prizes, and the dataset itself will be given at the kickoff ceremony this Thursday. Additionally, we will be awarding prizes to the winners of our last competition, the RateMyProfessor challenge, so if you won a prize, please show up to claim it.

If you have any further questions, feel free to email me at mdst-coms@umich.edu.

Rate My Professor Challenge Winners!

Placement & Prizes

(Final Leaderboard Link)

1st: DaBrain – Guangsha Shi, Sean Ma, Sheng Yang

$200 Amazon Gift Card + MDST T-shirts

2nd: The Data Miners – Alexander Zaitzeff, Ryan Sandberg

$100 Amazon Gift Card + MDST T-shirts

3rd: Arya Farahi

MDST T-shirt

Strategy Discussion

Congratulations are in order to DaBrain, who in the last week of the competition rocketed up the leaderboard, displacing the incumbent finalists and finishing the competition in first place! DaBrain employed the only neural network algorithm in the competition, leveraging the power of the LSTM architecture for Recurrent Neural Networks against this document-based dataset. LSTM-RNNs have seen great experimental results on NLP tasks in recent years, making this choice of algorithm particularly exciting for the Rate My Professor Challenge.

Our second place finalists, The Data Miners, led the leaderboard for many weeks after the beginning of the challenge. The Data Miners managed an impressive number of submissions, more than double that of the next most frequently submitting team. In the end, an ensembling method combining ridge regression, random forest regression, and gradient boosting (as well as some hacky tricks beyond explanation) allowed The Data Miners to seize second place!

Our third place finisher was Arya Farahi, a PhD student in Physics. His final approach was well reasoned, relying on a small family of simple, easily interpreted predictors. Employing ridge, lasso, and logistic regressions, as well as a measure of ‘happiness’ described in an academic paper (Dodds et al., 2015), Arya was able to build a highly transparent, robust model with very little additional tinkering.
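For readers curious what "ensembling" can look like in practice, here is a minimal sketch of one common form, weighted averaging of predictions from several models. This is not The Data Miners' actual method, which is not described in detail above; the `blend` function, the variable names, and the numbers are all made up for illustration.

```python
def blend(predictions, weights=None):
    """Combine per-model prediction lists into one weighted-average list.

    predictions: list of lists, one inner list of predicted ratings per model.
    weights: optional list with one non-negative weight per model
             (defaults to equal weights).
    """
    if not predictions:
        raise ValueError("need at least one model's predictions")
    n_items = len(predictions[0])
    if any(len(p) != n_items for p in predictions):
        raise ValueError("all models must predict the same number of items")
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Weighted mean of the models' predictions for each item.
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(n_items)
    ]


# Hypothetical predictions from three already-trained models (made-up numbers).
ridge_preds = [3.0, 4.0]
forest_preds = [4.0, 2.0]
boosting_preds = [5.0, 3.0]

print(blend([ridge_preds, forest_preds, boosting_preds]))
print(blend([ridge_preds, forest_preds, boosting_preds], weights=[1.0, 1.0, 2.0]))
```

In a real competition the weights would typically be chosen on held-out data rather than by hand.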

Final Words

The MDST administration would like to thank everyone who participated. We deeply appreciate all the time and energy our members put into these competitions. We’ve learned a lot from our first internal competition and we hope you all did as well!

As always, send us your questions, comments, concerns, and suggestions to mdst-exec@umich.edu.

MDST Rate My Professor Challenge Kickoff

The Michigan Data Science Team will be kicking off our second data science competition of the year next week! At this meeting, we will be introducing the competition dataset and giving a live demo of a script to help get you started. You do not need to have participated in the last challenge to compete! Unlike the Springleaf challenge, this will be an entirely internal competition, and will be especially geared towards those new to data science.

  • MDST RateMyProfessors Challenge Kickoff Meeting
  • Date: Nov. 5th (Thursday)
  • Time: 5 – 6 PM
  • Location: Beyster 1690

This challenge will feature ratings of professors, written by university students. Your task is to predict the numerical rating a student assigned to a professor from the text of that student's review. To be successful, your solutions will need to extract interesting, useful features from this text and apply them effectively. The best solutions will draw inspiration from research in natural language processing, sentiment analysis, and deep learning.
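To make the task concrete, here is a minimal sketch of a text-to-rating baseline using only the Python standard library. It is not the starter script from the kickoff meeting; the function names and the toy reviews are invented for illustration, and the "average rating per word" approach is just one simple way to turn text features into a prediction.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a review and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(text):
    """Turn a review into word-count features."""
    return Counter(tokenize(text))


def fit_word_averages(reviews, ratings):
    """Record the mean rating of the training reviews each word appears in."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text, rating in zip(reviews, ratings):
        for word in set(tokenize(text)):
            totals[word] += rating
            counts[word] += 1
    word_means = {word: totals[word] / counts[word] for word in totals}
    global_mean = sum(ratings) / len(ratings)
    return word_means, global_mean


def predict(text, word_means, global_mean):
    """Average the per-word mean ratings; fall back to the global mean."""
    known = [word_means[w] for w in tokenize(text) if w in word_means]
    if not known:
        return global_mean
    return sum(known) / len(known)


# Made-up toy data, purely for illustration.
train_reviews = [
    "great lectures and clear notes",
    "boring lectures and unfair exams",
]
train_ratings = [5.0, 1.0]

word_means, global_mean = fit_word_averages(train_reviews, train_ratings)
print(bag_of_words("Great lectures, great notes"))
print(predict("clear and great", word_means, global_mean))
```

A stronger solution would replace the per-word averages with a learned model over these features, but the overall shape (tokenize, featurize, fit, predict) stays the same.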

Competition Features:

  • Big prizes!
  • Weekly tutorials and starter scripts
  • Teammate matching
  • Data visualization challenge
  • Flux allocations for all competitors
  • Leaderboard hosted on Kaggle In Class

Springleaf Winners Announced

The Springleaf Marketing Response Challenge is now over! Thank you and congratulations to all the dedicated students who competed. In total, we had over 50 students and 20 teams participate in this competition.

1st Place: $200 Amazon Gift Card + T-Shirts

#34 – Cantseetherandomforestforthetrees

Alexander Zaitzeff and Jared Webb

2nd Place: $100 Amazon Gift Card + T-Shirts

#545 – Physteam

Arya Farahi and Anthony Kremin

3rd Place: T-Shirts

#552 – GGBrown

Xiang Li, Xinyu Tan, Tianpei Xie, and Jianming Sang

We gratefully acknowledge Soartech for funding the MDST Springleaf Challenge.