MIDAS organizes many events throughout the year specifically geared towards students’ technical skill development, job search and career preparation, and engagement with industry professionals in the data science field.
Research Projects
For MIDAS Affiliated Faculty:
If you are a faculty member and would like to submit a research project for which you are seeking student assistance or collaboration, please submit the form below:
The following faculty members are seeking student research assistants:
Murali Mani (Professor of Computer Science, University of Michigan, Flint)
Project Summary
Seeking students for 2 main projects:
 Secure query processing, which I discussed as part of the MIDAS faculty pitch. For this, I am looking for students with a background in cryptography and/or database systems/algorithms.
 Dataset search, where we are investigating which datasets could be explored further to answer a user question. For this, I am looking for students interested in database systems/algorithms/IR.
Responsibilities
Both projects will require literature analysis. Students will also be involved in algorithm development, prototype building, evaluation, and analysis.
Required Experience
Completion of an undergraduate course in data structures and algorithms is required. Additional coursework in database systems or IR (for the dataset search project) or cryptography (for the secure query processing project) would be helpful.
Compensation
Students may register for independent research credit.
Contact
mmani@umich.edu
Date Posted
10/1/2020
Ivo Dinov (Professor, Computational Medicine and Bioinformatics, University of Michigan)
Project Summary
Please see the SOCR Lab projects (computing, data science, health analytics, mathematical physics, bioinformatics, statistical inference):
https://socr.umich.edu/docs/uploads/2021/SOCR_MDP_2021_Projects.pdf
https://wiki.socr.umich.edu/index.php/Available_SOCR_Development_Projects
Responsibilities
Students will develop math models, build computational tools & apps, fit stat inference models, analyze data, and design AI/ML forecasting and prediction models.
Required Experience
Extremely strong motivation/self-drive, ability to work in a transdisciplinary team, deep scientific background.
Compensation
Students may register for independent research credit.
Contact
dinov@umich.edu
Date Posted
10/1/2020
Greg Rybarczyk (Associate Professor, University of Michigan-Flint)
Project Summary
The projects I am currently investigating include human travel behavior and attitudes when bicycling, walking, and utilizing “third spaces” pre- and post-COVID. I am also interested in forecasting bicycle-share use and bicycling crash rates in response to urban design. Other related projects currently under consideration include urban foraging behavior and public health, and senior citizen response to green/blue space during and after a pandemic.
Responsibilities
Students will assist with data collection/cleaning, survey development, analysis, mapping, literature reviews, and writing.
Required Experience
Graduate student standing, experience with Python programming, data science, crowdsourced data, public health modeling, predictive modeling, sentiment analysis, mixed methods, GIS, survey development, or spatial analysis.
Compensation
Hourly pay is available; students may register for independent research credit.
Contact
grybar@umich.edu
Date Posted
10/27/2020
Somangshu Mukherji (Assistant Professor, Music Theory, SMTD)
Project Summary
The Telemann Chorale Project – to develop models for automatically composing Baroque chorales, using ideas from music theory, generative linguistics, and computer science, and to prepare a new edition of Telemann’s 430 chorales as a result.
Responsibilities
Students will assist with developing new computational frameworks for music composition, natural language processing, and optical score recognition; analyzing the music theory and linguistics literature; and writing code.
Required Experience
Prior coursework in music theory and/or generative linguistics and an interest in computational modeling of music. Some experience with music-related coding tools such as music21 is preferable.
Compensation
Students may register for independent research credit.
Contact
somangsh@umich.edu
Date Posted
11/2/2020
Mathematics Research Institution Coding and Disambiguation
Background
Mathematical Reviews (MR) is a division of the American Mathematical Society. Since 1940, MR has been collecting data on the research literature in mathematics. From the beginning, institutional affiliations of authors have been collected, primarily as an aid in distinguishing authors with similar names. However, institutional identities are themselves susceptible to ambiguities, especially since organizations are identified down to the level of departments, not just universities or colleges. MR follows the guidelines of descriptive cataloging, meaning that the information from the paper or book is used, rather than matching department names to an existing list or authority.
Authors are not consistent in naming their own departments or institutions. It is possible for one department to be called by several names, even by the same author. At the same time, at other universities, there may be separate departments, with “Mathematics Department” being pure math and “Department of Mathematical Sciences” being applied math. Finally, political divisions such as countries are not stable, i.e., countries can split, merge, or just change names.
Dataset
The dataset consists of a table of 207,334 active Institution Codes. Codes generally follow the pattern ABBCCC, where A is one or more characters for the country, BB is two or more characters that represent the university or other larger organization, and CCC represents the unit at the level of a department. The dataset also includes inactive codes: 22,331 codes identified as invalid and 10,212 codes that have been retired. There are separate tables for locations (cities), states, and countries.
Research Prompts:
 Identify possible duplications in the set of institution codes.
 Devise a method for resolving duplications. Solution methods may include scripted querying of the database via MathSciNet, the user interface to the Mathematical Reviews Database.
 Create a database of the disambiguated codes that includes tables for countries, cities, primary institutions (university, college), and department-level units.
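As a starting point for the first prompt, the following is a minimal sketch of one way to flag candidate duplicates: normalize each unit name and group codes that share an institution and a normalized name. The record layout, the example codes, and the stopword list are all hypothetical and are not taken from the MR tables. The sketch only proposes candidates; it deliberately keeps “Mathematics Department” and “Department of Mathematical Sciences” apart, since the background above notes these can be distinct units.

```python
import re
from collections import defaultdict

# Hypothetical filler words to ignore when comparing unit names.
STOPWORDS = {"of", "the", "for", "and", "department", "dept"}


def normalize(name):
    """Reduce a unit name to a sorted tuple of lowercase content words."""
    words = re.findall(r"[a-z]+", name.lower())
    return tuple(sorted(w for w in words if w not in STOPWORDS))


def candidate_duplicates(records):
    """Group codes that share an institution and a normalized unit name.

    `records` is an iterable of (code, institution, unit_name) tuples.
    This layout is an assumption, not the actual MR table schema.
    """
    groups = defaultdict(list)
    for code, institution, unit_name in records:
        groups[(institution, normalize(unit_name))].append(code)
    # Only groups with more than one code are possible duplicates.
    return [codes for codes in groups.values() if len(codes) > 1]


# Made-up rows for illustration only.
records = [
    ("X-AA-M", "Example University", "Department of Mathematics"),
    ("X-AA-MD", "Example University", "Mathematics Department"),
    ("X-AA-MS", "Example University", "Department of Mathematical Sciences"),
    ("X-BB-M", "Another University", "Department of Mathematics"),
]
duplicates = candidate_duplicates(records)
print(duplicates)
```

Exact matching on normalized names will miss abbreviations, translations, and misspellings, so a real solution would likely add fuzzy matching and manual review on top of this.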
Required Coursework and Skills
Participating students must have taken EECS 484 (Database Management Systems) or its equivalent and be experienced in programming in Python or another general-purpose programming language.
Number of Participants
One team of up to four (4) undergraduate or graduate students will be chosen to work on this project.
Contact
If interested, please send your CV to Jonathan Gryak (gryakj@med.umich.edu), Senior Scientist at MIDAS.
Date Posted
11/13/2020
Mathematics Research Collaboration Graph 2020
Background
Mathematical Reviews (MR) is a division of the American Mathematical Society. Since 1940, Mathematical Reviews has been collecting data on research literature in mathematics. The collected authorship data are remarkably accurate (though not perfect). The Mathematics Research Collaboration Graph has the authors from mathematics publications as its vertices, and two vertices are joined by an edge if the corresponding authors have published an article together. MathSciNet contains information for over one million authors and nearly four million publications. Understanding the mathematical structure of the collaboration graph provides interesting and useful information about mathematics as a profession.
Research Prompts:
 1. In this graph, what is the average degree of a vertex? In other words, what is the average number of coauthors? What are other parameters for the graph?
 2. If all authors with no coauthors are eliminated, what is the average degree of a vertex in the resulting graph? What are other parameters for this subgraph?
 3. Are there any Super Collaborators, i.e., authors with a significantly large number of coauthors?
 4. Continuing from Question 3: for any Super Collaborator, consider the subgraph of the Collaboration Graph consisting of this author, their collaborators, the collaborators of their collaborators, the collaborators of the collaborators of their collaborators, and so on.
    a. How big is this subgraph?
    b. What is the average degree of a vertex in this subgraph?
    c. What is the maximum distance between any two coauthors of this subgraph?
    d. What happens to this subgraph if the Super Collaborator is removed? Is the subgraph still connected? If not, how many components does it have? What is the distribution of sizes of these components?
 5. Find the connected components of the Collaboration Graph, and in particular, compare the two largest components. What structural differences are there between these two components? Is the largest component one of the subgraphs found in Question 4? Are there any authors whose removal results in two disconnected subgraphs of comparable size?
 6. Some measures of centrality in social networks include betweenness, closeness, and eigencentrality. Investigate these and other parameters for the largest component of the Collaboration Graph.
 7. What general statements can you make about collaboration in mathematics?
Datasets
The dataset consists of a list of author pairs, along with an identifier (MR number) of a paper that connects them.
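To make the research prompts concrete, here is a minimal sketch of how the average degree, the connected components, and the effect of removing one author could be computed from such a list of pairs. It uses only the Python standard library. The row layout (author, author, MR number) and the sample rows are assumptions for illustration, not the actual file format. Note that authors with no coauthors never appear in a list of pairs, so they would have to be added from a separate source to answer the first prompt.

```python
from collections import defaultdict, deque


def build_graph(pairs):
    """Build an undirected adjacency map from (author_a, author_b, mr_number) rows."""
    adj = defaultdict(set)
    for a, b, _mr_number in pairs:
        if a != b:  # ignore self-pairs; repeated pairs collapse into one edge
            adj[a].add(b)
            adj[b].add(a)
    return adj


def average_degree(adj):
    """Mean number of distinct coauthors per author in the graph."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj) if adj else 0.0


def connected_components(adj):
    """Return the components as sets of authors, largest first (breadth-first search)."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            v = queue.popleft()
            component.add(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(component)
    return sorted(components, key=len, reverse=True)


def without_vertex(adj, removed):
    """Copy of the graph with one author and all of their edges deleted."""
    return {v: nbrs - {removed} for v, nbrs in adj.items() if v != removed}


# Made-up rows for illustration only.
pairs = [("A", "B", 1), ("A", "C", 2), ("A", "B", 3), ("D", "E", 4)]
adj = build_graph(pairs)
print(average_degree(adj))
print([len(c) for c in connected_components(adj)])
print([len(c) for c in connected_components(without_vertex(adj, "A"))])
```

For the full dataset, a graph library would likely be a more practical choice than hand-written traversal, especially for the centrality measures mentioned in the prompts.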
Required Coursework and Skills
Participating students must have taken Math 465 (Introduction to Combinatorics), EECS 444/544 (Analysis of Societal Networks), or an equivalent course and be experienced in programming in Python or another general-purpose programming language.
Number of Participants
One undergraduate or graduate student will be chosen to work on this project.
Contact
If interested, please send your CV to Jonathan Gryak (gryakj@med.umich.edu), Senior Scientist at MIDAS.
Date Posted
11/13/2020
DATA CHALLENGES AND HACKATHONS
Participating as either teams or individuals, students use real-world data sets from industry sponsors, community organizations, or University research projects to answer predefined research questions. These events are often run as competitions, and MIDAS aims to promote student work by using judges from industry who may (and many times do) offer outstanding participants internships and permanent job opportunities.
Current Events:
Previous Events:
INDUSTRY TALKBACKS
Students learn about how data science is utilized in industry and how best to prepare for careers through conversations with real-world professionals. Sessions are either centered around a theme (interviewing, company-specific job openings, etc.) or feature panelists from various fields to give a broad overview of career opportunities in data science.
Previous Events:
 A Data Scientist Plays Games
Recording: https://www.youtube.com/watch?v=WlwJdY3Wn1Q&feature=youtu.be
 Careers in Data Science Panel
Recording: https://www.youtube.com/watch?v=SpHiWcJ_YL4&feature=youtu.be
 Cracking the Coding Interview
Recording: https://umich.zoom.us/rec/play/78YqdeqtpjM3H9yR5QSDV6BwW9TsJqys0HUb6cNzEjmVyYCOweuNOARYrHJvSDg3WUgdDmsgPfMXcHp?continueMode=true
 Quicken Loans Career Panel
Recording: https://youtu.be/T6jYeDTHF34
SKILLS WORKSHOPS
MIDAS coordinates workshops with industry professionals that center around a specific data science tool, software package, or company-specific methodology. Participants actively engage with presenters using real-world data to gain practical experience. Workshop providers include AWS, Google, and Databricks. See the event page for more details on upcoming and past events.
UPCOMING EVENTS
December 1st, 2020, 6:00-7:30pm