Symposium Schedule

November 15

3:45pm – Opening Remarks: Announcing MIDAS 3.0 Strategic Focus (Virtual)

Dr. H.V. Jagadish

Director, Michigan Institute for Data Science

4:00pm – 5:00pm (Virtual)

Keynote Address: “How machine learning can support human creators”

Dr. Rebecca Fiebrink
Reader, Creative Computing Institute | University of the Arts London

Dr. Fiebrink’s research focuses on human-computer interaction, machine learning, and signal processing, with the aim of allowing people to apply machine learning to new areas such as designing new musical instruments or gestural interfaces for accessibility. She is also involved in digital humanities scholarship and machine learning education.

November 16

Mini Workshops (In-person, Michigan League)

9:00am – 11:00am

Hussey Room – Introduction to data visualization on the web with D3.js
Kalamazoo Room – Diversity and equity in data science – providing technical solutions and empowering the workforce
Michigan Room – Developing best practices for reproducible data science
Vandenberg Room – Using text as data: Introduction to machine learning for natural language processing

Introduction to data visualization on the web with D3.js

Organizer and Instructor:
Fred Feng, Assistant Professor, Industrial and Manufacturing Systems Engineering
UM-Dearborn

Target audience: people who want to share their research findings in plots or any other types of data visualization on the web that support interactivity. General programming knowledge would be helpful, but no prior knowledge of HTML, CSS, or JavaScript is required.

This hands-on workshop will introduce D3.js (https://d3js.org/), a popular and powerful JavaScript library for interactive data visualization on the web. The first part of the workshop will cover the basics of the web standards needed for D3: HTML, CSS, JavaScript, and SVG. Then we will demonstrate, step by step, how to make simple charts. Lastly, we will showcase some more advanced topics such as transitions and interactivity. Attendees are encouraged to follow along on their own computers during the workshop.

Diversity and equity in data science – providing technical solutions and empowering the workforce

Organizers:
Lia Corrales, Assistant Professor, Astronomy; Founder of Women of Color Code
Jing Liu, Managing Director, MIDAS

Presenters and panelists:
Lia Corrales, Assistant Professor, Astronomy; Founder of Women of Color Code
Tayo Fabusuyi, Assistant Research Scientist, U-M Transportation Research Institute; Leading the project “Towards a more representative Public Interest Technology (PIT) field”
H.V. Jagadish, Director, MIDAS; Professor, Computer Science and Engineering; Leading the project “Framework for Integrative Data Equity Systems”
Rada Mihalcea, Director, U-M AI Lab; Professor, Computer Science and Engineering; Developing technical solutions to detect and correct bias in data and algorithms, supporting women in technology.

Target audience: people who are involved or are interested in promoting equity and diversity both from the technical perspective and from the community perspective. 

Schedule:
9 – 9:45 am: Panelist presentations. 1) Increasing diversity and inclusion in the data science and AI research community, as exemplified by Women of Color Code and Public Interest Technology. 2) Developing technical solutions to make the data we deal with, and the decisions made with these data, more equitable and inclusive.
9:45 – 9:55 am: Break.
9:55 – 11 am: Community forum, where participants will share their work and their thoughts on these topics, with conversation moderated by the workshop panelists.

The goal of the session is to convene like-minded people and stimulate ideas for collaborative efforts. Attendees will learn about similar activities on campus, share their own ideas and activities, and meet potential collaborators. The ideal outcome of the session is a concrete plan to intensify current efforts through collaboration.

Developing best practices for reproducible data science

Organizer:
Jing Liu, Managing Director, MIDAS

Presenters:
Brandon Butler, Sharon Glotzer research group, Chemical Engineering: Flexible and reproducible workflows through the signac framework
Johann Gagnon-Bartsch, Assistant Professor, Statistics: Building reproducible workflows in multiple platforms
Thomas Valley, Assistant Professor, Pulmonary and Critical Care Medicine: Creating a culture of code review in health care research

Target audience: Researchers who would like to learn about best practices in code review and sharing, and reproducible workflows. It is also for researchers who are interested in participating in the MIDAS Reproducibility Challenge. 

The 2020 MIDAS Reproducibility Challenge highlighted important conceptual issues of reproducible data science in multiple dimensions and the creative practical approaches U-M researchers have used to address these challenges. Three winners of the Challenge will present their practical approaches in the first half of the workshop. The 2021 Challenge focuses on actionable solutions that can be shared with other researchers to improve reproducibility. In the second half of the workshop, the presenters will discuss with the audience the conceptual issues and practical solutions for making data science research transparent, traceable, and trustworthy, and answer questions for researchers who are interested in participating in the 2021 Reproducibility Challenge.

Using text as data: Introduction to machine learning for natural language processing

Organizers and Instructors:
Meghan Dailey, Machine Learning Specialist, Advanced Research Computing
Jule Krüger, Program Manager for Big Data and Data Science, Center for Political Studies, and Advanced Research Computing

Target audience: Anyone who is interested in the topic. A basic familiarity with Python or R is expected for the second half of the workshop. 

In this workshop, we will analyze a text corpus to demonstrate the use of machine learning for natural language processing. In the first half of the workshop, we will provide a basic overview of machine learning, introduce the main concepts and logic of using text as data, and walk through a typical workflow for processing, managing, and analyzing a text corpus. We will discuss how to choose between Python and R for text analysis and how to interpret the results from a topic model. In the second half of the workshop, instructors will demonstrate, in two concurrent hands-on tutorials, how the topic modeling example from the first half was accomplished in either Python or R. Participants who attend the first part of the workshop will walk away with a basic overview of the capabilities and methods for using text as data. Participants who attend the entire workshop will be equipped with basic programming tools to apply natural language processing in their own research. The workshop will also cover helpful resources for machine learning implementations, such as data sets, storage space, high performance computing, and consultation services at the University of Michigan.

Program Committee

Libby Hemphill
School of Information

Justin Johnson
Computer Science and Engineering

Sophia Brueckner
Stamps School of Art & Design

Jing Liu
MIDAS

Christopher Miller
Astronomy

Lili Zhao
Biostatistics

Prasad Shankar
Radiology

James Walsh
MIDAS

Thank You to Our Sponsors

American Mathematical Society

General Dynamics

Microsoft

Quicken Loans

Yazaki