U-M Data Science and AI Symposium 2021

November 15, 12:00 AM - November 16, 2021, 12:00 AM

Zoom + Michigan League

Symposium Schedule

November 15

3:45pm – Opening Remarks: Announcing MIDAS 3.0 Strategic Focus (Virtual)

Dr. H.V. Jagadish

Director, Michigan Institute for Data and AI in Society

4:00pm – 5:00pm (Virtual)

Keynote Address: “How machine learning can support human creators”

Dr. Rebecca Fiebrink, Reader, Creative Computing Institute | University of the Arts London

Bio: Dr. Fiebrink’s research draws on human-computer interaction, machine learning, and signal processing to help people apply machine learning to new areas, such as designing new musical instruments or gestural interfaces for accessibility. She is also involved in digital humanities scholarship and machine learning education.

Mini Workshops (In-person, Michigan League)

9:00am – 11:00am

Hussey Room – Introduction to data visualization on the web with D3.js
Kalamazoo Room – Diversity and equity in data science – providing technical solutions and empowering the workforce
Michigan Room – Developing best practices for reproducible data science
Vandenberg Room – Using text as data: Introduction to machine learning for natural language processing

Introduction to data visualization on the web with D3.js

Organizer and Instructor:
Fred Feng, Assistant Professor, Industrial and Manufacturing Systems Engineering, UM-Dearborn

Target audience: People who want to share their research findings on the web as interactive plots or other types of data visualization. General programming knowledge is helpful, but no prior knowledge of HTML, CSS, or JavaScript is required.

This hands-on workshop will introduce D3.js (https://d3js.org/), a popular and powerful JavaScript library for interactive data visualization on the web. The first part of the workshop will cover the basics of the web standards that D3 relies on: HTML, CSS, JavaScript, and SVG. Then we will demonstrate, step by step, how to make simple charts. Lastly, we will showcase some more advanced topics such as transitions and interactivity. Attendees are encouraged to follow along on their own computers during the workshop.

Diversity and equity in data science – providing technical solutions and empowering the workforce

Organizers:
Lia Corrales, Assistant Professor, Astronomy; Founder of Women of Color Coders
Jing Liu, Managing Director, MIDAS

Presenters and panelists:
Lia Corrales, Assistant Professor, Astronomy; Founder of Women of Color Coders
Tayo Fabusuyi, Assistant Research Scientist, U-M Transportation Research Institute; Leading the project “Towards a more representative Public Interest Technology (PIT) field”
H.V. Jagadish, Director, MIDAS; Professor, Computer Science and Engineering; Leading the project “Framework for Integrative Data Equity Systems”
Rada Mihalcea, Director, U-M AI Lab; Professor, Computer Science and Engineering; Developing technical solutions to detect and correct bias in data and algorithms, and supporting women in technology

Target audience: People who are involved in or interested in promoting equity and diversity, from both the technical perspective and the community perspective.

Schedule:
9 – 9:45 am: Panelist presentations. 1) Increasing diversity and inclusion in the data science and AI research community, as exemplified by Women of Color Code and Public Interest Technology. 2) Developing technical solutions to make the data we deal with, and the decisions made with these data, more equitable and inclusive.
9:45 – 9:55 am: Break.
9:55 – 11 am: Community forum, where participants will share their work and their thoughts on these topics, with conversation moderated by the workshop panelists.

The goal of the session is to convene like-minded people and stimulate ideas for collaborative efforts. Attendees will learn about similar activities on campus, share their own ideas and activities, and meet potential collaborators. The ideal outcome of the session is a concrete plan to intensify current efforts through collaboration.

Developing best practices for reproducible data science

Organizer:
Jing Liu, Managing Director, MIDAS

Presenters:
Brandon Butler, Sharon Glotzer research group, Chemical Engineering: Flexible and reproducible workflows through the signac framework
Johann Gagnon-Bartsch, Assistant Professor, Statistics: Building reproducible workflows in multiple platforms
Thomas Valley, Assistant Professor, Pulmonary and Critical Care Medicine: Creating a culture of code review in health care research

Target audience: Researchers who would like to learn about best practices in code review and sharing, and reproducible workflows. It is also for researchers who are interested in participating in the MIDAS Reproducibility Challenge.

The 2020 MIDAS Reproducibility Challenge highlighted important conceptual issues of reproducible data science in multiple dimensions and the creative practical approaches U-M researchers have used to address these challenges. Three winners of the Challenge will present their practical approaches in the first half of the workshop. The 2021 Challenge focuses on actionable solutions that can be shared with other researchers to improve reproducibility. In the second half of the workshop, the presenters will discuss with the audience the conceptual issues and practical solutions for making data science research transparent, traceable, and trustworthy, and answer questions for researchers who are interested in participating in the 2021 Reproducibility Challenge.

Using text as data: Introduction to machine learning for natural language processing

Organizers and Instructors:
Meghan Dailey, Machine Learning Specialist, Advanced Research Computing
Jule Krüger, Program Manager for Big Data and Data Science, Center for Political Studies, and Advanced Research Computing

Target audience: Anyone who is interested in the topic. A basic familiarity with Python or R is expected for the second half of the workshop.

In this workshop, we will analyze a text corpus to demonstrate the use of machine learning for natural language processing. In the first half of the workshop, we will provide a basic overview of machine learning, introduce the main concepts and logic of using text as data, and walk through a typical workflow for processing, managing and analyzing a text corpus. We will discuss how to choose between Python and R for text analysis and how to interpret the results from a topic model. In the second half of the workshop, instructors will demonstrate in two concurrent hands-on tutorials how the topic modeling example from the first half was accomplished in either Python or R. Participants who attend the first part of the workshop will walk away with a basic overview of the capabilities and methods for using text as data. Participants who attend the entire workshop will be equipped with basic programming tools to apply natural language processing in their own research. The workshop will also cover helpful resources for machine learning implementations, such as data sets, storage space, high performance computing, and consultation services at the University of Michigan.
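For readers new to the idea of "text as data," the processing step mentioned above can be sketched as follows. This is only a minimal illustration and not the workshop's actual material: the three-sentence corpus, the stop-word list, and the function names `tokenize` and `document_term_matrix` are all invented for this example. It turns raw text into a document-term count matrix, the kind of numeric table a topic model typically takes as input.

```python
import re
from collections import Counter

# Hypothetical three-document corpus, invented for illustration only.
corpus = [
    "Machine learning can support human creators.",
    "Reproducible workflows support data science research.",
    "Text as data: machine learning for language.",
]

# A tiny, hand-picked stop-word list (an assumption, not a standard list).
STOP_WORDS = {"as", "can", "for"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def document_term_matrix(docs):
    """Return (vocabulary, rows); rows[i][j] counts vocabulary[j] in docs[i]."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocabulary = sorted(set().union(*counts))
    # Counter returns 0 for terms that do not occur in a document.
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows


vocabulary, matrix = document_term_matrix(corpus)
for term, column in zip(vocabulary, zip(*matrix)):
    print(f"{term:<14}{column}")
```

Each row of `matrix` describes one document and each column one vocabulary term. In practice, a library implementation in Python or R would replace this hand-written version and would add steps such as stemming and a fuller stop-word list.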

Program Committee

Libby Hemphill, School of Information

Justin Johnson, Computer Science and Engineering

Sophia Brueckner, Stamps School of Art & Design

Jing Liu, MIDAS

Christopher Miller, Astronomy

Lili Zhao, Biostatistics

James Walsh, MIDAS

Prasad Shankar, Radiology

Thank You to Our Sponsors

American Mathematical Society
General Dynamics
Microsoft
Quicken Loans
Yazaki