Mini-workshop topics:
- Agent-based modeling and systemic racism
- Data Science and Natural Language Processing to find rare classes of entities from text
- Introduction to Python for community members and K-12 teachers and students
- Scrubbing and cleaning of sensitive data
- Stitching Together the Fabric of 21st Century Social Science
- The state of the art in Automated and Semi-Automated Video Coding
Agent-based modeling and systemic racism
Lead Presenter: Holly Hartman, PhD candidate, Biostatistics, University of Michigan
In this workshop, participants will gain a better understanding of systemic bias and how algorithms may continue to promote inequity. Participants will learn about agent-based methods, a tool that can be used to examine algorithmic fairness. There will be opportunities to brainstorm ideas for new research projects within the participants’ fields.
Data Science and Natural Language Processing to find rare classes of entities from text
Lead Presenter: VG Vinod Vydiswaran, Assistant Professor, Learning Health Sciences and School of Information, University of Michigan
Natural language processing (NLP) and Data Science methods, including recently popular deep learning-based approaches, can unlock information from narrative text and have received great attention in the medical domain. Many NLP methods have been developed and have shown promising results in various information extraction tasks, especially for rare classes of named entities. These methods have also been successfully applied to facilitate clinical research. In this workshop, we will highlight some methods and technologies to identify rare concepts and entities in text in the medical domain as well as other “open” domains.
Introduction to Python for community members and K-12 teachers and students
Lead Presenter: Fred Feng, Assistant Professor, Industrial and Manufacturing Systems Engineering, University of Michigan-Dearborn
This hands-on workshop is tailored to audiences who do not have prior programming experience. The first half of the workshop covers Python programming basics, and the second half covers data analysis and visualization in Python with real-world data. Attendees are encouraged to follow along with the examples on their own computers. We will use an online browser-based environment (Google Colab), so no software installation on your computer is required. Attendees will need a Google account and will need to sign in through their browser in order to use this cloud-based tool during the workshop.
Scrubbing and cleaning of sensitive data
Lead Presenter: Jonathan Reader, Programmer/Data Analyst, Neurology, University of Michigan
Co-Presenters:
Nicolas May, Data Systems Manager, Neurology, University of Michigan
Kelly Bakulski, Research Assistant Professor, School of Public Health, University of Michigan
Before analysis, data must be retrieved, scrubbed of identifiable information, cleaned (e.g., missing data addressed, data reshaped appropriately), and delivered. Using biomedical and transportation datasets as examples of how this generalizable process works, this workshop will walk attendees through a real-world pipeline used to process and deliver datasets. Documentation and code will be made available through GitLab to allow for coding along with the demonstration. As a result of this workshop, attendees will leave with a practical template for implementing their own data science pipeline.
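As a rough illustration of the scrub, clean, and reshape steps described above, here is a minimal Python sketch. It is not the presenters' pipeline. The field names, the missing-value markers, the salt, and the sample records are all made up for illustration, and a salted hash is only a simple form of pseudonymization, not a complete de-identification method.

```python
import hashlib

# Hypothetical field names and markers; a real dataset defines its own.
IDENTIFYING_FIELDS = {"name", "date_of_birth"}
MISSING_MARKERS = {"", "NA", "N/A", "null"}


def pseudonymize(value, salt):
    """Return a salted SHA-256 digest of an identifier, truncated to 12 hex characters."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]


def scrub_and_clean(records, salt):
    """Drop identifying fields, add a pseudonymous ID, and map missing markers to None."""
    cleaned = []
    for record in records:
        row = {"participant_id": pseudonymize(record["name"], salt)}
        for field, value in record.items():
            if field in IDENTIFYING_FIELDS:
                continue  # scrub: never copy identifying fields to the output
            text = value.strip() if isinstance(value, str) else value
            row[field] = None if text in MISSING_MARKERS else text
        cleaned.append(row)
    return cleaned


def to_long(rows, id_field, value_fields):
    """Reshape wide rows into long form: one output row per (id, variable) pair."""
    return [
        {id_field: row[id_field], "variable": field, "value": row[field]}
        for row in rows
        for field in value_fields
    ]


# Made-up records, used only to exercise the functions.
raw = [
    {"name": "Ada Example", "date_of_birth": "1980-01-01", "visit_1": "12", "visit_2": "NA"},
    {"name": "Bo Example", "date_of_birth": "1975-06-30", "visit_1": "", "visit_2": "9"},
]
cleaned = scrub_and_clean(raw, salt="demo-salt")
long_rows = to_long(cleaned, "participant_id", ["visit_1", "visit_2"])
```

Keeping each step in its own function makes it easier to swap in dataset-specific rules, such as a different list of identifying fields, without touching the rest of the pipeline.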
Stitching Together the Fabric of 21st Century Social Science
Presentations:
Mike Mueller-Smith, Assistant Professor, Department of Economics, University of Michigan: “The Criminal Justice Administrative Records System: Assessing the Footprint of the U.S. Criminal Justice System”
David Johnson, Director and Research Professor, Panel Study of Income Dynamics and Survey Research Center, University of Michigan: “Building America’s Family Tree: The Panel Study of Income Dynamics”
Trent Alexander, Associate Director and Research Professor, ICPSR, University of Michigan: “Creating a New Census-based Longitudinal Infrastructure”
Joelle Abramowitz, Assistant Research Scientist, Survey Research Center, University of Michigan: “The Census-Enhanced Health and Retirement Study: Optimal Probabilistic Record Linkage for Linking Employers in Survey and Administrative Data”
Today’s pressing questions of social science and public policy demand an unprecedented degree of data scope and integration as we recognize the cross-cutting dynamics of economics, political science, sociology, demography, and psychology. This panel features four UM researchers who are pushing the frontier of data construction and linkage in coordination with partners at the U.S. Census Bureau.
The state of the art in Automated and Semi-Automated Video Coding
Lead Presenter: Jason Corso, Professor, Electrical Engineering and Computer Science, University of Michigan
Co-Presenters:
Maggie Levenstein, Director and Research Professor, ICPSR and School of Information, University of Michigan
Susan Jekielek, Assistant Research Scientist, ICPSR, University of Michigan
Donald Likosky, Professor, Department of Cardiac Surgery, University of Michigan
Video is being acquired at an alarming rate across domains, including social research, healthcare, entertainment, sports, and more. The ability to code this video (that is, to attribute certain properties, labels, and other annotations) in support of domain-relevant analytical questions is critical, and without automation it requires human coding. Human coding, however, is laborious, expensive, not repeatable, and, worse, often error-prone. Video coding, an area within artificial intelligence and computer vision, seeks automated and semi-automated methods that make this annotation more effective and robust. This workshop will review the state of the art in video coding in terms of capabilities, limitations, and tooling, and will present real-world use cases.