Priority application deadline: 11:59 PM ET, Thursday, August 8, 2024
Academy Overview
The MIDAS-ICPSR Social Data Science Summer Academy is a week-long hybrid program hosted collaboratively by the Michigan Institute for Data & AI in Society (MIDAS) and the Inter-university Consortium for Political and Social Research (ICPSR). Designed for both ICPSR members and University of Michigan researchers, this intensive academy will provide a thorough and practical overview of essential data science techniques and how they can be applied to advance social science research.
Led by leading experts in data science and social science, sessions will be offered simultaneously online via Zoom and in person on the University of Michigan campus. Through lectures, hands-on workshops, example applications, and group discussions, participants will develop core competencies in areas such as machine learning, statistical modeling, data visualization, textual analysis, and more.
Additionally, instructors will highlight cutting-edge methods like deep learning and neural networks tailored to social science research questions and data. Specific applications covered will align with research interests of participants, exploring relevant case studies and projects across the social sciences and beyond, in domains like political science, sociology, public health and epidemiology, communications, economics, and more.
Academy Details
- Determine which data science / artificial intelligence techniques are appropriate for a given clinical application and apply them to their own clinical and/or research activities
- Develop strategies for integrating data science into their grant applications, work effectively with data scientists, and build new collaborations
- Utilize data science solutions and apply them to biomedical problems
- Apply a breadth of data science topics with data science experts as collaborators
- Develop the skills to consider data science solutions abstractly and apply them to biomedical problems
- Receive a certificate of completion
We will send payment instructions to accepted applicants.
- $3,000 for external participants (30% discount for U-M Alums)
- Thanks to support from the University, we are able to offer a reduced price of $100 for U-M personnel and students
This academy workshop is open to all U-M and external biomedical scientists, but the content is geared towards junior faculty members and those from the public and private sector who are interested in learning about incorporating data science into their research.
- More than 14 days before the first day: full refund minus $50 processing fee
- Between 7 and 14 days before the first day: 50% refund
- Less than 7 days before the first day: no refund
College-level math or statistics. No previous coding experience is required.
Central Campus Classroom Building (CCCB) Room 3460
1225 Geddes Ave.
Ann Arbor, MI 48109
Nearby parking includes a structure for U-M Blue/Gold permit holders at 525 Church St. and metered street parking along Church St. There is also a public garage at 650 S. Forest Ave. View available public parking in Ann Arbor here and real-time occupancy counts for public parking structures here.
Curriculum Overview
Monday and Tuesday: We will cover foundational concepts and mathematical intuition for machine learning, along with an introduction to Python for data analysis. Then, we’ll introduce the data science research workflow, using examples from social science research to tie these ideas and practices together.
Wednesday: We will introduce statistical and machine learning tools that enable the systematic analysis of text, focusing on natural language processing and Large Language Models. The session will combine traditional instruction with practical coding exercises during in-class sessions.
Thursday: We will focus on image data in social science research with lectures and coding sessions. Specific topics include: 1) Image basics: pixels, features, and image structure. 2) From text to images: similarities and differences. 3) Classifying images. 4) Bag of Visual Words. 5) Further research, resources, and challenges.
Friday: Network analysis is a foundational methodology in social data science, with important advances occurring each year across many disciplines. We will introduce modeling and prediction with network data, focusing on: 1) Statistical models for networks; 2) Latent variables and measurement with network data; 3) Link prediction; 4) Graph machine learning. The day will include lectures, discussion of published application examples, and tutorials that involve open source software and real-world data.
Instructors
Bruce Desmarais
DeGrandis-McCourtney Early Career Professor in Political Science, Director of the Center for Social Data Analytics, and an Affiliate of the Institute for Computational and Data Sciences at Penn State University.
Edgar Franco-Vivanco
Assistant Professor of Political Science, College of Literature, Science, and the Arts and Faculty Associate, Center for Political Studies, Institute for Social Research
Elle O’Brien
Lecturer III in Information and Research Investigator, School of Information
Michelle Torres
Assistant Professor in the Department of Political Science at University of California, Los Angeles
Academy Schedule
Time: 8:30am – 5:30pm ET
*Please note room locations may vary*
Monday
Instructor: Elle O’Brien
Teaching Assistant: Vidhi Bhatt
Location: Central Campus Classroom Building (CCCB), Room 2460
Elle will cover foundational concepts and mathematical intuition for machine learning, along with an introduction to Python for data analysis. Then, we’ll look at applied examples from social science research to put these ideas and practices together into a data science workflow.
- Session 1: Welcome to Python
- Session 2: Thinking Programmatically
- Session 3: Machine Learning Part I
Tuesday
Instructor: Elle O’Brien
Teaching Assistant: Vidhi Bhatt
Location: Central Campus Classroom Building (CCCB), Room 2460
12 PM: Lunch Reception Provided
- Session 1: Python Review
- Session 2: Getting Data Ready for Analysis in Python
- Session 3: Machine Learning Part II
Wednesday
Instructor: Edgar Franco-Vivanco
Teaching Assistant: Vidhi Bhatt
Location: Central Campus Classroom Building (CCCB), Room 0460
Text serves as a fundamental medium of human communication. From ancient inscriptions to contemporary social media, societies have expressed their thoughts, norms, behavioral guidelines, and prevailing biases through written language. However, it is only in recent times that social scientists have gained the ability to analyze large quantities of text using algorithms originally developed in computer science. This course is designed to introduce statistical and machine learning tools that enable the systematic analysis of text. By the end of this course, students will be well-equipped to understand the basic elements of text analysis methods. The class will adopt a semi-flipped classroom format, combining traditional instruction with practical coding exercises during in-class sessions.
- Session 1: From Text to Data
- Session 2: Description
- Session 3: Discovery (Unsupervised learning)
Thursday
Instructor: Michelle Torres
Teaching Assistant: Anthony Carreon
Location: Central Campus Classroom Building (CCCB), Room 0460
Political science has changed dramatically in the last decade. It has never been easier to retrieve massive amounts of data in a variety of content types and formats.
Technological advances have allowed researchers, especially in computer science, not only to access textual and visual data, key components of the political communication process, but also to develop and use methods that were unthinkable a few years ago. These data and methods have the potential to revolutionize the study of politics by raising new questions and opening new ways of understanding and explaining political puzzles, but the traditional social science toolkit does not suffice for their analysis. Therefore, in this class we introduce the application of computational tools to questions relevant to social scientists, especially with regard to image analysis.
As students can probably imagine, many of these tools rely on computing power to achieve their objectives. Thus, the workshop features a block in which we will code in both Python and R. Knowledge of Python is NOT necessary. We will cover the code needed to use certain tools from a functional rather than technical perspective. However, basic knowledge of R is required.
- Session 1: Motivation
- Session 2: Image Basics – Pixels, Features, and Image Structure
- Session 3: From Text to Images
- Session 4: Classifying Images
- Session 5: Unsupervised Approach
- Session 6: Practical Challenges and Implementation
- Session 7: Coding Session
Friday
Instructor: Bruce Desmarais
Teaching Assistant: Ken Reid
Location: Central Campus Classroom Building (CCCB), Room 0460
Network analysis is a foundational methodology in social data science, with important advances occurring each year across many disciplines. In this course we will cover advanced approaches to modeling and prediction with network data. Our major areas of focus will include statistical modeling; measurement; and machine learning and prediction. We will spend approximately two hours on each of these topics. Each topic module will include a conceptual lecture, discussion of published application examples, and tutorials that involve open source software and real-world data. Below is a listing of the main topics we will cover.
- Session 1: Introduction to Networks
- Session 2: Statistical Models for Networks
- Session 3: Latent Variable Measurement with Networks
- Session 4: Hands-on Programming Tutorial I
- Session 5: Link Prediction
- Session 6: Graph Machine Learning
- Session 7: Hands-on Programming Tutorial II
Questions? Contact Us.
Contact Kelly Psilidis, Faculty Training Program Manager, at psilidis@umich.edu