Register Now

Registration deadline: 11:59pm ET, Thursday, April 18, 2024.

Academy Overview

The Introduction to Data Science and AI Summer Academy introduces the basics of data science and AI methods to researchers (especially faculty). Our goal is to lower the barrier to entry for domain research scientists who plan to adopt data science and AI methods, and to enable them to collaborate more effectively with data scientists and AI experts.

Attendees are expected to bring a laptop for the programming components of the workshop.

Academy Details

Outcomes

  • Skills to incorporate data science and AI methods in your research
  • Effective collaboration with data science and AI experts
  • Certificate of completion

Tuition Cost

Tuition to attend: $100 for U-M internal participants | $3,000 for external participants (30% discount for U-M alumni)

Cancellation Policy:

  • More than 14 days before the first day: full refund minus a $50 processing fee
  • Between 7 and 14 days before the first day: 50% refund
  • Less than 7 days before the first day: no refund

Who Should Attend?

This academy workshop is open to U-M researchers and trainees, and those from the public and private sector who are interested in learning about incorporating data science and AI into their research. Faculty members are particularly encouraged to attend.

Prerequisites

  • College-level math or statistics.
  • Some coding experience is recommended, but not required.

Location

TBD – In-Person on the University of Michigan’s Ann Arbor campus

Curriculum Overview

June 3 – 5, 2024, Days 1 – 3

  • Review of mathematics and statistics: probability, linear algebra, Gaussian distribution and processes, statistical inference.
  • Introduction to Python
  • Understand your data for Machine Learning
  • Introduction to Machine Learning: supervised and unsupervised methods, bias-variance tradeoff, evaluation/cross-validation.
  • Basics of Deep Learning: overparameterization; multi-layer perceptrons; common Deep Learning architectures including Convolutional Neural Networks and Recurrent Neural Networks; Transformers and Introduction to Foundation Models.
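
For readers new to the evaluation/cross-validation topic listed above, the idea can be sketched in a few lines of Python. This is an illustrative example only and is not taken from the academy's materials; the synthetic data, the function names `fit_line` and `k_fold_mse`, and the choice of a straight-line model are all assumptions made for the sketch.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return a, b

def k_fold_mse(xs, ys, k=5, seed=0):
    """Average held-out mean squared error over k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint index groups
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        # Fit on the training indices only, then score on the held-out fold
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        mse = sum((ys[i] - (a * xs[i] + b)) ** 2 for i in fold) / len(fold)
        scores.append(mse)
    return sum(scores) / k

# Synthetic data (made up for illustration): y = 2x + 1 plus Gaussian noise
rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [2 * x + 1 + rng.gauss(0, 0.5) for x in xs]

cv_mse = k_fold_mse(xs, ys, k=5)
print(f"5-fold CV mean squared error: {cv_mse:.3f}")
```

Each observation is held out exactly once, so the averaged error estimates how the model performs on data it was not fitted to.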

June 6 – 7, 2024, Days 4 – 5

Attendees will select one of the following tracks.

Track 1: Experimental Design for Optimal Data Collection and Model Building

The essence of good models is good data. Not all data is equally useful, however. When data is acquired from physical, biological and engineering experiments—experiments that tend to be expensive and time-consuming—it becomes even more important to carefully consider how to carry them out. In this track, you will learn experimental design principles for quantifying the “goodness” of experiments, computing these objectives, and optimizing them through numerical algorithms.

Track 2: AI for Science: Machine Learning Applications in Physical Sciences

In this track, the Eric and Wendy Schmidt AI in Science Postdoctoral Fellows will introduce a number of Machine Learning / Deep Learning applications to address significant research questions in physical sciences, including:

A. Molecular property prediction using graph neural networks. An introduction to representing molecules as graphs and a discussion of graph neural networks (GNNs) in the context of chemical systems. This session will provide a brief theoretical background along with a mini-review of GNNs, ending with a Colab notebook exercise that demonstrates molecular property prediction using a GNN.

B. Machine learning for chemical reaction prediction. An introduction to applications of machine learning in chemical reaction prediction, for example predicting reaction properties, yields, products, reaction conditions, and synthesis routes. The hands-on portion will show one way to predict a synthesis route for a desired product using reaction templates. This will involve using a cheminformatics library such as RDKit and building a supervised learning model.

C. Advancing Computer Vision in Electron Microscopy through Deep Learning Techniques. This session will introduce the topic, briefly review the literature, and end with a Colab notebook exercise built around a case study.

D. Probabilistic programming, via neural network samplers, in a Bayesian paradigm. An introduction to the emission spectra from star-forming regions, explaining how the theoretical parameter space is fit to the observations.

Track 3: Bayesian inference

This track will be an introduction to the principles and practice of Bayesian inference for data analysis, with the following specific topics: prior/posterior distributions, Bayes rule, Markov Chain Monte Carlo computations, hierarchical models, model selection and model checking.
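
To give a flavor of two of the topics named above (Bayes rule and Markov Chain Monte Carlo), here is a minimal sketch in Python. It is an illustration only, not the track's actual material: the coin-flip data (7 heads, 3 tails), the uniform prior, and the function names are assumptions chosen so that the sampler can be checked against the known Beta posterior.

```python
import math
import random

def log_posterior(theta, heads, tails):
    """Unnormalised log posterior for a coin's bias under a uniform prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero prior mass outside (0, 1)
    return heads * math.log(theta) + tails * math.log(1.0 - theta)

def metropolis(heads, tails, n_samples=20000, step=0.2, seed=0):
    """Random-walk Metropolis sampler for the coin bias."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, heads, tails)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, heads, tails)
        # Accept with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        samples.append(theta)
    return samples

heads, tails = 7, 3  # hypothetical data
samples = metropolis(heads, tails)
kept = samples[1000:]  # discard burn-in
mcmc_mean = sum(kept) / len(kept)

# With a uniform prior the posterior is Beta(heads + 1, tails + 1)
exact_mean = (heads + 1) / (heads + tails + 2)
print(f"MCMC posterior mean:  {mcmc_mean:.3f}")
print(f"Exact posterior mean: {exact_mean:.3f}")
```

In this conjugate case the sampler is unnecessary, which is what makes it a useful check; MCMC earns its keep in models, such as hierarchical ones, where no closed-form posterior exists.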

Track 4: Data science via dimension reduction; data science in Julia

No previous Julia knowledge expected.

A (Day 1). Data analysis via dimension reduction. Specifically,

  • Classical methods: PCA, MANOVA, LDA, EFA, CFA
  • Using biplots to visualize dimension reductions, plotting passive variables
  • Rotations: varimax, etc.
  • Methods for categorical data: MCA
  • Modern methods: SIR, CORE, OPG, QUADRO, sparse PCA
  • Nonlinear methods: kernel-PCA, UMAP, PCA, t-SNE
  • Examples from various disciplines with code in R, Python, Julia
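
As a small taste of the classical methods listed above, the first principal component in PCA can be sketched in plain Python. This is an illustrative example under stated assumptions, not the track's own code: the synthetic two-variable data and the function name `first_principal_component` are made up, and power iteration stands in for the eigen or singular value decompositions that standard libraries use.

```python
import math
import random

def first_principal_component(rows, n_iter=200):
    """Leading eigenvector of the sample covariance matrix via power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(n_iter):
        # Multiply by the covariance matrix, then renormalise to unit length
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Synthetic data (made up): two noisy measurements of one underlying factor t
rng = random.Random(1)
rows = []
for _ in range(200):
    t = rng.gauss(0, 3)
    rows.append([t + rng.gauss(0, 0.3), t + rng.gauss(0, 0.3)])

pc1 = first_principal_component(rows)
print("First principal component:", [round(x, 3) for x in pc1])
```

Because both variables track the same underlying factor, the first component weights them almost equally, so projecting onto it reduces the two dimensions to one with little loss.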

B (Day 2). Data science in Julia: data management; data visualization/graphing; regression modeling with GLMs; and case studies from health, social, and natural sciences.

Instructors

Yang Chen

Assistant Professor of Statistics,
College of Literature, Science, and the Arts;
Research Assistant Professor,
MIDAS

Paramveer Dhillon

Assistant Professor of Information,
School of Information

Vital Fernández

Vital Gutierrez Fernandez

Schmidt AI in Science Fellow,
Michigan Institute for Data Science

Xun Huan

Assistant Professor of Mechanical Engineering,
College of Engineering

Kerby Shedden

Professor of Statistics and Biostatistics; Director, Center for Statistical Consultation and Research

Nanta Sophonrat

Schmidt AI in Science Fellow,
Michigan Institute for Data Science

Soumi Tribedi

Schmidt AI in Science Fellow,
Michigan Institute for Data Science

Anastasiia Visheratina

Schmidt AI in Science Fellow,
Michigan Institute for Data Science

Some instructors of this academy are postdoctoral scholars in the Eric and Wendy Schmidt AI in Science postdoctoral program and the Michigan Data Science Fellows program.

Questions?

Contact Kelly Psilidis, Faculty Training Program Manager, at psilidis@umich.edu