Academic Requirements

Note: New academic requirements are in effect for students accepted to the program after October 2022. Students who enrolled in the program by October 2022 continue to follow the previous requirements.

There are three fundamental requirements for earning a Graduate Data Science Certificate.

1. Nine graduate credit hours of coursework in approved courses: These courses are designated as core or elective classes, and each group is subdivided into three categories: “Algorithms and Applications” (AA), “Data Management” (DM), and “Analysis Methods” (AM). Students must take at least two core courses and one course from each category.

Only one course (up to 3 credits) may be double-counted. It is recommended, but not required, that students select courses outside their main graduate program of study to broaden their data science experience (e.g., statistics students may take engineering courses, and social-science students may take outside statistics and application courses).

2. A Data Science-related experience (equivalent to a 3-credit semester course; over 160 hours for work): This can take the form of a non-credit activity such as an internship, practicum, or professional project equivalent to a three-credit-hour course, or additional coursework of at least three credits from the approved course list. (This course may be double-counted with another Rackham degree program.) To satisfy this “Plus Requirement” with a data-related experience, students need their supervisor or mentor to sign the verification form certifying that the student spent sufficient time working on a data-intensive project during that experience. Alternatively, if allowed and approved by the mentor, students may complete and submit to the DS Certificate Program Chair a report (2-6 pages) describing their experience and results, which will be evaluated to ensure the project demonstrates Data Science content, relevance, and applications.

3. Regular attendance at the MIDAS Seminar Series, which brings nationally recognized data scientists to U-M: Enrollment for one semester (1 credit) in EECS 409 (MIDAS Seminar) is required*; this credit may count towards the 9 didactic credits. This colloquium-style training exposes students to current DS developments beyond the boundaries of their own discipline. Students must attend at least 75% of all seminars to complete the requirement (attendance will be taken).

*Please note that EECS 409 is being offered in the Winter 2024 semester, but is not expected to be offered beyond that. We will modify this seminar requirement accordingly in 2024.
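
The coursework rule in requirement 1 can be sketched as a small check. This is only an illustration, not an official audit tool: the `Course` record, the course codes, and the `meets_coursework_rule` function are hypothetical, and the sketch assumes each course belongs to a single category and that "nine credit hours" means at least nine.

```python
from dataclasses import dataclass

# AA = Algorithms and Applications, DM = Data Management, AM = Analysis Methods
CATEGORIES = {"AA", "DM", "AM"}


@dataclass(frozen=True)
class Course:
    code: str       # hypothetical course code, for illustration only
    credits: int
    category: str   # assumed to be exactly one of CATEGORIES
    is_core: bool


def meets_coursework_rule(courses):
    """Return True if the plan has at least 9 credits, at least two core
    courses, and at least one course in each of the three categories."""
    total_credits = sum(c.credits for c in courses)
    core_count = sum(1 for c in courses if c.is_core)
    covered = {c.category for c in courses}
    return total_credits >= 9 and core_count >= 2 and CATEGORIES <= covered


# Hypothetical plan: two core courses plus one elective, one per category.
plan = [
    Course("DS-A", 3, "AA", True),
    Course("DS-B", 3, "DM", True),
    Course("DS-C", 3, "AM", False),
]
print(meets_coursework_rule(plan))
```

The check does not model double-counting, the Plus Requirement, or seminar attendance; those would need to be verified separately.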

To enroll in the MIDAS Data Science Certificate Program, students must meet the following prerequisites:

Enrollment Prerequisites:

Prerequisites	Skills	Rationale
Completed Undergraduate Degree	Quantitative training and coding skills as described below	The DS certificate is a graduate program requiring a minimum level of quantitative skill
Quantitative Training	Undergraduate calculus, linear algebra and introduction to probability and statistics	These are the entry level skills required for most upper-level undergraduate and graduate courses in the program
Coding Experience	Exposure to software development or programming on the job or in the classroom	Most DS practitioners need substantial experience with Java, C/C++, HTML5, Python, PHP, SQL/DB
Motivation	Significant interest and motivation to pursue quantitative data analytic applications	Dedication for prolonged and sustained immersion into hands-on and methodological research

To obtain the Data Science Certificate, moderate competency in 2 competencies from each of the 3 competency areas below is required:

Completion Competencies:

Areas	Competency	Expectation	Notes
Algorithms and Applications	Tools	Working knowledge of basic software tools (command-line, GUI based, or web-services)	Familiarity with statistical programming languages, e.g., R or SciKit/Python, and database querying languages, e.g., SQL or NoSQL
	Algorithms	Knowledge of core principles of scientific computing, applications programming, APIs, algorithm complexity, and data structures	Best practices for scientific and application programming, efficient implementation of matrix linear algebra and graphics, elementary notions of computational complexity, user-friendly interfaces, string matching
	Application Domain	Data analysis experience from at least one application area, either through coursework, internship, research project, etc.	Applied domain examples include: computational social sciences, health sciences, business and marketing, learning sciences, transportation sciences, engineering and physical sciences
Data Management	Data validation & visualization	Curation, Exploratory Data Analysis (EDA) and visualization	Data provenance, validation, visualization via histograms, Q-Q plots, scatterplots (ggplot, Dashboard, D3.js)
	Data wrangling	Skills for data normalization, data cleaning, data aggregation, and data harmonization/registration	Data imperfections include missing values, inconsistent string formatting (‘2016-01-01’ vs. ‘01/01/2016’), PC/Mac/Linux time vs. timestamps, structured vs. unstructured data
	Data infrastructure	Handling databases, web-services, Hadoop, multi-source data	Data structures, SOAP protocols, ontologies, XML, JSON, streaming
Analysis Methods	Statistical inference	Basic understanding of bias and variance, principles of (non)parametric statistical inference, and (linear) modeling	Biological variability vs. technological noise, parametric (likelihood) vs non-parametric (rank order statistics) procedures, point vs. interval estimation, hypothesis testing, regression
	Study design and diagnostics	Design of experiments, power calculations and sample sizing, strength of evidence, p-values, False Discovery Rates	Multistage testing, variance normalizing transforms, histogram equalization, goodness-of-fit tests, model overfitting, model reduction
	Machine Learning	Dimensionality reduction, k-nearest neighbors, random forests, AdaBoost, kernelization, SVM, ensemble methods, CNN	Empirical risk minimization. Supervised, semi-supervised, and unsupervised learning. Transfer learning, active learning, reinforcement learning, multiview learning, instance learning
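
As a small illustration of the data wrangling competency in the table above, the inconsistent date strings it mentions (‘2016-01-01’ vs. ‘01/01/2016’) can be harmonized to a single format. This is a minimal sketch: the `normalize_date` helper and the `KNOWN_FORMATS` list are hypothetical, and it assumes slash-separated dates are month/day/year.

```python
from datetime import datetime

# Formats assumed for this illustration; real data may need more.
# Slash-separated dates are assumed to be month/day/year.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def normalize_date(text):
    """Return the date as an ISO 8601 string (YYYY-MM-DD),
    or None if no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    return None


raw = ["2016-01-01", "01/01/2016", "not a date"]
print([normalize_date(value) for value in raw])
# ['2016-01-01', '2016-01-01', None]
```

Returning `None` for unparseable values keeps them visible as missing data, which is another of the imperfections listed in the table, rather than silently guessing a date.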
