Academic Requirements
Note: we are revising academic requirements. New information will be available at the end of September, 2022.
There are three fundamental requirements for earning a Graduate Data Science Certificate.
1. Nine graduate credit hours of coursework in approved courses: These courses are designated as core and elective classes, which are each subdivided into three categories: “Algorithms and Applications” (AA), “Data Management” (DM), and “Analysis Methods” (AM). Students are strongly encouraged to choose at least two core courses. Also, students are strongly encouraged to choose one course from each category.
Only one course may be double-counted (up to 3 credits). It is recommended, but not required, that courses outside the main graduate program of study be selected to broaden the student’s data science experiences (e.g., statistics students may take engineering courses, and social science students may take outside statistics and application courses).
2. A Data Science-related experience (3-credit semester equivalent, over 160 hours of work): This can take the form of a non-credit activity such as an internship, practicum, or professional project equivalent to a three-credit-hour course, or additional coursework of at least three credits from the approved course list. (This course may be double-counted with another Rackham degree program.) To satisfy this “Plus Requirement” with a data-related experience, students will need to have their supervisor or mentor sign the verification form certifying that the student spent sufficient time working on a data-intensive project during that practicum. Alternatively, if allowed and approved by the mentor, students may complete and submit to the DS Certificate Program Chair a report (2-6 pages) describing their experience and results, which will be evaluated to ensure the project demonstrates Data Science content, relevance, and applications.
3. Regular attendance at the MIDAS Seminar Series, which brings nationally recognized data scientists to U-M, is required. One semester of enrollment (1 credit) in EECS 409 (MIDAS Seminar) is required; this credit may count towards the 9 didactic credits. This colloquium-style training will expose students to current DS developments beyond the boundaries of their own discipline. Students must attend 75% of all seminars to complete the requirement (attendance will be taken).
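The three requirements above can be read as a simple checklist. The sketch below is illustrative only and is not an official audit tool. The `StudentRecord` fields and the `meets_requirements` function are hypothetical names introduced here. It also assumes that "over 160 hours" means strictly more than 160, and that the 75% attendance rule means at least 75% of the seminars held.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    # All field names are hypothetical, chosen for this illustration.
    approved_credits: int        # graduate credits from approved courses
    experience_hours: float      # hours in a verified non-credit experience
    extra_approved_credits: int  # extra approved credits used for the Plus Requirement
    enrolled_in_seminar: bool    # one semester of EECS 409
    seminars_attended: int
    seminars_held: int


def meets_requirements(r: StudentRecord) -> dict:
    """Return a per-requirement pass/fail map under the stated assumptions."""
    coursework = r.approved_credits >= 9
    # Plus Requirement: either the experience or at least 3 additional credits.
    plus = r.experience_hours > 160 or r.extra_approved_credits >= 3
    # Integer arithmetic avoids floating-point rounding at the 75% boundary.
    attendance = r.seminars_held > 0 and 4 * r.seminars_attended >= 3 * r.seminars_held
    seminar = r.enrolled_in_seminar and attendance
    return {"coursework": coursework, "plus": plus, "seminar": seminar}
```

For example, a student with 9 approved credits, a 170-hour internship, and 9 of 12 seminars attended while enrolled in EECS 409 would pass all three checks under these assumptions.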
In order to enroll in the MIDAS Data Science Certificate Program, the following prerequisites are required:
Enrollment Prerequisites:
| Prerequisites | Skills | Rationale |
| --- | --- | --- |
| Completed Undergraduate Degree | Quantitative training and coding skills as described below | The DS certificate is a graduate program requiring a minimum level of quantitative skill |
| Quantitative Training | Undergraduate calculus, linear algebra, and introduction to probability and statistics | These are the entry-level skills required for most upper-level undergraduate and graduate courses in the program |
| Coding Experience | Exposure to software development or programming on the job or in the classroom | Most DS practitioners need substantial experience with Java, C/C++, HTML5, Python, PHP, SQL/DB |
| Motivation | Significant interest and motivation to pursue quantitative data analytic applications | Dedication to prolonged and sustained immersion in hands-on and methodological research |
To obtain the Data Science Certificate, moderate competency is required in 2 competencies from each of the 3 competency areas below:
Completion Competencies:
| Areas | Competency | Expectation | Notes |
| --- | --- | --- | --- |
| Algorithms and Applications | Tools | Working knowledge of basic software tools (command-line, GUI-based, or web services) | Familiarity with statistical programming languages, e.g., R or SciKit/Python, and database querying languages, e.g., SQL or NoSQL |
| | Algorithms | Knowledge of core principles of scientific computing, applications programming, APIs, algorithm complexity, and data structures | Best practices for scientific and application programming, efficient implementation of matrix linear algebra and graphics, elementary notions of computational complexity, user-friendly interfaces, string matching |
| | Application Domain | Data analysis experience from at least one application area, through coursework, an internship, a research project, etc. | Applied domain examples include: computational social sciences, health sciences, business and marketing, learning sciences, transportation sciences, engineering and physical sciences |
| Data Management | Data validation & visualization | Curation, Exploratory Data Analysis (EDA), and visualization | Data provenance, validation, visualization via histograms, Q-Q plots, scatterplots (ggplot, Dashboard, D3.js) |
| | Data wrangling | Skills for data normalization, data cleaning, data aggregation, and data harmonization/registration | Data imperfections include missing values, inconsistent string formatting (‘20160101’ vs. ‘01/01/2016’), PC/Mac/Linux time vs. timestamps, structured vs. unstructured data |
| | Data infrastructure | Handling databases, web services, Hadoop, multi-source data | Data structures, SOAP protocols, ontologies, XML, JSON, streaming |
| Analysis Methods | Statistical inference | Basic understanding of bias and variance, principles of (non)parametric statistical inference, and (linear) modeling | Biological variability vs. technological noise, parametric (likelihood) vs. nonparametric (rank order statistics) procedures, point vs. interval estimation, hypothesis testing, regression |
| | Study design and diagnostics | Design of experiments, power calculations and sample sizing, strength of evidence, p-values, False Discovery Rates | Multi-stage testing, variance-normalizing transforms, histogram equalization, goodness-of-fit tests, model overfitting, model reduction |
| | Machine Learning | Dimensionality reduction, k-nearest neighbors, random forests, AdaBoost, kernelization, SVM, ensemble methods, CNN | Empirical risk minimization. Supervised, semi-supervised, and unsupervised learning. Transfer learning, active learning, reinforcement learning, multi-view learning, instance learning |
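
As an illustration of the data wrangling competency listed above, the following minimal sketch harmonizes the two date string styles mentioned in the table notes. It is one possible approach and not part of the certificate materials. The `KNOWN_FORMATS` tuple and the `normalize_date` function are hypothetical names, and reading ‘01/01/2016’ as month/day/year is an assumption about the data.

```python
from datetime import date, datetime

# Assumed input formats; "%m/%d/%Y" presumes US month-first ordering.
KNOWN_FORMATS = ("%Y%m%d", "%m/%d/%Y")


def normalize_date(text: str) -> date:
    """Parse a date string in any known format and return a date object."""
    cleaned = text.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue  # try the next candidate format
    raise ValueError(f"unrecognized date format: {text!r}")


raw = ["20160101", "01/01/2016", " 12/31/2015 "]
print([normalize_date(s).isoformat() for s in raw])
# prints ['2016-01-01', '2016-01-01', '2015-12-31']
```

Raising an error on unrecognized input, instead of guessing, keeps malformed records visible so they can be reviewed during cleaning.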