Approved Courses

Students must complete 9 credit hours of approved courses to earn the Graduate Data Science Certificate.

Upon request, other courses may be allowed based on course availability, program demands and student needs.

Core courses

Additional data science methods and techniques courses

Domain-specific data science application courses

Core Data Science Courses:

Legend: AA=Algorithms and Applications, DM=Data Management, AM=Analysis Methods.

Course Number Title Description Type
EECS 584: Advanced Database Management Systems Masters/Ph.D. level course for students in Computer Science, Electrical Engineering, and Information School DM
EECS 545: Machine Learning Foundations of machine learning, mathematical derivation and implementation of the algorithms, and their applications DM/AM
EECS 453: Applied Data Analysis Applied matrix algorithms for signal processing, data analysis and machine learning DM/AA
Math 571: Numerical Linear Algebra Numerical methods for solving linear algebra problems (linear systems and eigenvalue problems), matrix decompositions, and convex optimization AM
Stats 415: Data Mining and Statistical Learning This course covers the principles of data mining, exploratory analysis and visualization of complex data sets, and predictive modeling. The presentation balances statistical concepts (such as over-fitting data, and interpreting results) and computational issues. Students are exposed to algorithms, computations, and hands-on data analysis in the weekly discussion sessions. AM
Stats 503: Applied Multivariate Analysis The course covers methods for modern multivariate data analysis and statistical learning, including theoretical foundations and practical applications. Topics include principal component analysis and other dimension reduction techniques, classification (discriminant analysis, decision trees, nearest neighbor classifiers, logistic regression, support vector machines, ensemble methods), clustering (agglomerative and partitioning methods, model-based methods), and categorical data analysis. There is a significant data analysis component. AM/AA
HS 650: Data Science and Predictive Analytics The course aims to build computational abilities, inferential thinking, and practical skills for tackling core data scientific challenges. It explores foundational concepts in data management, processing, statistical computing, and dynamic visualization using modern programming tools and agile web-services. Concepts, ideas, and protocols are illustrated through examples of real observational, simulated and research-derived datasets. Some prior quantitative experience in programming, calculus, statistics, mathematical models, or linear algebra will be necessary. AA/AM/DM
HS 853: Scientific Methods for Health Sciences: Special Topics Modern analytical methods for advanced healthcare research. Specific focus on innovative modeling, computational, analytic and visualization techniques to address concrete driving biomedical and healthcare applications. The course covers the 5 dimensions of Big-Data (volume, complexity, time/scale, source and management). AM/AA
BIOINF 585: Machine Learning for Systems Biology & Clinical Informatics This course focuses on machine learning methods and their applications in biomedical sciences. Topics include:

1) data management solutions for Big Data applications.

2) feature extraction and reduction methods.

3) clustering and classification methods.

4) testing and validation of models.

5) applications in systems biology and clinical informatics.

Biomedical Signal and Image Analysis
The course covers some fundamental methods in biomedical data analysis. Topics include:

  1. Database management in biomedical applications.
  2. Transforms and feature extraction, Fourier transform, wavelet transform, fundamentals of information theory, and statistical methods used in signal processing.
  3. Image enhancement, image segmentation, and image feature extraction methods.
  4. Brief introduction to natural language processing.
  5. Introduction to fundamental techniques in clustering and classification.
  6. Applications in medicine and biology.
SI 608: Networks: Theory and Application Advanced course for master's students of information and health informatics  AM/DM
SI 650/EECS 549: Information Retrieval Advanced course for graduate students of information, health informatics, and CS  DM/AA
Biostats 646: High Throughput Molecular Genetic and Epigenetic Data Analysis Data analysis in experimental molecular biology, gene expression, genome sequence and epigenomics data  AM/AA
TO 640: Big Data Management: Tools and Techniques This course teaches the basic tools in acquisition, management, and visualization of large data sets. Students will learn how to: store, manage, and query databases via SQL; quickly construct insightful visualizations of multi-attribute data using Tableau; use the Python programming language to manage data as well as connect to APIs to efficiently acquire public data. After taking this course, students will be able to efficiently construct large data sets that source underlying data from multiple sources, and form initial hypotheses based on visualization.  AA/DM
Psych 613: Advanced Statistical Methods I This is the first course in a two-semester sequence on data analysis. This course presents the “general linear model” with particular emphasis on exploratory data analysis, contrast analysis, residual analysis, and Euclidean distance. The topics covered over the two semesters include analysis of variance, regression, categorical data analysis, principal components analysis, multidimensional scaling, cluster analysis, multivariate ANOVA, canonical correlation, and structural equations modelling.  AM/AA
Psych 614: Advanced Statistical Methods II Topics covered in this course include multidimensional scaling, cluster analysis, principal components, factor analysis, multivariate analysis of variance and canonical correlation. A brief introduction to reliability theory, structural equations modeling and hierarchical linear modeling will also be provided.  AM/AA
LHS 610: Learning from Health Data: Applied Data Science in Health
  • Rigid vs. flexible models
  • Decision trees
  • K-nearest neighbors
  • Perceptron
  • Support vector machines
  • Introduction to caret: training and predicting
Biostat 521: Applied Biostatistics I Fundamental statistical concepts related to the practice of public health: descriptive statistics; probability; sampling; statistical distributions; estimation; hypothesis testing; chi-square tests; simple and multiple linear regression; one-way ANOVA. Taught at a more advanced mathematical level than Biostat 503. Use of the computer in statistical analysis.  AM/AA
Biostat 522: Applied Biostatistics II A second course in applied biostatistical methods and data analysis. Concepts of data analysis and experimental design for health-related studies. Emphasis on categorical data analysis, multiple regression, analysis of variance and covariance.  AM/DM
Biostat 650: Applied Statistics I: Linear Regression Graphical methods; simple and multiple linear regression; simple, partial and multiple correlation; estimation; hypothesis testing, model building and diagnosis; introduction to nonparametric regression; introduction to smoothing methods (e.g., lowess). The course will include applications to real data.  AM/AA

Additional Data Science Methods and Techniques Courses

AA=Algorithms and Applications, DM=Data Management, AM=Analysis Methods
Course Number Title Description Offering AA/AM/DM
HS 851: Scientific Methods for Health Sciences: Applied Inference Applied inference methods in studies involving multiple variables. Specific methods that will be discussed include linear regression, analysis of variance, and different regression models. This course emphasizes the scientific formulation, analytical modeling, computational tools and applied statistical inference in diverse health-sciences problems. Hands-on data interrogation, modeling approaches, rigorous interpretation and inference. Fall


BIOINF 527: Introduction to Bioinformatics and Computational Biology Intensive MS/PhD course for Bioinformatics students on Big Data projects Winter


BIOINF 699: Biomedical Signal and Image Processing Masters/Ph.D. level course for students in Computational Medicine and Bioinformatics, and Engineering Fall, alt. years

BIOINF 699: Machine Learning for Systems Biology and Clinical Informatics Masters/Ph.D. level course for students in Computational Medicine and Bioinformatics, and Engineering Fall, alt. years

EECS 484: Database Management Systems Undergraduate level course for Computer Science and Engineering students Winter


EECS 551: Matrix Methods for Signal Processing, Data Analysis & Machine Learning Theory and application of matrix methods to signal processing, data analysis and machine learning. Theoretical topics include subspaces, eigenvalue and singular value decomposition, projection theorem, constrained, regularized and unconstrained least squares techniques and iterative algorithms. Applications include image deblurring, ranking of webpages, image segmentation and compression, social networks, circuit analysis, recommender systems and handwritten digit recognition. Note: EECS 453 and EECS 551 share the same lectures but have different recitations.  AA/AM
EECS 498: Advanced Signal Processing & Applications  AA
EECS 453: Applied Data Analysis Applied matrix algorithms for signal processing, data analysis and machine learning  AA/AM
EECS 684: Current Topics in Databases Masters/Ph.D. level course for students in Computer Science and Electrical Engineering Winter


EECS 445/545: Introduction to Machine Learning Programming-focused introduction to Machine Learning Annually AA/AM
EECS 598: Unsupervised Feature Learning Principles and progress in unsupervised feature learning algorithms for machine learning applications. Topics include clustering, sparse coding, autoencoders, restricted Boltzmann machines, and deep belief networks Biannually AA/AM/DM
EECS 505: Computational Data Science The Computational Data Science course offers an in-depth introduction to computational methods in data science for identifying, fitting, extracting and making sense of patterns in large data sets. Biannually AA/AM/DM
EECS 598-007: Practical Machine Learning Basics of practical machine learning and data mining while focusing on real-world applications. ML in the fields of sports analytics, data-driven medicine, finance, and personalized education Annually AA/AM/DM
EECS 598-006: Probabilistic Analysis of Large Scale Systems Emerging topics in epidemics and diffusions, queueing systems, analysis of randomized algorithms, Bayesian information cascades, network analysis and random graphs. Annually AA/DM
EECS 598-005: Theoretical Machine Learning Understanding the mathematical underpinnings of machine learning. Fall AA/AM
EPID 633: Introduction to Mathematical Modeling Introduction To Mathematical Modeling In Epidemiology And Public Health Fall AA
IOE 511/MATH 562: Continuous Optimization Methods Introduction to concepts and methods of constrained and unconstrained continuous nonlinear optimization. The course revolves around three issues in optimization: building optimization models of problems, characterization of their solutions, and algorithms for finding these solutions. A list of topics for each lecture is compiled on the course web site as the semester progresses. The topics covered include: introduction to optimization, optimality conditions for unconstrained problems, algorithms for unconstrained problems (steepest descent, Newton’s, etc.) and analysis of their convergence, optimality conditions and constraint qualifications for constrained problems, convexity and its role in optimization, algorithms for constrained problems (SQP, barrier and penalty methods, etc.), conic optimization problems, their applications, and methods for their solution. Winter AA/AM
LHS 712: Natural Language Processing for Health Students in this course will learn advanced techniques to parse and collate information from text-rich health documents such as electronic health records, clinical notes, and peer-reviewed medical literature. In this elective, students will be able to delve deeper into challenges in recognizing medical entities in text documents, extracting clinical information, addressing ambiguity and polysemy, and building searchable interfaces to efficiently and effectively query and retrieve relevant patient data. Students will develop tools and techniques to analyze new genres of health information, and build resources to help in these tasks. Students will also participate in a semester-long project on addressing specific natural language processing challenges in real-life health data sets. Winter AA/AM/DM
SI 671/SI 721: Data Mining – Methods and Applications Advanced master's-level/doctoral course for students in information sciences Fall


SI 649: Information visualization  Image models, multidimensional and multivariate data, design principles for visualization, hierarchical, network, textual and collaborative visualization, and visualization pipeline, data processing for visualization, visual representations, visualization system interaction design, and impact of perception On Demand DM
Math 651: Topics in Applied Mathematics Sparse analysis, compressive sensing and data modeling Winter


Math 471: Introduction to Numerical Methods Numerical methods for solving practical scientific problems involving accuracy, stability, efficiency and convergence Fall


Stats 500: Statistical Learning I – Regression Linear Regression Models: definition, fitting, Gauss-Markov theorem, inference, interpretation of results, meaning of regression coefficients, diagnostics, influential observations, multi-collinearity, lack of fit, robust procedures, transformations, variable selection, ridge regression, principal components regression, ANOVA and analysis of covariance. Introduction to generalized linear models: general framework, binomial data, logistic regression, Poisson regression. The objective is to learn what methods are available and, more importantly, when they should be applied. Winter AM
Stats 605: Advanced Topics in Modeling and Data Analysis Topics include: (1) classification and machine learning, including support vector machines, recursive partitioning, and ensemble methods; (2) methods for analyzing sets of curves, surfaces and images, including functional data analysis, wavelets, independent component analysis, and random field models; (3) modern regression, including splines and generalized additive models; (4) methods for analyzing structured dependent data, including mixed effects models, hierarchical models, graphical models, and Bayesian networks; and (5) clustering, detection, and dimension reduction methods, including manifold learning, spectral clustering, and bump hunting. Annually AA/DM/AM
Biostats 696: Spatial Statistics Theory and methods of spatial and spatio-temporal statistics, modeling and inference on spatial processes within a geostatistical and a hierarchical Bayesian framework Annually AM
Biostats 615: Statistical Computing Practical understanding of computational aspects in implementing statistical methods. Annually AA/DM/AM
TO 628: Advanced Big Data Analytics   With the ongoing explosion in availability of large and complex business datasets (“Big Data”), Machine Learning (“ML”) algorithms are increasingly being used to automate the analytics process and better manage the volume, velocity and variety of Big Data. This course teaches how to apply the growing body of ML algorithms to various Big Data sources in a business context. AA/DM/AM
BME 499: Artificial Intelligence in Biomedical Engineering (AI in BME) The course will introduce students to biomedical applications of Machine Learning algorithms. It will lay the foundation for analysis of any big biomedical data set. This course will provide an overview of a wide range of AI and machine-learning tools, biomedical data sets (imaging, omics, health records) and diseases (cancer, cardiovascular-, infectious- and brain diseases). AA

Domain-Specific Data Science Application Courses

AA=Algorithms and Applications, DM=Data Management, AM=Analysis Methods
Course Number Title Description Offering AA/AM/DM

ChE 696 / MSE 593: Applied Data Science for Engineers

This graduate-level course prepares engineering students to use data science tools during their master’s and PhD thesis research as well as after graduation in industry, government, and academia. This course will familiarize students with the principles of modern data science techniques in the context of chemical engineering, materials science, and research. The central focus is on data science tools used in engineering and science applications, such as data curation, supervised and unsupervised machine learning, and data mining. Algorithms and frameworks covered include the perceptron, dimensionality reduction tools, kernel ridge regression, neural networks, subgroup discovery, compressed sensing, random forests, support vector machines, and causal inference, among others. Homework exercises include hands-on practice of using data science to solve science and engineering problems. Students will be responsible for a data science project on a topic of interest. Winter AA
Physics 514: Computational Physics Computational Physics graduate seminar, including an Introduction to Python mini-course at the start of the semester Annually AA
SI 506: Programming I Introduction to programming with a focus on applications informatics. Covers the fundamental elements of a modern programming language and how to access data on the internet. Explores how humans and technology complement one another, including techniques used to coordinate groups of people working together on software development. Winter


SI 671/SI 721: Data Mining – Methods and Applications Advanced master's-level/doctoral course for students in information sciences Fall


SI 649: Information visualization  Image models, multidimensional and multivariate data, design principles for visualization, hierarchical, network, textual and collaborative visualization, and visualization pipeline, data processing for visualization, visual representations, visualization system interaction design, and impact of perception On Demand DM
Complex Systems 535: Network theory  Social networks, the world wide web, information and biological networks; methods and computer algorithms for the analysis and interpretation of network data; graph theory; models of networks including random graphs and preferential attachment models; spectral methods and random matrix theory; maximum likelihood methods; percolation theory; network search On Demand AA
SI 601: Data manipulation  Data harvesting, processing, aggregation, design and evaluation of analytic solutions, multisource data, automated gathering of data, parsing, summarization, and exploratory data analytics (use of Python modules) On Demand DM
SI 618: Exploratory data analysis  Converting messy data (using R), connecting to cloud databases, computation and visualization, summary statistics, ‘grammar of graphics’, graphical aesthetics, data stratification, factor analysis of categorical data On Demand AM
Coursera: Social network analysis  Social network analysis, theoretical models and computational tools, interpretation of social and information networks. On Demand (MOOC) AA/AM/DM
SurvMeth 625: Applied Sampling Methods of Survey Sampling is a moderately advanced course in applied statistics, with an emphasis on the practical problems of sample design, which provides students with an understanding of principles and practice in skills required to select subjects and analyze sample data. Topics covered include stratified, clustered, systematic multi-stage sample designs; unequal probabilities and probabilities proportional to size, area, and telephone sampling; ratio means; sampling errors; frame problems; cost factors; and practical designs and procedures. Annually, Winter AA/DM/AM
SurvMeth 627: Fundamentals of Computing and Data Display The first part of this course provides an introduction to web scraping and APIs for gathering data from the web and then discusses how to store and manage (big) data from diverse sources efficiently. The second part of the course demonstrates techniques for exploring and finding patterns in (non-standard) data, with a focus on data visualization. Tools for reproducible research will be introduced to facilitate transparent and collaborative programming. The course focuses on R as the primary computing environment, with excursions into SQL and Big Data processing tools. Annually, Fall AA/DM/AM
SurvMeth 685: Statistical Methods I This is the first in a two-term sequence in applied statistical methods covering topics such as regression, analysis of variance, categorical data, and survival analysis. Annually, Fall AA/DM/AM
SurvMeth 686: Statistical Methods II This builds on the introduction to linear models and data analysis provided in Statistical Methods I. Topics include: Multivariate Analysis techniques (Hotelling’s T-square, Principal Components, Factor Analysis, Profile Analysis, MANOVA); Categorical Data Analysis (contingency tables, measurement of association, log-linear models for counts, logistic and polytomous regression, GEE); and Lifetime Data Analysis (Kaplan-Meier plots, logrank test, Cox regression). Annually, Winter AA/DM/AM
SurvMeth 687 (or 746): Applications of Statistical Modeling Advanced Statistical Modeling, designed for students on both the social science and statistical tracks for the two programs in survey methodology, will provide students with exposure to applications of more advanced statistical modeling tools for both substantive and methodological investigations that are not fully covered in other MPSM or JPSM courses. Modeling techniques to be covered include multilevel modeling (with an application to methodological studies of interviewer effects), structural equation modeling (with an application of latent class models to methodological studies of measurement error), classification trees (with an application to prediction of response propensity), and alternative models for longitudinal data (with an application to panel survey data from the Health and Retirement Study). Discussions and examples of each modeling technique will be supplemented with methods for appropriately handling complex sample designs when fitting the models. The class will focus on essential concepts, practical applications, and software, rather than extensive theoretical discussions. Annually, Fall AA/DM/AM
TO 618: Applied Business Analytics and Decisions  Objective: The strategic and tactical decision problems that firms face have become too complex to solve by naive intuition and heuristics. Increasingly, making business decisions requires “intelligent” and “data oriented” decisions, aided by decision support tools and analytics. The ability to make such decisions and use available tools is critical for both managers and firms. In recent years, the toolbox of business analytics has grown. These tools provide the ability to make decisions supported by data and models. This course prepares students to model and manage business decisions with data analytics and decision models.  AA/DM/AM