Approved Courses

Students must complete 9 credit hours of approved courses to earn the Graduate Data Science Certificate. At least 6 of these credits must come from the core Modeling and Technology courses; the remaining credits must come from courses outside the student’s core discipline.

The tables below show, respectively, the core course options and the complete list of courses in the Graduate Data Science Certificate Program.

Upon request, other courses may be allowed based on course availability, program demands and student needs.

Core Data Science Courses:

Legend: AA=Algorithms and Applications, DM=Data Management, AM=Analysis Methods.

Course Number Title Description Type
EECS 584: Advanced Database Management Systems Masters/Ph.D. level course for students in Computer Science, Electrical Engineering, and Information School DM
EECS 545: Machine Learning Foundations of machine learning, mathematical derivation and implementation of the algorithms, and their applications DM/AM
EECS 453: Applied Data Analysis Applied matrix algorithms for signal processing, data analysis and machine learning DM/AA
Math 571: Numerical Linear Algebra Numerical methods for solving linear algebra problems (linear systems and eigenvalue problems), matrix decompositions, and convex optimization AM
Stats 415: Data Mining and Statistical Learning This course covers the principles of data mining, exploratory analysis and visualization of complex data sets, and predictive modeling. The presentation balances statistical concepts (such as over-fitting data, and interpreting results) and computational issues. Students are exposed to algorithms, computations, and hands-on data analysis in the weekly discussion sessions. AM
Stats 503: Applied Multivariate Analysis The course covers methods for modern multivariate data analysis and statistical learning, including theoretical foundations and practical applications. Topics include principal component analysis and other dimension reduction techniques, classification (discriminant analysis, decision trees, nearest neighbor classifiers, logistic regression, support vector machines, ensemble methods), clustering (agglomerative and partitioning methods, model-based methods), and categorical data analysis. There is a significant data analysis component. AM/AA
HS 853: Scientific Methods for Health Sciences: Special Topics Modern analytical methods for advanced healthcare research. Specific focus on innovative modeling, computational, analytic and visualization techniques to address concrete driving biomedical and healthcare applications. The course covers the 5 dimensions of Big-Data (volume, complexity, time/scale, source and management). AM/AA
BIOINF 585: Machine Learning for Systems Biology & Clinical Informatics This course focuses on machine learning methods and their applications in biomedical sciences. Topics include: (1) data management solutions for Big Data applications; (2) feature extraction and reduction methods; (3) clustering and classification methods; (4) testing and validation of models; (5) applications in systems biology and clinical informatics. AA/DM
BIOINF 580: Biomedical Signal and Image Analysis The course covers some fundamental methods in biomedical data analysis. Topics include: (1) database management in biomedical applications; (2) transforms and feature extraction, Fourier transform, wavelet transform, fundamentals of information theory, and statistical methods used in signal processing; (3) image enhancement, image segmentation, and image feature extraction methods; (4) brief introduction to natural language processing; (5) introduction to fundamental techniques in clustering and classification; (6) applications in medicine and biology. AA/AM
SI 608: Networks: Theory and Application Advanced course for master’s students of information and health informatics AM/DM
SI 650/EECS 549: Information Retrieval Advanced course for graduate students of information, health informatics, and CS DM/AA
Biostats 646: High Throughput Molecular Genetic and Epigenetic Data Analysis Data analysis in experimental molecular biology, gene expression, genome sequence and epigenomics data AM/AA
 TO 640: Big Data Management: Tools and Techniques This course teaches the basic tools in acquisition, management, and visualization of large data sets. Students will learn how to: store, manage, and query databases via SQL; quickly construct insightful visualizations of multi-attribute data using Tableau; use the Python programming language to manage data as well as connect to APIs to efficiently acquire public data. After taking this course, students will be able to efficiently construct large data sets that source underlying data from multiple sources, and form initial hypotheses based on visualization.  AA/DM
Psych 613: Advanced Statistical Methods I Covers the “general linear model” with particular emphasis on exploratory data analysis, contrast analysis, residual analysis, and Euclidean distance. The topics covered over the two semesters include analysis of variance, regression, categorical data analysis, principal components analysis, multidimensional scaling, cluster analysis, multivariate ANOVA, canonical correlation, and structural equations modeling. AM/AA
Psych 614: Advanced Statistical Methods II Topics covered in this course include multidimensional scaling, cluster analysis, principal components, factor analysis, multivariate analysis of variance and canonical correlation. A brief introduction to reliability theory, structural equations modeling and hierarchical linear modeling will also be provided. AM/AA
LHS 610: Learning from Health Data: Applied Data Science in Health Topics include rigid vs. flexible models, decision trees, k-nearest neighbors, perceptron, support vector machines, and an introduction to caret (training and predicting). AA/DM
Biostat 521: Applied Biostatistics I Fundamental statistical concepts related to the practice of public health: descriptive statistics; probability; sampling; statistical distributions; estimation; hypothesis testing; chi-square tests; simple and multiple linear regression; one-way ANOVA. Taught at a more advanced mathematical level than Biostat 503. Use of the computer in statistical analysis. AM/AA
 Biostat 522: Applied Biostatistics II A second course in applied biostatistical methods and data analysis. Concepts of data analysis and experimental design for health-related studies. Emphasis on categorical data analysis, multiple regression, analysis of variance and covariance.  AM/DM
Biostat 650: Applied Statistics I: Linear Regression Graphical methods, simple and multiple linear regression; simple, partial and multiple correlation; estimation; hypothesis testing, model building and diagnosis; introduction to nonparametric regression; introduction to smoothing methods (e.g., lowess). The course will include applications to real data. AM/AA

 

Complete List of Data Science Courses

AA=Algorithms and Applications, DM=Data Management, AM=Analysis Methods
Course Number Title Description Offering Type (AA/AM/DM)
HS 851: Scientific Methods for Health Sciences: Applied Inference Applied inference methods in studies involving multiple variables. Specific methods that will be discussed include linear regression, analysis of variance, and different regression models. This course emphasizes the scientific formulation, analytical modeling, computational tools and applied statistical inference in diverse health-sciences problems. Hands-on data interrogation, modeling approaches, rigorous interpretation and inference. Fall annually AM
HS 853: Scientific Methods for Health Sciences: Special Topics Modern analytical methods for advanced healthcare research. Specific focus on innovative modeling, computational, analytic and visualization techniques to address concrete driving biomedical and healthcare applications. The course covers the 5 dimensions of Big-Data (volume, complexity, time/scale, source and management). Fall annually AM/AA
BIOINF 585: Machine Learning for Systems Biology & Clinical Informatics This course focuses on machine learning methods and their applications in biomedical sciences. Topics include: (1) data management solutions for Big Data applications; (2) feature extraction and reduction methods; (3) clustering and classification methods; (4) testing and validation of models; (5) applications in systems biology and clinical informatics. Fall annually AA/DM
BIOINF 527: Introduction to Bioinformatics and Computational Biology Intensive MS/PhD course for Bioinformatics students on Big Data projects Winter annually AA
BIOINF 699: Biomedical Signal and Image Processing Masters/Ph.D. level course for students in Computational Medicine and Bioinformatics, and Engineering Fall alt. years AA
BIOINF 699: Machine Learning for Systems Biology and Clinical Informatics Masters/Ph.D. level course for students in Computational Medicine and Bioinformatics, and Engineering Fall alt. years AA/AM
EECS 453: Applied Data Analysis Applied matrix algorithms for signal processing, data analysis and machine learning AA/AM
EECS 484: Database Management Systems Undergraduate-level course for Computer Science and Engineering students Winter annually DM
EECS 551: Matrix Methods for Signal Processing, Data Analysis & Machine Learning Theory and application of matrix methods to signal processing, data analysis and machine learning. Theoretical topics include subspaces, eigenvalue and singular value decomposition, projection theorem, constrained, regularized and unconstrained least squares techniques and iterative algorithms. Applications include image deblurring, ranking of webpages, image segmentation and compression, social networks, circuit analysis, recommender systems and handwritten digit recognition. Note: EECS 453 and EECS 551 share the same lectures but have different recitations. AA/AM
EECS 498: Advanced Signal Processing & Applications AA
EECS 584: Advanced Database Management Systems Masters/Ph.D. level course for students in Computer Science, Electrical Engineering, and Information School Fall annually DM
EECS 684: Current Topics in Databases Masters/Ph.D. level course for students in Computer Science and Electrical Engineering Winter biennially AA/DM/AM
EECS 445/545: Introduction to Machine Learning Programming-focused introduction to Machine Learning Annually AA/AM
EECS 545: Machine Learning Foundations of machine learning, mathematical derivation and implementation of the algorithms, and their applications Biannually AA/AM
EECS 598: Unsupervised Feature Learning Principles and progress in unsupervised feature learning algorithms for machine learning applications. Topics include clustering, sparse coding, autoencoders, restricted Boltzmann machines, and deep belief networks Biannually AA/AM/DM
EECS 598-007: Practical Machine Learning Basics of practical machine learning and data mining while focusing on real-world applications. ML in the fields of sports analytics, data-driven medicine, finance, and personalized education Annually AA/AM/DM
EECS 598-006: Probabilistic Analysis of Large Scale Systems Emerging topics in epidemics and diffusions, queueing systems, analysis of randomized algorithms, Bayesian information cascades, network analysis and random graphs. Annually AA/DM
EECS 598-005: Theoretical Machine Learning Understanding the mathematical underpinnings of machine learning. Fall AA/AM
IOE 511/MATH 562: Continuous Optimization Methods Introduction to concepts and methods of constrained and unconstrained continuous nonlinear optimization. The course revolves around three issues in optimization: building optimization models of problems, characterization of their solutions, and algorithms for finding these solutions. A list of topics for each lecture is compiled on the course website as the semester progresses. Topics include: introduction to optimization, optimality conditions for unconstrained problems, algorithms for unconstrained problems (steepest descent, Newton’s, etc.) and analysis of their convergence, optimality conditions and constraint qualifications for constrained problems, convexity and its role in optimization, algorithms for constrained problems (SQP, barrier and penalty methods, etc.), conic optimization problems, their applications, and methods for their solution. Winter AA/AM
LHS 712: Natural Language Processing for Health Students in this course will learn advanced techniques to parse and collate information from text-rich health documents such as electronic health records, clinical notes, and peer-reviewed medical literature. In this elective, students will be able to delve deeper into challenges in recognizing medical entities in text documents, extracting clinical information, addressing ambiguity and polysemy, and building searchable interfaces to efficiently and effectively query and retrieve relevant patient data. Students will develop tools and techniques to analyze new genres of health information, and build resources to help in these tasks. Students will also participate in a semester-long project on addressing specific natural language processing challenges in real-life health data sets. Winter AA/AM/DM
Physics 514: Computational Physics Computational Physics graduate seminar, including an Introduction to Python mini-course at the start of the semester Annually AA
Psych 613: Advanced Statistical Methods I First doctoral-level course for Psychology, with students from Public Health, Education, Business, Communication Studies, Kinesiology, and Neuroscience Fall annually AM
Psych 614: Advanced Statistical Methods II Second doctoral-level course, same enrollment as Psych 613 Winter annually AM
SI 608: Networks: Theory and Application Advanced course for master’s students of information and health informatics Winter annually AA/DM
SI 650/EECS 549: Information Retrieval Advanced course for graduate students of information, health informatics, and CS Winter annually AA/DM/AM
SI 671/SI 721: Data Mining: Methods and Applications Advanced master’s-level/doctoral course for students in information sciences Fall annually AA/AM
SI 649: Information Visualization Image models, multidimensional and multivariate data, design principles for visualization, hierarchical, network, textual and collaborative visualization, visualization pipeline, data processing for visualization, visual representations, visualization system interaction design, and impact of perception On Demand DM
Complex Systems 535: Network Theory Social networks, the world wide web, information and biological networks; methods and computer algorithms for the analysis and interpretation of network data; graph theory; models of networks including random graphs and preferential attachment models; spectral methods and random matrix theory; maximum likelihood methods; percolation theory; network search On Demand AA
SI 601: Data Manipulation Data harvesting, processing, aggregation, design and evaluation of analytic solutions, multisource data, automated gathering of data, parsing, summarization, and exploratory data analytics (use of Python modules) On Demand DM
SI 618: Exploratory Data Analysis Converting messy data (using R), connecting to cloud databases, computation and visualization, summary statistics, ‘grammar of graphics’, graphical aesthetics, data stratification, factor analysis of categorical data On Demand AM
Coursera: Social Network Analysis Social network analysis, theoretical models and computational tools, interpretation of social and information networks. On Demand (MOOC) AA/AM/DM
Math 651: Topics in Applied Mathematics Sparse analysis, compressive sensing and data modeling Winter annually AA/AM
Math 471: Introduction to Numerical Methods Numerical methods for solving practical scientific problems involving accuracy, stability, efficiency and convergence Fall annually AA/AM
Math 571: Numerical Linear Algebra Numerical methods for solving linear algebra problems (linear systems and eigenvalue problems), matrix decompositions, and convex optimization Fall annually AA/AM
Stats 415: Data Mining and Statistical Learning This course covers the principles of data mining, exploratory analysis and visualization of complex data sets, and predictive modeling. The presentation balances statistical concepts (such as over-fitting data, and interpreting results) and computational issues. Students are exposed to algorithms, computations, and hands-on data analysis in the weekly discussion sessions. Annually AA/AM
Stats 503: Applied Multivariate Analysis The course covers methods for modern multivariate data analysis and statistical learning, including theoretical foundations and practical applications. Topics include principal component analysis and other dimension reduction techniques, classification (discriminant analysis, decision trees, nearest neighbor classifiers, logistic regression, support vector machines, ensemble methods), clustering (agglomerative and partitioning methods, model-based methods), and categorical data analysis. There is a significant data analysis component. Annually AM
Stats 605: Advanced Topics in Modeling and Data Analysis Topics include: (1) classification and machine learning, including support vector machines, recursive partitioning, and ensemble methods; (2) methods for analyzing sets of curves, surfaces and images, including functional data analysis, wavelets, independent component analysis, and random field models; (3) modern regression, including splines and generalized additive models, (4) methods for analyzing structured dependent data, including mixed effects models, hierarchical models, graphical models, and Bayesian networks; and (5) clustering, detection, and dimension reduction methods, including manifold learning, spectral clustering, and bump hunting Annually AA/DM/AM
Biostats 646: High Throughput Molecular Genetic and Epigenetic Data Analysis Data analysis in experimental molecular biology, gene expression, genome sequence and epigenomics data Annually AA/AM
Biostats 696: Spatial Statistics Theory and methods of spatial and spatio-temporal statistics, modeling and inference on spatial processes within a geostatistical and a hierarchical Bayesian framework Annually AM
Biostats 615: Statistical Computing Practical understanding of computational aspects in implementing statistical methods. Annually AA/DM/AM
TO 618: Applied Business Analytics and Decisions The strategic and tactical decision problems that firms face have become too complex to solve by naive intuition and heuristics. Increasingly, business decisions must be “intelligent” and “data oriented”, aided by decision support tools and analytics. The ability to make such decisions and use available tools is critical for both managers and firms. In recent years, the toolbox of business analytics has grown. These tools provide the ability to make decisions supported by data and models. This course prepares students to model and manage business decisions with data analytics and decision models. AA/DM/AM
TO 628: Advanced Big Data Analytics With the ongoing explosion in availability of large and complex business datasets (“Big Data”), Machine Learning (“ML”) algorithms are increasingly being used to automate the analytics process and better manage the volume, velocity and variety of Big Data. This course teaches how to apply the growing body of ML algorithms to various Big Data sources in a business context. AA/DM/AM