Study on bias in learning analytics earns Brooks Best Full Research Paper Award at LAK conference

By | General Interest, Happenings, News, Research

A paper co-authored by University of Michigan School of Information research assistant professor Christopher Brooks received the Best Full Research Paper Award at the International Conference on Learning Analytics & Knowledge (LAK) in Tempe, Arizona. The award was announced on the final day of the conference, March 7, 2019.

The paper, “Evaluating the Fairness of Predictive Student Models Through Slicing Analysis,” describes a tool designed to test the bias in algorithms used to predict student success.

The goal of the paper, Brooks says, was to evaluate whether the algorithms used to predict student success in massive open online courses (MOOCs) were skewed by the gender makeup of the classes.

“We were able to find that some have more bias than others do,” says Brooks. “First we were able to show that different MOOCs tend to have different bias in gender representation inside of the MOOCs.”
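The general idea behind slicing can be illustrated with a minimal sketch: split a model's predictions by a demographic attribute and compare performance across the slices. The article does not describe the metric the paper's tool uses, so the plain accuracy comparison below, the `accuracy_by_slice` helper, and the toy records are all assumptions made for illustration only.

```python
from collections import defaultdict


def accuracy_by_slice(records):
    """Return {group: accuracy} for (group, y_true, y_pred) records.

    Hypothetical helper: accuracy stands in for whatever metric a real
    fairness audit would use.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, y_true, y_pred in records:
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {group: correct[group] / total[group] for group in total}


# Made-up toy predictions (1 = predicted/actual success, 0 = not).
records = [
    ("female", 1, 1), ("female", 0, 1), ("female", 1, 0), ("female", 0, 0),
    ("male", 1, 1), ("male", 0, 0), ("male", 1, 1), ("male", 0, 1),
]

per_slice = accuracy_by_slice(records)
# The gap between the best and worst slice is one crude signal of skew.
gap = max(per_slice.values()) - min(per_slice.values())
print(per_slice, gap)
```

A large gap between slices would suggest that the model serves one group better than another; a real audit would also need to account for how many students fall in each slice.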

Read more…

Women in HPC launches mentoring program

By | Educational, General Interest, HPC, News

Women in High Performance Computing (WHPC) has launched a year-round mentoring program that offers a framework for women to give or receive mentorship in high performance computing. Read more about the program at https://womeninhpc.org/2019/03/mentoring-programme-2019/

WHPC was created with the vision to encourage women to participate in the HPC community by providing fellowship, education, and support to women and the organizations that employ them. Through collaboration and networking, WHPC strives to bring together women in HPC and technical computing while encouraging women to engage in outreach activities and improve the visibility of inspirational role models.

The University of Michigan has been recognized as one of the first Chapters in the new Women in High Performance Computing (WHPC) Pilot Program. Read more about U-M’s chapter at https://arc.umich.edu/whpc/

HDDA: DataSifter: statistical obfuscation of electronic health records and other sensitive datasets

By | Research

Publication
Journal of Statistical Computation and Simulation

Date

11 Nov. 2018

DOI
https://doi.org/10.1080/00949655.2018.1545228

Authors
Simeone Marino, Nina Zhou, Yi Zhao, Lu Wang, Qiucheng Wu & Ivo D. Dinov (2019)

Abstract
There are no practical and effective mechanisms to share high-dimensional data including sensitive information in various fields like health financial intelligence or socioeconomics without compromising either the utility of the data or exposing private personal or secure organizational information. Excessive scrambling or encoding of the information makes it less useful for modelling or analytical processing. Insufficient preprocessing may compromise sensitive information and introduce a substantial risk for re-identification of individuals by various stratification techniques. To address this problem, we developed a novel statistical obfuscation method (DataSifter) for on-the-fly de-identification of structured and unstructured sensitive high-dimensional data such as clinical data from electronic health records (EHR). DataSifter provides complete administrative control over the balance between risk of data re-identification and preservation of the data information. Simulation results suggest that DataSifter can provide privacy protection while maintaining data utility for different types of outcomes of interest. The application of DataSifter on a large autism dataset provides a realistic demonstration of its promise for practical applications.

Balzano wins NSF CAREER award for research on machine learning and big data involving physical, biological and social phenomena

By | General Interest, Happenings, News, Research

Prof. Laura Balzano received an NSF CAREER award to support research that aims to improve the use of machine learning in big data problems involving elaborate physical, biological, and social phenomena. The project, called “Robust, Interpretable, and Efficient Unsupervised Learning with K-set Clustering,” is expected to have broad applicability in data science.

Modern machine learning techniques aim to design models and algorithms that allow computers to learn efficiently from vast amounts of previously unexplored data, says Balzano. Typically, the data is broken down in one of two ways. Dimensionality reduction uses an algorithm to reduce high-dimensional data to the low-dimensional structure that is most relevant to the problem being solved. Clustering, on the other hand, attempts to group pieces of data into meaningful clusters of information.
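As a rough illustration of these two approaches used one after the other, the sketch below first reduces three-dimensional points to a single coordinate along their leading principal direction, then groups the resulting values with a two-cluster version of Lloyd's k-means algorithm. This is the conventional two-step pipeline, not the K-set clustering method the project proposes; the function names and the toy data are assumptions made for this example.

```python
import math


def top_principal_direction(rows, iterations=100):
    """Estimate the leading principal direction by power iteration
    on the sample covariance matrix. Returns (column means, unit vector)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v


def project(rows, means, direction):
    """Dimensionality reduction: one coordinate per point along `direction`."""
    return [sum((r[j] - means[j]) * direction[j] for j in range(len(direction)))
            for r in rows]


def two_means_1d(values, iterations=20):
    """Clustering: Lloyd's algorithm with k=2 on scalars,
    with the initial centers set to the smallest and largest value."""
    centers = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
                  for x in values]
        for k in (0, 1):
            members = [x for x, lab in zip(values, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels, centers


# Made-up toy data: two tight groups that differ mainly in the first two axes.
points = [
    (1.0, 2.1, 0.5), (1.2, 1.9, 0.4), (0.9, 2.0, 0.6),
    (5.0, 6.1, 0.5), (5.2, 5.9, 0.4), (4.9, 6.0, 0.6),
]

means, direction = top_principal_direction(points)
scores = project(points, means, direction)   # 3-D points reduced to 1-D
labels, centers = two_means_1d(scores)       # 1-D values grouped into 2 clusters
print(labels)
```

Running the steps in sequence like this works on clean data, but the reduction step knows nothing about the clusters it feeds.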

However, explains Balzano, “as increasingly higher-dimensional data are collected about progressively more elaborate physical, biological, and social phenomena, algorithms that aim at both dimensionality reduction and clustering are often highly applicable, yet hard to find.”

Balzano plans to develop techniques that combine the two key approaches used in machine learning to decipher data, while being applicable to data that is considered “messy.” Messy data is data that has missing elements, may be somewhat corrupted, or is filled with heterogeneous information – in other words, it describes most data sets in today’s world.

Balzano is an affiliated faculty member of both the Michigan Institute for Data Science (MIDAS) and the Michigan Institute for Computational Discovery and Engineering (MICDE). She is part of a MIDAS-supported research team working on single-cell genomic data analysis.

Read more about the NSF CAREER award…