HDDA: DataSifter: statistical obfuscation of electronic health records and other sensitive datasets

By | Research

Title

HDDA: DataSifter: statistical obfuscation of electronic health records and other sensitive datasets

Publication
Journal of Statistical Computation and Simulation

Date

11 Nov. 2018

DOI
https://doi.org/10.1080/00949655.2018.1545228

Authors
Simeone Marino, Nina Zhou, Yi Zhao, Lu Wang, Qiucheng Wu & Ivo D. Dinov (2019)

Abstract
There are no practical and effective mechanisms for sharing high-dimensional data that include sensitive information, in fields such as health, financial intelligence, or socioeconomics, without either compromising the utility of the data or exposing private personal or secure organizational information. Excessive scrambling or encoding of the information makes it less useful for modelling or analytical processing. Insufficient preprocessing may compromise sensitive information and introduce a substantial risk of re-identification of individuals by various stratification techniques. To address this problem, we developed a novel statistical obfuscation method (DataSifter) for on-the-fly de-identification of structured and unstructured sensitive high-dimensional data, such as clinical data from electronic health records (EHR). DataSifter provides complete administrative control over the balance between the risk of data re-identification and the preservation of the data information. Simulation results suggest that DataSifter can provide privacy protection while maintaining data utility for different types of outcomes of interest. The application of DataSifter to a large autism dataset provides a realistic demonstration of its promise for practical applications.

ACNN Big Data Neuroscience Workshop

BIG DATA NEUROSCIENCE WORKSHOP

Organized by Advanced Computational Neuroscience Network (ACNN)

Registration

Come join the ACNN Big Data Neuroscience Workshop and enjoy:

❖ Keynotes and Invited Talks
❖ Data Sharing Initiatives
❖ Demonstration of Neuroscience Computational Platforms
❖ Reproducibility Best Practices
❖ Learning Environment for Students and Early-Career Researchers

Students, trainees, fellows, and junior investigators from the Midwest, as well as outside academic institutions and industry partners, are invited.

Environmental Data Commons Workshop — June 9, Chicago

By | General Interest, News

The Center for Data Intensive Science at the University of Chicago is hosting a one-day workshop on environmental data commons and data sharing in Chicago on June 9, 2016.

There will be sessions on the environmental commons, services for environmental commons, environmental data commons applications, the NOAA Big Data Alliance, and interoperability of environmental commons, clouds, and repositories.

To register and for more information including workshop location, agenda, and options for lodging, please visit the event website.