New Data Science Computing Platform Available to U-M Researchers

By | General Interest, Happenings, HPC, News | No Comments

Advanced Research Computing – Technology Services (ARC-TS) is pleased to announce an expanded data science computing platform, giving all U-M researchers new capabilities to host structured and unstructured databases, and to ingest, store, query and analyze large datasets.

The new platform features a flexible, robust and scalable database environment, and a set of data pipeline tools that can ingest and process large amounts of data from sensors, mobile devices and wearables, and other sources of streaming data. The platform leverages the advanced virtualization capabilities of ARC-TS’s Yottabyte Research Cloud (YBRC) infrastructure, and is supported by U-M’s Data Science Initiative launched in 2015. YBRC was created through a partnership between Yottabyte and ARC-TS announced last fall.

The following functionalities are immediately available:

  • Structured databases:  MySQL/MariaDB, and PostgreSQL.
  • Unstructured databases: Cassandra, MongoDB, InfluxDB, Grafana, and ElasticSearch.
  • Data ingestion: Redis, Kafka, RabbitMQ.
  • Data processing: Apache Flink, Apache Storm, Node.js and Apache NiFi.

Other types of databases can be created upon request.

These tools are offered to all researchers at the University of Michigan free of charge, provided that certain usage restrictions are not exceeded. Large-scale users who outgrow the no-cost allotment may purchase additional YBRC resources. All interested parties should contact hpc-support@umich.edu.

At this time, the YBRC platform only accepts unrestricted data. The platform is expected to accommodate restricted data within the next few months.

ARC-TS also operates a separate data science computing cluster available for researchers using the latest Hadoop components. This cluster also will be expanded in the near future.

ARC-TS seeks input on next generation HPC cluster

By | Events, Flux, General Interest, Happenings, HPC, News | No Comments

The University of Michigan is beginning the process of building our next generation HPC platform, “Big House.”  Flux, the shared HPC cluster, has reached the end of its useful life. Flux has served us well for more than five years, but as we move forward with replacement, we want to make sure we’re meeting the needs of the research community.

ARC-TS will be holding a series of town halls to take input from faculty and researchers on the next HPC platform to be built by the University.  These town halls are open to anyone and will be held at:

  • College of Engineering, Johnson Room, Tuesday, June 20th, 9:00a – 10:00a
  • NCRC Bldg 300, Room 376, Wednesday, June 21st, 11:00a – 12:00p
  • LSA #2001, Tuesday, June 27th, 10:00a – 11:00a
  • 3114 Med Sci I, Wednesday, June 28th, 2:00p – 3:00p

Your input will help to ensure that U-M is on course for providing HPC, so we hope you will make time to attend one of these sessions. If you cannot attend, please email hpc-support@umich.edu with any input you want to share.

New private insurance claims dataset and analytic support now available to health care researchers

By | General Interest, Happenings, HPC, News | No Comments

The Institute for Healthcare Policy and Innovation (IHPI) is partnering with Advanced Research Computing (ARC) to bring two commercial claims datasets to campus researchers.

The OptumInsight and Truven Marketscan datasets contain nearly complete insurance claims and other health data on tens of millions of people representing the US private insurance population. Within each dataset, records can be linked longitudinally for over 5 years.  

To begin working with the data, researchers should submit a brief analysis plan for review by IHPI staff, who will create extracts or grant access to primary data as appropriate.

CSCAR consultants are available to provide guidance on computational and analytic methods for a variety of research aims, including use of Flux and other UM computing infrastructure for working with these large and complex repositories.

Contact Patrick Brady (pgbrady@umich.edu) at IHPI or James Henderson (jbhender@umich.edu) at CSCAR for more information.

The data acquisition and availability was funded by IHPI and the U-M Data Science Initiative.