Each year eight Medallion Lecturers are chosen from across all areas of statistics and probability by the IMS Committee on Special Lectures. The Medallion nomination is an honor and an acknowledgment of a significant research contribution to one or more areas of research. Each Medallion Lecturer will receive a Medallion in a brief ceremony preceding the lecture.
This summer, 10 high school students from around the country gathered in Ann Arbor for the first annual Michigan Institute for Data Science Summer Camp on the campus of the University of Michigan.
The weeklong camp, titled “From Simple Building Blocks to Complex Shapes: A Visual Tour of Fourier Series,” drew students from as far away as Kansas City, MO, and as nearby as Ypsilanti and Ann Arbor.
The camp was organized by Raj Nadakuditi, assistant professor in the Electrical Engineering and Computer Science Department. Other U-M faculty instructors at the camp were Prof. Jenna Weins and MIDAS co-directors Prof. Al Hero and Prof. Brian Athey.
The camp was well received by the participants, who ranged from high school sophomores to seniors. A total of 10 students attended, five boys and five girls. Students used the Fourier Series to make art, diagnose disease, and “play detective.”
“I’ve been looking to learn about what’s been going on with Big Data,” said Daniel Neamati, a 16-year-old from Ann Arbor who hopes to someday study deep space with NASA. “I was really surprised by this camp. Math is basically everywhere.”
Elizabeth Fitzgerald, 16, traveled from South Carolina to take part in the camp. She said she wants to study artificial intelligence and machine learning, but was interested to see what else data science can explain.
“It was enlightening to see all the different applications of data science,” she said.
The camp will be offered annually. Details for next year will be posted at http://midas.umich.edu/camp/ in the coming months.
As the use of data science techniques continues to grow across disciplines, a group of University of Michigan researchers is working to build a community of social scientists with skills in Big Data through a week-long summer camp for faculty and graduate students.
Having recently completed its fourth annual session, the Big Data Summer Camp held by the Interdisciplinary Committee for Organizational Studies (ICOS) trains approximately 50 people each spring in skills and methods such as Python, SQL, and social media APIs. The camp splits up into several groups to try to answer a research question using these newly acquired skills.
Working with researchers from other fields is a key component of the camp, and of creating a Big Data social science community, said co-coordinator Todd Schifeling, a Research Fellow at the Erb Institute in the School of Natural Resources and Environment.
“Students meet from across social science disciplines who wouldn’t meet otherwise,” said Schifeling. “And every year we bring back more and more past campers to present on what they’ve been doing.”
Schifeling himself participated in the camp as a student before taking on the role of coordinator this year.
Teddy DeWitt, the other co-coordinator of the camp and a doctoral student at the Ross School of Business, added that the camp presents the curriculum in a way that is unique on campus.
“This set of material does not seem to be available in other parts of the university, at least … with an applied perspective in mind,” he said. “So we’re glad we have this set of resources that is both accessible and well-received by students.”
Participants range in skill from beginning to advanced, but even a relatively advanced student like Jeff Lockhart, a doctoral student in sociology and population studies who describes himself as “super-committed to computational social science,” said that it’s hard to find classes in computational methods in social science departments.
“[The ICOS camp] doesn’t expect a lot of prior knowledge, which I think is critical,” Lockhart said.
Lockhart, DeWitt, and Dylan Nelson, also a sociology doctoral student, are working on setting up a series of workshops in Computational Social Science for fall 2016 (contact Lockhart at firstname.lastname@example.org for more information). Lockhart said it’s critical that social scientists learn Big Data skills.
“If we don’t have skills like this, there’s no way for us to enter into these fields of research that are going to be more and more important,” he said.
“A lot of the skills we’ve learned are sort of the on-ramp for doing data science,” DeWitt added.
The camp is co-sponsored by Advanced Research Computing (ARC).
CALL FOR ISSUE EXPERTS AND SPONSORS
The Great Lakes Observing System (GLOS) is hosting the Great Lakes Data Challenge in summer 2016. As part of our 10-year anniversary, GLOS will be taking open data to the next level by using open innovation to broaden our community and create new partnerships to engage people in problem solving for the Great Lakes. GLOS is currently seeking sponsors and issue experts.
- Inspire a wider audience to engage with Great Lakes issues
- Use technologies, innovation and creativity to solve Great Lakes problems
- Encourage the use of open data resources from GLOS and beyond
- Late May 2016: Launch challenge
- June: Kick-off event(s), including IAGLR
- August 15: Submissions due. Submissions can include an app, data “mash-up”, visualization, story, or other innovative idea for using, collecting, analyzing, visualizing, and/or communicating Great Lakes data.
- August 15-31: Judging
- September 15: Winners notified
- October 12-13: Award presentation at GLOS Annual Meeting in Ann Arbor, MI
- Baseline prize money: $5,000
- Data, technical support, and resources for developer guidelines, rules, etc.
- Data Challenge(s) coordination
WE NEED YOU
Sponsors: by May 20. The Great Lakes Data Challenge is a unique opportunity to network the region’s environmental, governmental and non-profit sectors with the information technology sector. Sponsors must commit by May 20 to ensure inclusion in event promotions.
Consider sponsoring the challenge at one of our suggested levels (listed below) to help support prize money, event costs, and promotional giveaways. This is a great way to promote your business/organization to a diverse audience of environmental data and technology stakeholders.
Issue Experts: by June 1. We are looking for volunteers with expertise in areas including invasive species, nutrients and algae, and boater safety, among others. You would agree to be a resource to teams who have specific questions about the topic at hand. The commitment can be flexible according to your interest and availability.
Please contact GLOS at email@example.com if you are interested in supporting the data challenge in any of these areas.
Be a part of the Great Lakes Observing System’s Data Challenge
- SUPERIOR $5,000
All lower level sponsorship benefits as well as…
Top billing as Data Challenge co-sponsor in all event promotions and media releases
Large, prominent logo on event giveaways, promotional signage, and website
- MICHIGAN $2,500
All lower level sponsorship benefits as well as…
Acknowledgement as co-sponsor for a custom challenge category
Logo on event giveaways
- HURON $1,000
All lower level sponsorship benefits as well as…
Sponsorship acknowledgement at promotional events including kick-off and award presentation
Logo on Data Challenge website and promotional signage
- ONTARIO $500
All lower level sponsorship benefits as well as…
Sponsorship acknowledgement on promotional signage
Complimentary individual (for 1 person) GLOS membership and registration to the GLOS Annual Meeting
- ERIE $250
Sponsorship acknowledgement and website link on Data Challenge website
Acknowledgement in GLOS Annual Report
University of Michigan Health System & Peking University Health Science Center
Joint Institute for Translational and Clinical Research
Call for Poster Abstracts: Submission Information
Showcase your research to Peking University Health Science Center counterparts as the JI looks to expand by offering funding to non-medical school faculty for new health-related joint research projects. A great venue to meet potential collaborators, the poster session will be Thursday, Oct. 13. Details, including times, will follow poster acceptance.
How to submit
Abstracts should relate to clinical and translational research studies and should be submitted electronically in a Microsoft Word document.
Send abstracts to firstname.lastname@example.org by September 9, 2016. Please include the following:
- Title should be brief but should not contain abbreviations.
- Do not use bold letters in the title unless necessary.
- Do not capitalize all letters in the title; capitalize only the first word and key words.
- Include all authors and their affiliations. To associate authors and their institutional affiliations, please place a number in parentheses after each author’s name (if more than one author) and the corresponding number before each affiliated institution’s name (if more than one institution).
- Put the submitting/presenting author’s name in bold.
- Do not capitalize all letters in speaker information, only as appropriate.
- Abstracts are limited to 300 words. Use size 11 Arial or Calibri font.
- Submit text only. Do not include tables, graphics, or charts.
- Do not include title, authors, or author affiliations in the abstract text.
- Abstracts may include background, methods, results, conclusions, and funding-source acknowledgements, if applicable.
Submitter contact information
- First and last name, degrees
- Email address
Please proofread carefully – information submitted with errors may be published as is. Use a word processing program to assist with checking for grammar and spelling errors, as well as word count.
The deadline to submit abstracts is Sept. 9, 2016. For more information, contact email@example.com.
Researchers across campus now have access to several new services to help them navigate the new tools and methodologies emerging for data-intensive and computational research.
As part of the U-M Data Science Initiative announced in fall 2015, Consulting for Statistics, Computing and Analytics Research (CSCAR) is offering new and expanded services, including guidance on:
- Research methodology for data science.
- Large scale data processing using high performance computing systems.
- Optimization of code and use of Flux and other advanced computing systems.
- Advanced data management.
- Geospatial data analyses.
- Exploratory analysis and data visualization.
- Obtaining licensed data from commercial sources.
- Scraping, aggregating and integrating data from public sources.
- Analysis of restricted data.
“With Big Data and computational simulations playing an ever-larger role in research in a variety of fields, it’s increasingly important to provide researchers with a comprehensive ecosystem of support and services that address those methodologies,” said CSCAR Director Kerby Shedden.
As part of this significant expansion of its scope, the campuswide statistical consulting service CSCAR has been renamed Consulting for Statistics, Computing and Analytics Research. It was formerly known as the Center for Statistical Consultation and Research.
For more information, see the University Record article.
Flux is the shared computing cluster available across campus, operated by Advanced Research Computing – Technology Services (ARC-TS). Under ARC-TS’s new Flux for Undergraduates program, student groups and individuals with faculty sponsors can access unused computing cycles on Flux for free.
The first student group to take advantage of this program is the Michigan Data Science Team, which was created in Fall 2015 with the goal of helping U-M students enter Big Data competitions. The team enters competitions through sites like Kaggle, and is one of the first such teams affiliated with a university.
The group’s organizer, Jonathan Stroud, a Computer Science and Engineering graduate student, said team members were maxing out the capabilities of their laptops when they first started.
“For the first couple of competitions, we made sure we picked a problem that people could do on their laptops. Still, every night before bed, they would set up their experiments and they ran all night,” Stroud said.
He said success in the data science competitions typically depends on trying several approaches simultaneously, which can be taxing on computing resources. Stroud said the team typically uses software such as Python, R, and Matlab. Team members come from a wide range of disciplines, including Engineering, Applied Math, Physics, and one from the Music School, Stroud said.
Jacob Abernethy, assistant professor of Electrical Engineering and Computer Science, is the group’s faculty advisor. He wrote some funding for the group into his NSF CAREER proposal that was awarded in 2015. He said after the group’s first competition, he surveyed the students as to what worked and what didn’t. He said one of the clearest responses was the need for more robust computing resources.
“Our top two competitors talked about maxing out the resources on not only their own laptop, but also on the clusters provided them by their advisors,” Abernethy said. “It became clear that we needed to talk about Flux.”
He said a key method in machine learning and data science experimentation is cross-validation, that is, testing the performance of a set of parameters on several subsets of the data simultaneously. “This leads to a very obvious need for a distributed system in which we can execute a large number of ‘embarrassingly parallel’ tasks quickly,” Abernethy said.
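To make the idea concrete, here is a minimal Python sketch of k-fold cross-validation in which each fold is scored independently. It is an illustration only, not the team's code: the function names, the toy one-parameter model, and the use of a local thread pool are all assumptions. On a cluster such as Flux, each fold would more likely run as its own job.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, test) index pairs."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds


def evaluate_fold(xs, ys, train, test):
    """Toy model (hypothetical): fit y = slope * x on the training indices,
    then return the mean squared error on the held-out indices."""
    slope = sum(xs[i] * ys[i] for i in train) / sum(xs[i] ** 2 for i in train)
    return mean((ys[i] - slope * xs[i]) ** 2 for i in test)


def cross_validate(xs, ys, k=5, max_workers=None):
    """Score every fold; the folds share no state, so they can run in parallel."""
    folds = k_fold_indices(len(xs), k)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda fold: evaluate_fold(xs, ys, *fold), folds))


if __name__ == "__main__":
    xs = list(range(1, 21))
    ys = [2 * x for x in xs]
    print(cross_validate(xs, ys, k=5))
```

Because no fold depends on another fold's result, the work splits cleanly across machines, which is what "embarrassingly parallel" refers to.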
Being able to use Flux “has been helping us a lot,” Stroud added. “We’ve been contacted by other schools to see how they can do the same thing.”
Jobs submitted under Flux for Undergraduates will run only when unused cycles are available and will be requeued when those resources are needed by standard Flux jobs. To be most efficient, student groups should use short or checkpointed jobs to take advantage of these available cycles.
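As a rough illustration of what a checkpointed job can look like, the Python sketch below saves its progress to a file at intervals and resumes from that file when restarted. The function name, the JSON file format, and the `save_every` parameter are assumptions made for this example; they are not part of Flux or any ARC-TS tooling.

```python
import json
import os


def run_with_checkpoints(total_steps, checkpoint_path, work, save_every=100):
    """Run work(step, result) for each step, resuming from checkpoint_path if present."""
    state = {"step": 0, "result": 0}
    if os.path.exists(checkpoint_path):
        # A previous run was interrupted or requeued: pick up where it stopped.
        with open(checkpoint_path) as f:
            state = json.load(f)

    while state["step"] < total_steps:
        state["result"] = work(state["step"], state["result"])
        state["step"] += 1
        if state["step"] % save_every == 0 or state["step"] == total_steps:
            # Write to a temporary file first so an interruption during the
            # save cannot leave a half-written checkpoint behind.
            tmp_path = checkpoint_path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(state, f)
            os.replace(tmp_path, checkpoint_path)

    return state["result"]
```

If a job like this is requeued, it loses at most `save_every` steps of work rather than starting over.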
Student groups can also purchase Flux allocations for jobs that are higher priority or time constrained; those allocations can also work in conjunction with the free Flux for Undergraduates jobs.
“The goal is to provide undergraduates with experience in high performance computing, and access to computational resources for their projects,” said Brock Palen, Associate Director of ARC-TS.
Undergraduate groups and individuals must have sponsorship from a faculty member. To request resources through Flux for Undergraduates, please fill out this form. An abstract of the intended activity must be submitted.