ASA Conference: Women in Statistics and Data Science, La Jolla, California


The American Statistical Association invites you to join us at the 2017 Women in Statistics and Data Science Conference in La Jolla, California—the only conference for the field tailored specifically for women!

Join us to “share WISDOM (Women in Statistics, Data science, and -OMics).”

WSDS will gather professionals and students from academia, industry, and the government working in statistics and data science. Find unique opportunities to grow your influence, your community, and your knowledge.

Whether you are a student, early-career professional, or an experienced statistician or data scientist, this conference will deliver new knowledge and connections in an intimate and comfortable setting.



PyData June Meetup: Intro to Azure Machine Learning: Predict Who Survives the Titanic


Join us for a PyData Ann Arbor Meetup on Thursday, June 8th, at 6 PM, hosted by TD Ameritrade and MIDAS.

Interested in doing machine learning in the cloud? In this demo-heavy talk, Jennifer Marsman will set the stage with some information on the different types of machine learning (clustering, classification, regression, and anomaly detection) supported by Azure Machine Learning and when to use each. Then, for the majority of the session, she’ll demonstrate using Azure Machine Learning to build a model which predicts survival of individuals on the Titanic (one of the challenges on the Kaggle website). She’ll talk through how she analyzes the given data and why she chooses to drop or modify certain data, so you will see the entire process from data import to data cleaning to building, training, testing, and deploying a model. You’ll leave with practical knowledge on how to get started and build your own predictive models using Azure Machine Learning.
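For readers who want a feel for the import, clean, train, and test sequence before the session, here is a minimal sketch in plain Python. It does not use Azure Machine Learning and is not the speaker's method. The rows are invented for illustration (they are not real Titanic records), and the function names `load`, `clean`, `train`, and `accuracy` are hypothetical. The "model" is only a majority-vote rule per value of `sex`.

```python
import csv
import io
from collections import Counter, defaultdict

# Invented rows for illustration only; these are NOT real Titanic records.
RAW = """sex,age,survived
female,29,1
male,40,0
female,,1
male,35,0
female,22,1
male,,0
male,8,1
female,50,1
male,60,0
"""


def load(text):
    """Data import: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Data cleaning: drop rows with a missing age and convert types."""
    cleaned = []
    for row in rows:
        if row["age"] == "":
            continue  # one simple choice; imputing a value is another
        cleaned.append(
            {
                "sex": row["sex"],
                "age": float(row["age"]),
                "survived": int(row["survived"]),
            }
        )
    return cleaned


def train(rows):
    """Training: learn the majority outcome for each value of 'sex'."""
    votes = defaultdict(Counter)
    for row in rows:
        votes[row["sex"]][row["survived"]] += 1
    return {sex: counts.most_common(1)[0][0] for sex, counts in votes.items()}


def accuracy(model, rows):
    """Testing: fraction of rows whose outcome the model predicts correctly."""
    hits = sum(model.get(row["sex"], 0) == row["survived"] for row in rows)
    return hits / len(rows)


rows = clean(load(RAW))
train_rows, test_rows = rows[:4], rows[4:]  # hold out the last rows for testing
model = train(train_rows)
print(model, accuracy(model, test_rows))
```

A real workflow would use the Kaggle data, more features, and a proper learning algorithm; the sketch only shows how the stages hand data to one another.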

Jennifer Marsman is a Principal Software Development Engineer in Microsoft’s Developer Experience group, where she educates developers on Microsoft’s new technologies with a focus on data science, machine learning, and artificial intelligence. Jennifer blogs at http://blogs.msdn.microsoft.com/jennifer and tweets at http://twitter.com/jennifermarsman.

PyData Ann Arbor is a group for amateurs, academics, and professionals currently exploring various data ecosystems. Specifically, we seek to engage with others around analysis, visualization, and management. We are primarily focused on how Python data tools can be used in innovative ways but also maintain a healthy interest in leveraging tools based in other languages such as R, Java/Scala, Rust, and Julia.

PyData Ann Arbor strives to be a welcoming and fully inclusive group, and we observe the PyData Code of Conduct. PyData is organized by NumFOCUS.org, a 501(c)(3) non-profit in the United States.

“use what you have learned to make something better and share with others”


PyData May Meetup: Scalable, Distributed, and Reproducible Machine Learning


Join us for a PyData Ann Arbor Meetup on Thursday, May 25th at 6 PM, hosted by TD Ameritrade and MIDAS.

The recent advances in machine learning and artificial intelligence are amazing! Yet to deliver real value within a company, data scientists must be able to get their models off their laptops and deployed within the company’s data pipelines and infrastructure. Those models must also scale to production-size data. In this talk, we will implement a model locally in Python. We will then take that model and deploy both its training and inference in a scalable manner to a production cluster with Pachyderm, an open source framework for distributed pipelining and data versioning. We will also learn how to update the production model online, track changes in our model and data, and explore our results.

Daniel Whitenack (@dwhitena) is a Ph.D.-trained data scientist working with Pachyderm (@pachydermIO). Daniel develops innovative, distributed data pipelines that include predictive models, data visualizations, statistical analyses, and more. He has spoken at conferences around the world (ODSC, Spark Summit, Datapalooza, DevFest Siberia, GopherCon, and more), teaches data science/engineering with Ardan Labs (@ardanlabs), maintains the Go kernel for Jupyter, and actively helps organize contributions to various open source data science projects.


2017 MICDE Annual Symposium


Please join us for the Michigan Institute for Computational Discovery and Engineering 2017 Symposium. The event features eminent scientists from around the world and the U-M campus. The symposium this year focuses on the “New Era of Data-Enabled Computational Science.”

Speakers:

  • Frederica Darema — Director, Air Force Office of Scientific Research
  • George Karniadakis — Professor of Applied Mathematics, Brown University
  • Tinsley Oden — Director of the Institute for Computational Engineering and Sciences, V.P. for Research, University of Texas at Austin
  • Karen Willcox — Professor of Aeronautics and Astronautics, Massachusetts Institute of Technology, co-Director of MIT Center for Computational Engineering
  • Jacqueline H. Chen — Distinguished Member of Technical Staff at the Combustion Research Facility, Sandia National Laboratories
  • Laura Balzano — Assistant Professor, Electrical Engineering and Computer Science, U-M
  • Emanuel Gull — Assistant Professor, Physics

The symposium also features a poster competition. For more information and to register, go to http://micde.umich.edu/symposium17/

Past Symposia

2016 MICDE Annual Symposium

Research Computing Symposium Fall 2014

PyData April Meetup: Interactive Data Visualization in Jupyter Notebook Using bqplot


Join us for a PyData Ann Arbor Meetup on Thursday, April 13th at 6 PM, hosted by TD Ameritrade and MIDAS.

This month’s meetup will focus on bqplot, a Python plotting library based on d3.js that offers its functionality directly in the Jupyter Notebook, including selections, interactions, and arbitrary CSS customization. In bqplot, every element of a chart is an interactive widget that can be bound to a Python function, which serves as the callback when an interaction takes place. This allows the user to generate full-fledged interactive applications directly in the Notebook with just a few lines of Python code. In the second part of the talk, drawing on fields like data science and finance, we show examples of building interactive charts and dashboards using bqplot and the ipywidgets framework.

The talk will also cover bqplot’s interaction with the new JupyterLab IDE and what we plan for the future.

Presenter: Dhruv Madeka is a Quantitative Researcher at Bloomberg LP. His current research interests focus on Machine Learning, Quantitative Finance, Data Visualization and Applied Mathematics. Having graduated from the University of Michigan with a BS in Operations Research and from Boston University with an MS in Mathematical Finance, Dhruv is part of one of the leading research teams in Finance, developing models, software and tools for users to make their data analysis experience richer.

 


NSF Federal Datasets Faculty Working Group


In response to a recent NSF solicitation (Dear Colleague Letter: Request for Input on Federal Datasets with Potential to Advance Data Science), the Michigan Institute for Data Science (MIDAS) invites faculty to join a working group to collaborate on a joint submission (deadline: March 31).

The NSF DCL working group will identify federal government data that will enhance and support the growing data science research community. We are being asked what federal data is of value for data science and machine learning that will have significant impact on science, engineering, education, and society.

If you have experience or interest in using federal datasets for your research, and would like to help shape how federal datasets can be preserved and utilized, please join this working group. We will discuss strategies for responding to NSF and potential funding (both federal and local) to support this effort. Please attend in person if possible.

RSVP


PyData March Meetup: Carol Willing, Jupyter Project


Join us for our first PyData Ann Arbor Meetup on Thursday, March 2nd at 6 PM, hosted by TD Ameritrade and MIDAS. Please RSVP.

 

This month, we are excited to host Carol Willing, who will be discussing the Jupyter ecosystem. Carol develops software, electronics, and educational tutorials, and is passionate about outreach. She is a core developer on the Jupyter Project and a former director at the Python Software Foundation. She continues to contribute her time to OpenHatch, Systers, PyLadies San Diego, and San Diego Python.


Women in Data Science Conference — Feb. 3, Michigan League


In partnership with Stanford University, MIDAS will participate in the 2017 Women in Data Science Conference, with live speakers on campus and a simulcast of the conference proceedings from Stanford.

Speakers at the U-M event include Amy Cohn (COE), Stephanie Teasley (SI), Yi Lu Murphey (ECE-Dearborn), and Yao Xie (Georgia Institute of Technology).

For more information, including registration, visit the U-M WIDS page.

Undergrad Research Opportunity: Linking Survey and Big Data


Linking existing social survey data to administrative (big) data sources is a powerful way to expand the data available for sociological inquiry. This project pursues a range of different linkage projects. We will add historical Census data as well as rich data on housing from a real estate vendor to ongoing, large-scale survey studies of American families. These matched data will end up supporting exciting new opportunities for research on the long-term trends in economic wellbeing and the transmission of social inequality across generations in the United States.

Bing Liu, University of Illinois at Chicago – MIDAS Seminar Series


Bing Liu, PhD

University of Illinois, Chicago

Recorded Seminar

SLIDES

“Lifelong Machine Learning”

Abstract: Lifelong Machine Learning (or Lifelong Learning) is an advanced machine learning paradigm that learns continuously, accumulates the knowledge learned in the past, and uses it to help future learning. In the process, the learner becomes more and more knowledgeable and effective at learning. This learning ability is one of the hallmarks of human intelligence. However, the current dominant machine learning paradigm learns in isolation: given a training dataset, it runs a machine learning algorithm on the dataset to produce a model. It makes no attempt to retain the learned knowledge and use it in future learning. Although this isolated learning paradigm has been very successful, it requires a large number of training examples and is only suitable for well-defined and narrow tasks. In comparison, we humans can learn effectively from a few examples because we have accumulated so much knowledge in the past, which enables us to learn with little data or effort. Lifelong learning aims to achieve this capability. As statistical machine learning matures, it is time to break the isolated learning tradition to study lifelong learning. Applications such as intelligent assistants, chatbots, and physical robots that interact with humans and systems in real-life environments are also calling for such lifelong learning capabilities. Without the ability to accumulate the learned knowledge and use it to learn more knowledge incrementally, a system will probably never be truly intelligent. In this talk, I will introduce lifelong learning, discuss related learning paradigms, and present some of our recent work on the topic.
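The contrast between isolated and lifelong learning in the abstract can be illustrated with a toy sketch. This is not one of the methods from the talk: the word-counting "model", the helper names `learn` and `predict`, and the two tiny tasks are all hypothetical and chosen only to show knowledge from an earlier task being carried into a later one.

```python
from collections import Counter


def learn(examples, prior=None):
    """Tally word/label evidence, optionally starting from earlier knowledge."""
    model = Counter(prior or {})  # a copy, so the prior itself is not modified
    for word, label in examples:
        model[word] += 1 if label == "pos" else -1
    return model


def predict(model, words):
    """Classify a bag of words by the sign of its summed evidence."""
    score = sum(model[w] for w in words)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "unknown"


# Two hypothetical tasks; the second one has a single training example.
task_a = [("great", "pos"), ("awful", "neg"), ("great", "pos")]
task_b = [("sturdy", "pos")]

# Isolated learning: task B is learned from scratch, nothing is retained.
isolated = learn(task_b)

# Lifelong learning: knowledge from task A is kept and reused for task B.
knowledge = learn(task_a)
lifelong = learn(task_b, prior=knowledge)

print(predict(isolated, ["awful"]))  # no evidence about "awful" in task B
print(predict(lifelong, ["awful"]))  # evidence carried over from task A
```

The isolated learner has nothing to say about a word it never saw in task B, while the lifelong learner can draw on what it retained from task A.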

 

Bio: Bing Liu is a professor of Computer Science at the University of Illinois at Chicago. He received his Ph.D. in Artificial Intelligence from the University of Edinburgh. His research interests include lifelong machine learning, sentiment analysis, data mining, machine learning, and natural language processing. He has published extensively in top conferences and journals in these areas. Two of his papers have received 10-year Test-of-Time awards from KDD, the premier conference on data mining and data science. He has also authored four books: one on lifelong machine learning (coming later this month), one on Web data mining, and two on sentiment analysis. Some of his work has also been widely reported in the press, including a front-page article in the New York Times. In professional service, he is the current Chair of ACM SIGKDD. He has served as program chair of many leading data mining conferences, including KDD, ICDM, CIKM, WSDM, SDM, and PAKDD, as associate editor of leading journals such as TKDE, TWEB, and DMKD, and as area chair or senior PC member of numerous natural language processing, AI, Web research, and data mining conferences. He is a Fellow of ACM, AAAI, and IEEE.

For more information on MIDAS or the Seminar Series, please contact midas-contact@umich.edu.

MIDAS gratefully acknowledges Northrop Grumman Corporation for its generous support of the MIDAS Seminar Series.