
PyData June Meetup: Intro to Azure Machine Learning: Predict Who Survives the Titanic


Join us for a PyData Ann Arbor Meetup on Thursday, June 8th, at 6 PM, hosted by TD Ameritrade and MIDAS.

Interested in doing machine learning in the cloud? In this demo-heavy talk, Jennifer Marsman will set the stage with some information on the different types of machine learning (clustering, classification, regression, and anomaly detection) supported by Azure Machine Learning and when to use each. Then, for the majority of the session, she’ll demonstrate using Azure Machine Learning to build a model which predicts survival of individuals on the Titanic (one of the challenges on the Kaggle website). She’ll talk through how she analyzes the given data and why she chooses to drop or modify certain data, so you will see the entire process from data import to data cleaning to building, training, testing, and deploying a model. You’ll leave with practical knowledge on how to get started and build your own predictive models using Azure Machine Learning.

Jennifer Marsman is a Principal Software Development Engineer in Microsoft’s Developer Experience group, where she educates developers on Microsoft’s new technologies with a focus on data science, machine learning, and artificial intelligence. Jennifer blogs at http://blogs.msdn.microsoft.com/jennifer and tweets at http://twitter.com/jennifermarsman.

PyData Ann Arbor is a group for amateurs, academics, and professionals currently exploring various data ecosystems. Specifically, we seek to engage with others around analysis, visualization, and management. We are primarily focused on how Python data tools can be used in innovative ways but also maintain a healthy interest in leveraging tools based in other languages such as R, Java/Scala, Rust, and Julia.

PyData Ann Arbor strives to be a welcoming and fully inclusive group, and we observe the PyData Code of Conduct. PyData is organized by NumFOCUS.org, a 501(c)(3) non-profit in the United States.

“use what you have learned to make something better and share with others”

PyData May Meetup: Scalable, Distributed, and Reproducible Machine Learning


Join us for a PyData Ann Arbor Meetup on Thursday, May 25th, at 6 PM, hosted by TD Ameritrade and MIDAS.

The recent advances in machine learning and artificial intelligence are amazing! Yet, in order to have real value within a company, data scientists must be able to get their models off their laptops and deployed within the company’s data pipelines and infrastructure. Those models must also scale to production-size data. In this talk, we will implement a model locally in Python. We will then take that model and deploy both its training and inference in a scalable manner to a production cluster with Pachyderm, an open source framework for distributed pipelining and data versioning. We will also learn how to update the production model online, track changes in our model and data, and explore our results.

Daniel Whitenack (@dwhitena) is a Ph.D.-trained data scientist working with Pachyderm (@pachydermIO). Daniel develops innovative, distributed data pipelines that include predictive models, data visualizations, statistical analyses, and more. He has spoken at conferences around the world (ODSC, Spark Summit, Datapalooza, DevFest Siberia, GopherCon, and more), teaches data science/engineering with Ardan Labs (@ardanlabs), maintains the Go kernel for Jupyter, and is actively helping to organize contributions to various open source data science projects.
