
Introduction to Apache Spark: Lan Jiang, Databricks

June 18, 2020 @ 12:00 pm - 2:00 pm

Introduction to Apache Spark

Lan Jiang, Regional Lead of Resident Solutions, Databricks

Description of Lecture: Apache Spark is a lightning-fast unified analytics engine for big data and machine learning. It was originally developed at UC Berkeley in 2009. Since its release, Apache Spark has seen rapid adoption by enterprises across a wide range of industries. Internet powerhouses such as Netflix, Yahoo, and eBay have deployed Spark at massive scale, collectively processing multiple petabytes of data on clusters of over 8,000 nodes. It has quickly become the largest open source community in big data, with over 1,000 contributors from 250+ organizations. The Apache Spark ecosystem includes Spark Core, Spark SQL, Spark Streaming, SparkML, and GraphX, and supports languages including Java, Scala, Python, R, and SQL.

The lecture is divided into two parts. The first part introduces Spark fundamentals and core concepts via slides. The second part uses a Databricks notebook to provide hands-on examples of how to read, transform, and write data in Spark, and covers some advanced topics such as performance optimization. To follow the hands-on lab in the second part yourself, you need a free Databricks Community Edition account, which you can register for here: https://databricks.com/try-databricks. Knowledge of SQL and experience with Python are helpful for understanding the lecture.

Lecturer background: Lan Jiang works as the Regional Lead of Resident Solutions at Databricks. Lan has 20 years of industry consulting experience helping Fortune 500 clients in finance, insurance, healthcare, retail, transportation, and the federal sector. Prior to joining Databricks, Lan served as a Sr. Solutions Architect at Cloudera and as a Principal Consultant & Enterprise Architect at Oracle, and ran his own Big Data consulting firm for a few years. He holds an MS in Computer Science from the University of Illinois Chicago and an Executive MBA from Northern Illinois University. Here is a link to his LinkedIn profile: https://www.linkedin.com/in/lanjiang/

Details

Date:
June 18, 2020
Time:
12:00 pm - 2:00 pm
Website:
https://umich.zoom.us/j/96192257809