Quick Start: Using Apache Spark for Large-Scale Data Processing

Help Desk

Hours: 9:00am-5:00pm CT M-F
Telephone: 866-508-9181 (Toll-Free, US Only) or 630-252-3111
Email: support@alcf.anl.gov

This interactive webinar covers using Apache Spark, a framework for parallel data processing, on Argonne Leadership Computing Facility (ALCF) computing resources. It presents a brief tutorial on Apache Spark, gives instructions for running the framework on ALCF systems, discusses the characteristics of Theta that affect how Spark runs, and recommends a few tuning parameters for better performance.

Presenter
Xiao-Yong Jin, Argonne National Laboratory