Quick Start: Using Apache Spark for Large-Scale Data Processing

ALCF Dev Session

This interactive webinar covers using Apache Spark, a framework for parallel data processing, on ALCF computing resources. It will present a brief tutorial on Apache Spark, provide instructions for running the framework on ALCF systems, discuss the unique characteristics of Theta, and recommend a few tuning parameters for improving performance.

Presenter
Xiao-Yong Jin, Argonne National Laboratory