Introduction to AI training on ThetaGPU

Taylor Childers, Argonne National Laboratory
Venkat Vishwanath, Argonne National Laboratory
Huihuo Zheng, Argonne National Laboratory
Kyle Felker, Argonne National Laboratory
Misha Salim, Argonne National Laboratory
Sam Foreman, Argonne National Laboratory
Bethany Lusch, Argonne National Laboratory
Webinar Beginner
Intro to AI

This module covers the following topics: containers, the software environment, MIG mode, notebooks, job submission, and more. By the end of the module, trainees will have acquired the basic skills needed to develop AI models on the ThetaGPU supercomputer.

Day and Time: October 28, 3-5 p.m. US CT.

This session is part of the ALCF AI for Science Training Series.

About the Speakers

Venkatram Vishwanath is a computer scientist at Argonne National Laboratory. He is the Data Science Team Lead at the Argonne Leadership Computing Facility (ALCF). His current focus is on algorithms, system software, and workflows to facilitate data-centric applications on supercomputing systems. His interests include scientific applications, supercomputing architectures, parallel algorithms and runtimes, scalable analytics, and collaborative workspaces. He has received best paper awards at venues including HPDC and LDAV and has been a Gordon Bell finalist. Vishwanath received his Ph.D. in computer science from the University of Illinois at Chicago in 2009.

Taylor Childers has a Ph.D. in physics from the University of Minnesota. He worked at the CERN laboratory in Geneva, Switzerland, for six years as a member of the ATLAS experiment and was a co-author of the Higgs boson discovery paper in July 2012. His work has spanned physics analysis, workflows, and simulation, ranging from scaling on DOE supercomputers to fast custom electronics (ASIC/FPGA). He applies deep learning to science domain problems, including the use of graph neural networks to perform semantic segmentation that associates each of the 100 million pixels of the ATLAS detector with particles originating from the proton collisions. He is currently working with scientists from different domains to apply deep learning to their datasets and take advantage of exascale supercomputers that will be arriving in the next few years.

Huihuo Zheng is a computer scientist at the Argonne Leadership Computing Facility. His areas of interest are first-principles simulations of condensed matter systems, excited state properties of materials, strongly correlated electronic systems, and high-performance computing.

Kyle Felker is an assistant computational scientist at Argonne National Laboratory.

Misha Salim leads development of the Balsam workflow system at ALCF and is a core developer of the DeepHyper framework for tuning deep learning models at scale. He received a Ph.D. in physical chemistry from the University of Illinois at Urbana-Champaign, where he worked with So Hirata on first-principles simulations of liquid water and ice. As a postdoctoral appointee, he works with Balsam users to build data-intensive workflows in projects spanning high energy physics, chemistry and materials science, global optimization, machine learning, and experimental data analysis. Misha’s engineering efforts on the Balsam project are centered on distributed-system workflows and exposing HPC resources to experimental data producers. His research interests lie at the intersection of computational chemistry and machine learning, particularly where surrogate models can be trained to accelerate costly ab initio simulations.

Sam Foreman is a computational scientist with a background in high energy physics, currently working as a postdoc at the ALCF. He is broadly interested in applying machine learning to computational problems in physics, particularly in the context of high-performance computing. Sam's current research focuses on using deep generative modeling to help build better sampling algorithms for simulations in lattice gauge theory.

Bethany Lusch is a computer scientist at Argonne National Laboratory.