Machine Learning with TensorFlow, Horovod, and PyTorch on HPC

Running efficient and scalable deep learning applications on leadership computing systems, including future exascale supercomputers, requires effective use of popular deep learning frameworks such as TensorFlow, Horovod, and PyTorch. In this ESP Webinar, we will cover the basics of when to use these frameworks, how to build and deploy models on HPC systems, and how to get good performance. Deep learning workloads on HPC systems also require care when scaling to multi-node jobs, and these systems offer opportunities to perform hyperparameter searches as well. Finally, we will discuss techniques for profiling deep learning workloads on HPC systems and resolving the bottlenecks they reveal.


Corey Adams