As high-performance computing systems grow to exascale and beyond, optimizing MPI collective communication is increasingly critical. Collective performance relies heavily on selecting the appropriate algorithm to implement the desired communication. To improve the selection process, machine learning (ML) autotuners are a promising alternative to the outdated heuristics used by production MPI libraries. However, ML autotuners require significant training time, rendering them impractical on large-scale systems. I will present our novel ML autotuner design, which incorporates a custom active learning strategy and other optimizations to minimize training overhead. Additionally, I will describe our recent effort to develop new variable-radix algorithms that outperform state-of-the-art algorithms and make ML autotuning more effective.
To add to calendar:
Click on: https://wordpress.cels.anl.gov/cels-seminars/
Enter your credentials.
Search for your seminar.
Click “Add to calendar”.