Generic and ML Workloads in an HPC Datacenter: Node Energy, Job Failures, and Node-Job Analysis

Authors
Chu, X., D. Hofstätter, S. Ilager, S. Talluri, D. Kampert, D. Podareanu, D. Duplyakin, I. Brandic, and A. Iosup
Publication Date
Name of Publication Source
2024 IEEE 30th International Conference on Parallel and Distributed Systems (ICPADS)
Publisher
IEEE
Conference Location
Belgrade, Serbia
DOI
10.1109/ICPADS63350.2024.00097