Centimani: Enabling Fast AI Accelerator Selection for DNN Training with a Novel Performance Predictor | Publications | USENIX ATC'24: Proceedings of the 2024 USENIX Conference on Usenix Annual Technical Conference
Toward a Holistic Performance Evaluation of Large Language Models Across Diverse AI Accelerators | Publications | 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
V2684603: Interface Resolved Simulation of Two-Phase Flow Within a 360° Steam Separator Geometry | Publications | 77th Annual Meeting of the APS Division of Fluid Dynamics
Efficient Distributed Continual Learning for Steering Experiments in Real-Time | Publications | Future Generation Computer Systems
Efficient Data-Parallel Continual Learning with Asynchronous Distributed Rehearsal Buffers | Publications | 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing (CCGrid)
Bricks: A High-Performance Portability Layer for Computations on Block-Structured Grids | Publications | The International Journal of High Performance Computing Applications
High-Performance, Scalable Geometric Multigrid via Fine-Grain Data Blocking for GPUs | Publications | SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis
Direct Numerical Simulation of Involute Channel Turbulence | Publications | Journal of Fluids Engineering
MechBERT: Language Models for Extracting Chemical and Property Relationships about Mechanical Stress and Strain | Publications | Journal of Chemical Information and Modeling