Optimizing Locality and Parallelism through Program Reorganization

Sriram Krishnamoorthy
Seminar

Achieving efficient execution of programs on modern computers requires careful management of data locality and parallelism in the application. I present approaches that alleviate some of this burden. A program written in a form that can be blocked into coarser operations is reorganized through a combination of empirical and model-driven optimization at program installation, compilation, and execution time. The tools employed vary widely and depend on the application domain of interest. I will discuss optimization techniques for matrix transposition, automatic parallelization of stencil codes, and load-balanced execution of tensor contraction expressions.