We consider the problem of achieving energy-efficient sparse scientific computing by characterizing how application and architecture attributes interact to shape energy and performance trade-offs. Our goal is to improve energy efficiency while maintaining or improving application performance. To this end, we investigate energy and performance improvements at three levels of abstraction: (i) global application workload balancing across multiprocessor nodes; (ii) global inter-node opportunities for energy savings, such as dynamic voltage and frequency scaling (DVFS), interconnect link scaling, and just-in-time computation; and (iii) single-node on-chip optimizations for power-aware high-performance computing, such as adaptive hardware selection and novel cache architectures for multi-core processors.
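To give a feel for the trade-off that DVFS exposes, the following is a minimal first-order sketch under standard assumptions (not a model from this work): dynamic power scales as CV²f, voltage is taken to scale roughly linearly with frequency, and the workload is compute-bound so runtime scales as 1/f. The function name and units are illustrative only.

```python
# Illustrative first-order DVFS model (our own assumptions, not the
# paper's): dynamic power P = C * V^2 * f. With voltage scaling
# roughly linearly with frequency, P ~ f^3; for compute-bound work
# whose runtime scales as 1/f, dynamic energy E = P * t ~ f^2.

def dvfs_energy(base_freq, base_volt, base_time, scale):
    """Return (runtime, dynamic energy) when running at frequency
    base_freq * scale, assuming voltage scales linearly with frequency
    and runtime scales as 1/frequency. Units are arbitrary; the
    capacitance constant C is folded into the units."""
    f = base_freq * scale
    v = base_volt * scale
    t = base_time / scale
    energy = (v ** 2) * f * t
    return t, energy

# Running 20% slower (scale = 0.8) costs 25% in runtime but saves
# ~36% dynamic energy -- attractive when slack (e.g., waiting on
# communication in an unbalanced sparse workload) hides the extra time.
t_full, e_full = dvfs_energy(2.0, 1.0, 10.0, 1.0)
t_slow, e_slow = dvfs_energy(2.0, 1.0, 10.0, 0.8)
```

Under this model the energy ratio between the scaled and full-speed runs is simply scale², which is why even modest frequency reductions can pay off when the application has load imbalance or communication slack to absorb the longer runtime.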