Build and Test for Distributed Computing Applications

Charles Bacon
Seminar

Scientific codes are used on a wide range of hardware architectures and operating systems. The developers and users of these codes need to be sure that the code performs correctly on every system where it is used. This task can be large, especially because different revisions of the same operating system are best treated as separate platforms. To assist in the testing effort, the National Science Foundation (NSF) funds a build and test pool consisting of the hardware and software used by the NSF community. This saves each development group from having to maintain its own build and test systems. In addition, the NSF funds the development of Metronome, a continuous-integration build and test system. Automating builds and tests on these systems lowers the cost of development by catching bugs soon after they are committed.

This talk will cover the design principles of Metronome, as well as practical experience from using it on a large distributed computing application. It will also compare Metronome with BuildBot, a popular open-source build tool.