Argonne researchers created an automated pipeline that streams data from APS beamlines to ALCF supercomputers for near real-time analysis during live experiments.
By integrating DOE supercomputers, including ALCF’s Polaris, with the upgraded Advanced Photon Source, Argonne researchers are enabling rapid data analysis to help guide experiments as they unfold.
In a major step toward accelerating scientific discovery, the U.S. Department of Energy’s (DOE) Argonne National Laboratory is working to tightly integrate its newly upgraded X-ray facility, the Advanced Photon Source (APS), with some of the nation’s most powerful supercomputers. The effort begins with Polaris, located at the Argonne Leadership Computing Facility (ALCF). Both the APS and the ALCF are DOE Office of Science user facilities that serve many thousands of researchers each year.
Polaris is joined by two other DOE-supported supercomputers — Perlmutter at Lawrence Berkeley National Laboratory and Frontier at Oak Ridge National Laboratory — in a project to develop automated data workflows for next-generation X-ray science. These workflows are essential to keeping pace with the flood of data produced by the DOE user facilities and other experimental and observational sites.
This project supports a broader DOE vision known as the Integrated Research Infrastructure (IRI), which seeks to unify the nation’s research tools, infrastructure and user facilities seamlessly and securely in novel ways to accelerate discoveries.
The APS is an ideal proving ground for this approach. For more than 30 years, the facility has enabled scientists from around the world to explore the structure and behavior of materials at the molecular and atomic levels. A recent upgrade increased the brightness of its X-ray beams by up to 500 times. As new and improved beamlines and instruments come online, researchers will be able to study matter in unprecedented detail — while also generating far more data. Over the next decade, the APS is expected to produce up to 100 times more data than before. To make the most of this capability, that data must be captured, processed, and analyzed in real time to guide experiments as they unfold.
“You can’t tell a material to stop cracking or a cell to stop dividing if the data are only inspected and understood afterwards,” said Argonne group leader Nicholas Schwarz. “We need to capture quickly evolving phenomena and adjust the experiment in real time — not hours later.” Other team members include Hannah Parraga, software engineering specialist; Ryan Chard, computer scientist; and Thomas Uram, data services and workflows team lead at the ALCF.
At Argonne, Uram leads the ALCF’s work to deploy a suite of integrated tools, collectively known as Nexus, that provides a comprehensive IRI approach designed to meet the needs of a wide range of experimental facilities. “This comprehensive IRI framework and its related services have already enabled seamless access to ALCF supercomputers for many APS beamlines,” he said.
One of the key technologies supporting Nexus is Globus, which provides secure, automated data transfer and distributed computing across sites. For example, Globus supports X-ray photon correlation spectroscopy (XPCS) experiments at the APS. The technique uses scattered X-rays to track how materials change at the nanoscale over time. Globus automates the flow of XPCS data between beamlines and supercomputers like Polaris, ensuring results are delivered in near real time.
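The article itself includes no code, but the pattern it describes — staging data off a beamline and triggering analysis on a remote machine — can be illustrated with the public Globus SDKs. Below is a minimal sketch using globus-sdk for the transfer step and globus-compute-sdk for remote execution; the collection and endpoint UUIDs, file paths, and the analyze_xpcs function are hypothetical stand-ins, not the actual APS pipeline.

```python
# Minimal sketch of a Globus-style "transfer, then analyze" step.
# All UUIDs, paths, and the analyze_xpcs function are hypothetical
# placeholders; the production APS/ALCF pipelines are more elaborate.
import globus_sdk
from globus_compute_sdk import Executor

BEAMLINE_COLLECTION = "SOURCE-COLLECTION-UUID"  # hypothetical
POLARIS_COLLECTION = "DEST-COLLECTION-UUID"     # hypothetical
COMPUTE_ENDPOINT = "COMPUTE-ENDPOINT-UUID"      # hypothetical

# Authenticate as a registered Globus app (client credentials shown;
# an interactive native-app login is the other common option).
auth_client = globus_sdk.ConfidentialAppAuthClient("CLIENT_ID", "CLIENT_SECRET")
tc = globus_sdk.TransferClient(
    authorizer=globus_sdk.ClientCredentialsAuthorizer(
        auth_client, globus_sdk.scopes.TransferScopes.all
    )
)

# Stage a raw XPCS run from the beamline to the HPC filesystem.
tdata = globus_sdk.TransferData(
    source_endpoint=BEAMLINE_COLLECTION,
    destination_endpoint=POLARIS_COLLECTION,
    label="xpcs-run-demo",
)
tdata.add_item("/raw/xpcs/run0001/", "/project/xpcs/run0001/", recursive=True)
task_id = tc.submit_transfer(tdata)["task_id"]
tc.task_wait(task_id, timeout=3600, polling_interval=15)

def analyze_xpcs(run_dir):
    """Hypothetical analysis function, executed on the remote endpoint."""
    # A real XPCS workflow would compute correlation functions here.
    return f"analyzed {run_dir}"

# Run the analysis near the data via a Globus Compute endpoint.
with Executor(endpoint_id=COMPUTE_ENDPOINT) as ex:
    print(ex.submit(analyze_xpcs, "/project/xpcs/run0001/").result())
```

Keeping the analysis submission in the same script as the transfer is what lets results come back while the experiment is still running, rather than after data are copied home and processed by hand.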
For the broader APS integration effort, the team has been using Polaris and Globus tools to process data from many beamlines in real time, testing what works best during live experiments. Those tests allowed the team to refine its computing infrastructure, software and data workflows for greater speed and reliability, and the lessons learned will help expand these approaches to Frontier and Perlmutter. A sketch of how such a workflow can be registered as a repeatable, automated flow follows below.
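To make steps like the transfer above repeatable without a human in the loop, Globus also offers a Flows service that chains actions into a managed workflow. The sketch below registers a single-step flow with the Globus SDK; the flow-definition language and the hosted transfer action provider are real Globus features, but the flow title, input field names, and the placeholder token are illustrative assumptions only.

```python
# Sketch of registering an automated data-movement flow with Globus Flows.
# The definition format and transfer action provider follow Globus docs;
# names, input fields, and the token below are hypothetical.
import globus_sdk

definition = {
    "StartAt": "MoveRawData",
    "States": {
        "MoveRawData": {
            "Type": "Action",
            # Hosted Globus Transfer action provider.
            "ActionUrl": "https://actions.globus.org/transfer/transfer",
            "Parameters": {
                # Keys ending in ".$" pull values from the flow's runtime input.
                "source_endpoint_id.$": "$.input.source",
                "destination_endpoint_id.$": "$.input.destination",
                "transfer_items": [
                    {
                        "source_path.$": "$.input.src_path",
                        "destination_path.$": "$.input.dst_path",
                        "recursive": True,
                    }
                ],
            },
            "ResultPath": "$.TransferResult",
            "End": True,
        }
    },
}

# Assumes a Flows-scoped access token is already in hand; auth setup omitted.
flows = globus_sdk.FlowsClient(
    authorizer=globus_sdk.AccessTokenAuthorizer("FLOWS_SCOPED_TOKEN")
)
flow = flows.create_flow(
    title="beamline-to-hpc-demo",  # hypothetical flow name
    definition=definition,
    input_schema={},  # accept any input (demo only)
)
print("registered flow:", flow["id"])
```

Once registered, a flow like this can be started automatically whenever a beamline writes a new run, which is the kind of hands-off automation the team has been refining.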
The result is fast, and in some cases fully automated, data analysis during the data-intensive experiments run at the APS. Artificial intelligence will help make sense of vast, complex datasets on the fly, spotting patterns or anomalies at scales beyond the capabilities of human researchers.
The improved ability to conduct cutting-edge research will benefit APS users in many fields: materials science; biological and life sciences; physics; chemistry; and environmental, geophysical and planetary sciences. The ultimate goal is to realize a seamless, integrated infrastructure by coupling experimental facilities, including the APS, with DOE’s supercomputers.
This work is part of a broader effort at Argonne to create “smart instruments.” These tools integrate high-performance computing and artificial intelligence directly into the scientific process. While Polaris is already enabling new frontiers in data-driven science, it also serves as a proving ground for Aurora, Argonne’s exascale supercomputer. Capable of more than a billion billion calculations per second, Aurora offers even greater potential to unlock new and challenging science.
By building a fast, intelligent data pipeline between the APS and DOE-supported supercomputers, Schwarz, Uram, and the rest of the team are helping to launch a new era of integrated, large-scale scientific discovery. In this future, experiments and analysis will happen side by side to gain new scientific insights in real time at the APS and many other experimental facilities.
This work is supported by DOE’s Advanced Scientific Computing Research (ASCR) program. The team was awarded computing time on DOE supercomputers through the ASCR Leadership Computing Challenge.
The ALCF-APS integration was also highlighted in a recent ALCF webinar exploring experiment-time computing at the APS. A recording of the session, part of ALCF’s Service-Enabled Science series, is available for readers interested in learning more about the tools and workflows behind the integration.