To align with emerging research needs and leverage new computing capabilities, the Argonne Leadership Computing Facility (ALCF), a U.S. Department of Energy (DOE) Office of Science User Facility, is expanding its scope beyond traditional simulation-based research to include data and learning approaches.
As part of this paradigm shift, the facility is working to prepare researchers to use simulation, data science, and machine learning techniques on ALCF computing resources to accelerate their efforts to solve challenging problems in science and engineering.
From February 27 to March 1, 2018, the ALCF welcomed more than 50 prospective and current users to the facility for its Simulation, Data, and Learning Workshop. This training event provided an opportunity for attendees to work directly with ALCF staff members and invited experts from Intel, Cray, and Arm to learn about the systems, tools, frameworks, and techniques that can help advance research in these three areas of scientific computing.
“At the ALCF, we have both the leadership computing systems and the expertise to apply scalable data and learning methods to enable data-driven discoveries across all scientific disciplines,” said Venkat Vishwanath, ALCF Data Science Group Lead. “This workshop was designed to familiarize researchers with our resources and how they can be used to improve productivity, and potentially pursue new research directions that may not have been possible in the past.”
With architectural features and tools that support data-centric workloads, Theta, the ALCF’s new Intel-Cray system, is particularly well suited for research involving data science and machine learning methods. Each of the system’s 4,392 nodes is equipped with a 64-core Intel processor that has 16 gigabytes of high-bandwidth in-package memory, 192 gigabytes of DDR4 RAM, and 128 gigabytes of node-local storage. Theta also supports a variety of scalable frameworks, such as TensorFlow for deep learning and Spark for big data analytics, to help researchers explore and make sense of increasingly large and complex datasets.
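To give a sense of scale, the per-node figures quoted above can be multiplied out across the full machine. The short Python sketch below does only that arithmetic; the names `NODES`, `PER_NODE`, and `aggregate` are illustrative, and the totals are simple products of the numbers in this article rather than official system specifications.

```python
# Per-node figures for Theta as quoted in the article.
NODES = 4392
PER_NODE = {
    "cores": 64,
    "high_bandwidth_memory_gb": 16,
    "ddr4_ram_gb": 192,
    "node_local_storage_gb": 128,
}


def aggregate(per_node, nodes):
    """Multiply each per-node figure by the node count (illustrative helper)."""
    return {name: value * nodes for name, value in per_node.items()}


totals = aggregate(PER_NODE, NODES)
for name, value in totals.items():
    print(f"{name}: {value:,}")
# cores: 281,088
# high_bandwidth_memory_gb: 70,272
# ddr4_ram_gb: 843,264
# node_local_storage_gb: 562,176
```

These are raw products, so they do not account for any capacity reserved by the system or otherwise unavailable to user jobs.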
Marc Edgar and Brian Barr from GE Global Research attended the workshop to help jump-start their work on Theta for a new project in the ALCF Data Science Program (ADSP). The project, led by principal investigator Rathakrishnan Bhaskaran of GE, aims to leverage machine learning and datasets generated by large-eddy simulations (LES) to develop data-driven turbulence models with improved predictive accuracy.
“Our LES code is going to give us a massive amount of data,” Barr said. “Instead of waiting a month to do the data analysis, we’re looking at only a couple of hours using Theta.”
The GE researchers found it helpful to learn about Theta’s architecture, as well as best practices, tools, and techniques for effectively using the system.
“The ability to scale up for Theta has really opened our eyes to what’s possible. A resource of this size enables a completely different mindset,” Edgar said.
The three-day workshop included talks on topics ranging from executing workflows and using containers on Theta to profiling application performance. Attendees also benefited from dedicated hands-on sessions, which allowed them to work with ALCF and vendor staff on specific issues related to their applications. With exclusive full-system reservations on Theta, the workshop participants were able to test and debug their codes in real time.
“Getting face time with staff members from the ALCF, Intel, and Cray was invaluable to getting our application running smoothly on Theta,” said Bill Shipman, Director of Cloud Computing at Cloud Pharmaceuticals, a North Carolina-based drug design and development company.
With an ALCF Director’s Discretionary project, Shipman is deploying Cloud Pharmaceuticals’ Quantum Molecular Design architecture on Theta to design novel small molecules aimed at improving the treatment of amyotrophic lateral sclerosis (ALS). This is more of a simulation-based project, but the workshop’s introduction to resources like deep learning frameworks also piqued Shipman’s interest.
“Hearing about the breadth of tools available at the ALCF and the range of projects currently being run here has given us some novel directions to explore for our research and potential collaborations,” he said.
In addition to providing guidance on using ALCF systems and tools, the workshop sought to prepare attendees for future allocation awards through programs such as DOE’s Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program, the ALCF’s ADSP, and the Aurora Early Science Program, which is now seeking proposals for data and learning projects.
On May 15–17, 2018, the ALCF will hold another user training event, the ALCF Computational Performance Workshop, to help researchers achieve computational readiness on the facility’s supercomputers. For more information or to register, visit: https://www.alcf.anl.gov/workshops/performance-workshop18
For more ALCF training opportunities, visit: https://www.alcf.anl.gov/training
Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation's first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America's scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy's Office of Science.
The U.S. Department of Energy's Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit the Office of Science website.