Learning continues throughout summer for ALCF student interns

Madeleine O’Keefe


Every summer, the halls of the Theory and Computing Sciences Building at the U.S. Department of Energy’s (DOE) Argonne National Laboratory are a little noisier. Previously empty workspaces are suddenly occupied and the communal fridges become prime real estate for lunch bags of every shape and size.

This is all due to the presence of the more than 150 summer interns in the lab’s Computing, Environment, and Life Sciences directorate, including 45 students (like me!) hired by the Argonne Leadership Computing Facility (ALCF), a DOE Office of Science User Facility. This year, the ALCF interns, ranging from undergraduates to Ph.D. candidates, came from all over the country to gain hands-on experience with some of the most powerful supercomputers in the world.

From exploring the potential of quantum computing to visualizing system logs of high-performance computing (HPC) systems, this year’s intern class worked alongside ALCF staff mentors to tackle research projects that address issues at the forefront of scientific computing.

“It’s important to us that we bring in these students every year,” says ALCF Director Michael Papka. “Not only do they get to collaborate with experts in their fields to solve real-world R&D problems, but they also get to experience what it’s like to work at a national lab. That’s not something they can get in a classroom.”

Keep reading to learn more about four of these students and their summers at the ALCF.

Accelerating machine learning using near-term quantum computers

As we approach the limits of silicon scaling and Moore’s law (the prediction that the number of transistors that fit on an integrated circuit doubles about every two years), it’s important that we start researching ways to address these limits. That’s just what Ruslan Shaydulin, a Ph.D. student in computer science from Clemson University, worked on this summer.

Quantum computers are potentially capable of exponential speed-ups over the best-known classical algorithms for certain problems. Near-term noisy intermediate-scale quantum (NISQ) devices, however, are very limited in their ability to realize the quantum advantage.

Shaydulin’s summer project with the ALCF focuses on trying to take advantage of those near-term NISQ-era devices to solve real-world problems. For example, graph modularity clustering is a technique that can be used in applications such as constructing gene co-expression networks, which have the potential to help with the distribution of pharmaceuticals in the body.

To improve the quality of clustering, Shaydulin developed a hybrid algorithm that leveraged a quantum approximate optimization algorithm for solving small subproblems. Classical and quantum parts of the algorithm work in tandem, with the quantum part solving the subproblems that are particularly hard for classical computers.
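The division of labor described above can be illustrated with a minimal sketch. It is not Shaydulin’s implementation: the function names are hypothetical, the example is limited to two communities, and an exhaustive search stands in for the quantum approximate optimization step, which would require a quantum device or simulator. The classical part scores a clustering by its modularity, and the stand-in subproblem solver reassigns a small set of nodes to improve that score.

```python
from itertools import product


def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Fraction of edges whose endpoints share a community.
    inside = sum(1 for u, v in edges if community[u] == community[v]) / m
    # Expected fraction for a random graph with the same node degrees.
    degree_sum = {}
    for node, k in degree.items():
        label = community[node]
        degree_sum[label] = degree_sum.get(label, 0) + k
    expected = sum((total / (2 * m)) ** 2 for total in degree_sum.values())
    return inside - expected


def solve_subproblem(edges, community, free_nodes):
    """Try every assignment of free_nodes to communities 0 and 1.

    Assumption: this brute-force search is only a classical placeholder
    for the small subproblems the article says are handed to the
    quantum part of the hybrid algorithm.
    """
    best, best_q = dict(community), modularity(edges, community)
    for labels in product((0, 1), repeat=len(free_nodes)):
        trial = dict(community)
        trial.update(zip(free_nodes, labels))
        q = modularity(edges, trial)
        if q > best_q:
            best, best_q = trial, q
    return best, best_q


# Two triangles joined by one bridge edge; nodes 2 and 3 start misplaced.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
start = {0: 0, 1: 0, 2: 1, 3: 0, 4: 1, 5: 1}
improved, score = solve_subproblem(edges, start, free_nodes=[2, 3])
print(improved, round(score, 4))
```

Keeping the subproblem small is the point of the sketch: the search space doubles with every free node, which is why only a handful of nodes are refined at a time.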

“It is a cutting-edge science project of great importance for the DOE given the recently announced National Quantum Initiative,” says Yuri Alexeev, an Argonne computational scientist and Shaydulin’s summer mentor. “Ruslan did a tremendously successful job.”

Shaydulin also contributed to grant proposals and a paper submission to a Post Moore’s Era Supercomputing workshop at the upcoming International Conference for High-Performance Computing, Networking, Storage, and Analysis (SC18). He says that working at Argonne has helped him see the benefits of working in a collaborative environment.

“This definitely broadens the scope of my research,” he says. “I’ve been able to try a lot of things here that I wouldn't be able to try if I was just sitting in my office at Clemson, because you get to talk to a lot of people who are really pushing the boundaries of scientific computing.”

Visual and predictive analysis of error logs in HPC systems

This isn’t the first ALCF experience for Shilpika, a Ph.D. student from the University of California, Davis. Previously, while working towards her Master’s degree at Loyola University Chicago, Shilpika used Cooley, the ALCF’s data analysis and visualization cluster, for her research investigating how to make computations with big data run faster. After graduating in 2016, she was a pre-doc at the ALCF for almost a year.

This summer, she is back to help develop an effective, interactive, and intuitive way to better understand the performance and monitoring data from supercomputers.

To maintain robust and reliable HPC systems, it is necessary to understand the various system events and failures occurring on supercomputers. This can be done on supercomputers like Mira and Theta by analyzing system logs such as environment logs, job logs, and error logs.

For a facility like the ALCF, it’s important to understand how these different events correlate and to identify and predict patterns in them to improve the operational efficiency of the computers.

Shilpika’s summer project involved building a visual analytics tool that would help enhance the understanding of the environmental, job, and error log data. Using pre-processed data from Mira and raw data from Theta, she coded a variety of log visualizations with Python and D3 in JavaScript, from histograms to interactive radial charts.

“Based on the analyses of the various system logs, we can identify patterns on correlations, anomalies, and faults and try to predict these using machine learning, or provide possible solutions for events and faults that have already occurred,” says Shilpika. “We want to know, is there a mechanism that we could identify to predict different events that are occurring?”
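As a rough illustration of the kind of processing that sits behind a log histogram, the sketch below bins log records by hour and flags hours with unusually many events. Everything here is an assumption made for the example: the log format (an ISO timestamp followed by a message), the function names, and the simple mean-plus-deviation threshold, which is only a placeholder for the machine learning methods Shilpika mentions. It does not reflect the actual Mira or Theta log formats or her tool.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, pstdev


def hourly_counts(lines):
    """Count log records per hour.

    Assumes each line starts with an ISO timestamp followed by a space,
    e.g. "2018-07-01T05:15:00 ERROR ..." (a hypothetical format).
    """
    counts = Counter()
    for line in lines:
        stamp = datetime.fromisoformat(line.split(" ", 1)[0])
        counts[stamp.replace(minute=0, second=0, microsecond=0)] += 1
    return counts


def unusual_hours(counts, n_sigma=2.0):
    """Return the hours whose count exceeds mean + n_sigma * std deviation."""
    values = list(counts.values())
    limit = mean(values) + n_sigma * pstdev(values)
    return sorted(hour for hour, n in counts.items() if n > limit)


# Synthetic data: one record per hour, plus a burst of records at 05:00.
sample = [f"2018-07-01T{h:02d}:15:00 ERROR sample" for h in range(10)]
sample += ["2018-07-01T05:30:00 ERROR sample"] * 19
print(unusual_hours(hourly_counts(sample)))
```

The per-hour counts are exactly what a histogram would plot; the flagged hours are the bars a viewer would likely notice first.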

This summer work with the ALCF relates to Shilpika’s thesis project at UC Davis, where she wants to build a visualization toolkit using machine learning. She expects to collaborate with the ALCF throughout the Ph.D. process.

“I always enjoy working at ALCF,” says Shilpika. “Every year, I learn something new. There is so much to explore.”

Machine learning on a RAM Area Network

Melanie Cornelius, a Master’s student in computer science at the Illinois Institute of Technology, spent her summer at Argonne testing machine learning on the ALCF’s RAM Area Network (RAN) project, which is working to minimize cost and optimize resource utilization for supercomputers.

In current datacenter designs, processors and memory are tightly coupled. But a job’s processing and memory requirements frequently don’t match the proportions chosen for nodes. On Cooley, this means many nodes have significant amounts of underutilized RAM.

To solve this issue, RAN treats RAM as a schedulable, decoupled resource. Specialized hardware devices called XPDs are connected to Cooley. They contain a pool of RAM that can be allocated to users, and each node can use its local RAM as a cache for the remote memory.
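The idea of local RAM acting as a cache in front of a remote memory pool can be sketched in a few lines. This is a toy model, not the RAN design: the class name, the page-based interface, and the least-recently-used eviction policy are all assumptions made for illustration, since the article does not describe how the real system manages its cache.

```python
from collections import OrderedDict


class CachedRemoteMemory:
    """Toy model of a node reading pages from a remote RAM pool.

    `remote_pages` is a dict standing in for the pooled RAM; the small
    `local` cache stands in for the node's own RAM. LRU eviction is an
    assumed policy, chosen only because it is simple to show.
    """

    def __init__(self, remote_pages, local_capacity):
        self.remote = remote_pages
        self.local = OrderedDict()
        self.capacity = local_capacity
        self.hits = 0
        self.misses = 0

    def read(self, page_id):
        if page_id in self.local:
            # Served from local RAM; mark the page as most recently used.
            self.local.move_to_end(page_id)
            self.hits += 1
            return self.local[page_id]
        # Miss: fetch from the remote pool and keep a local copy.
        self.misses += 1
        data = self.remote[page_id]
        self.local[page_id] = data
        if len(self.local) > self.capacity:
            self.local.popitem(last=False)  # evict least recently used page
        return data


pool = {i: f"page-{i}" for i in range(4)}
node = CachedRemoteMemory(pool, local_capacity=2)
for page in (0, 1, 0, 2, 1):
    node.read(page)
print(node.hits, node.misses, list(node.local))
```

In a model like this, the hit rate depends on how well a job’s working set fits in local RAM, which is the kind of performance relationship the project sets out to measure for machine learning workloads.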

“Basically, it's a potential way to decrease the cost of building a supercomputer by requiring less per-node RAM while increasing its utilization statistics,” says Cornelius.

Cornelius’s project works toward understanding the performance relationship between properties of machine learning and RAN’s configurable, disaggregated memory. Ultimately, she would like to determine if and how machine learning might be made more performant at very large scales when using disaggregated memory.

While she has some preliminary results, the project is ongoing, and Cornelius is still running tests. She submitted a poster to SC18 and plans on submitting papers for publication later in the year. Cornelius says that this ALCF experience “has completely changed everything” about the direction of her academic and career paths.

“I think every single undergrad in a computer science program that is thinking about going to graduate school should apply for one of these internships,” she says. “They are such a good way to get really practical experience in open-ended CS problems.”

Communicating science at the ALCF

Effectively communicating science is paramount for the advancement of discoveries and the continuation of scientific research. At places like Argonne, where funding comes from the United States government, it is even more important to be able to talk—and write—about science in ways that are clear and accessible to non-scientists (and taxpayers).

Madeleine O’Keefe (that’s me!) addressed this in her internship with the ALCF communications team this summer.

I just finished up my Bachelor’s degree in astronomy at Boston University in May, but I will be returning in the fall for a one-year Master’s program in science journalism. This internship was an amazing opportunity to experience science writing in a new context—a national laboratory.

Unlike the other interns who had specific research problems to solve, I contributed to various outreach projects within ALCF communications, including writing one-page summaries of projects that use ALCF computing time. These “highlights” will be used for the ALCF’s annual Science Report. While it was often challenging to condense nuanced and esoteric science projects into a small space, I enjoyed learning about the huge range of science that is enabled by the facility’s computing resources.

In addition to the highlights, I wrote three feature articles (including this one) for the ALCF website. One was about the CodeGirls@Argonne program, a two-day summer camp aimed at immersing middle school girls in computer science. Another described the collaborative efforts between CERN’s ATLAS experiment and scientists from the ALCF.

For me, one of the biggest takeaways from this experience is a reinforced sense of how important it is to communicate science clearly. Whether at a national lab or in an elementary school classroom, whether it’s a 300-word story or 3,000, effective science communication is essential to the future of research and institutions like the ALCF.

2018 ALCF Summer Students

Joseph Adamo, University of Illinois at Urbana-Champaign
Mentors: Silvio Rizzi and Joe Insley

Amit Bashyal, Oregon State University
Mentor: Taylor Childers

Bennett Bernardoni, University of Illinois at Urbana-Champaign
Mentors: Silvio Rizzi and Joe Insley

James Bonasera, Northern Illinois University
Mentor: Joe Insley

Xin Cao, Stony Brook University
Mentor: Wei Jiang

Melanie Cornelius, Illinois Institute of Technology
Mentors: Brian Toonen and Lisa Childers

Chinmayi Dhangekar, University of Colorado Boulder
Mentor: Ramesh Balakrishnan

Jose Monsalve Diaz, University of Delaware
Mentor: Kalyan Kumaran

Blake Ehrenbeck, Illinois Institute of Technology
Mentor: Lisa Childers

Yuping Fan, Illinois Institute of Technology
Mentor: Paul Rich

Kavon Farvardin, The University of Chicago
Mentor: Hal Finkel

Samuel Foreman, University of Iowa
Mentor: James Osborn

Kevin Gasperich, University of Pittsburgh
Mentor: Anouar Benali

Aparna Gollakota, Loyola University Chicago
Mentors: George K. Thiruvathukal, Silvio Rizzi, and Joe Insley

Zhen Huang, Illinois Institute of Technology
Mentor: Lisa Childers

Iris Johnson, Northern Illinois University
Mentor: Hal Finkel

May-Myo Khine, Northern Illinois University
Mentor: Jini Ramprakash

Rob Kondratowicz, Northern Illinois University
Mentor: Joe Insley

Boyang Li, Illinois Institute of Technology
Mentor: Sudheer Chunduri

Chi "Garnett" Liu, Duke University
Mentor: Anouar Benali

Xiaoyang Lu, Illinois Institute of Technology
Mentor: Bill Allcock

Zheng Miao, Clemson University
Mentors: Sudheer Chunduri and Kevin Harms

Madeleine O'Keefe, Boston University
Mentors: Beth Cerny and Jim Collins

Jonathan Paprocki, Georgia Tech
Mentor: Yuri Alexeev

Genki Prayogo, Japan Advanced Institute of Science and Technology
Mentor: Anouar Benali

Bradley Protano, Northern Illinois University
Mentor: Ye Luo

Vedant Puri, University of Illinois at Urbana-Champaign
Mentor: Ramesh Balakrishnan

Siddhisaket Raskar, University of Delaware
Mentor: Kalyan Kumaran

Jack Salazar, Indiana University
Mentor: Kevin Harms

Aritra Sen, The University of Chicago
Mentor: Elise Jennings

Sergio Servantez, Illinois Institute of Technology
Mentors: Rick Zamora and François Tessier

Ruslan Shaydulin, Clemson University
Mentor: Yuri Alexeev

Shilpika, University of California, Davis
Mentor: Venkat Vishwanath

Zhi Shuai, Carnegie Mellon University
Mentor: Wei Jiang

Alex Stocker, Bradley University
Mentor: Joe Insley

Myrline Sylveus, Northern Illinois University
Mentor: Jini Ramprakash

Anish Thakur, Temple University
Mentor: Kevin Harms

Zhongkai Wen, University of Illinois at Chicago
Mentor: Tom Uram

Spencer Williams, Stony Brook University
Mentor: Taylor Childers

Qi Wu, University of Utah
Mentors: Silvio Rizzi and Joe Insley

Xin-Chuan Wu, The University of Chicago
Mentor: Yuri Alexeev

Yamin Xu, Northern Illinois University
Mentor: Joe Insley

Yuliana Zamora, The University of Chicago
Mentor: Venkat Vishwanath

Ning Zhang, Illinois Institute of Technology
Mentor: Bill Allcock

Bumeng Zhuo, The University of Chicago
Mentor: Elise Jennings

Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America’s scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science.

The U.S. Department of Energy’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit the Office of Science website.