Towards computational design of molecules with desired properties

“Freedom of design” in chemical compound space

Graphical depiction of the rational molecular design process, which involves a “needle-in-a-haystack” search for molecules with a desired set of properties. (Image: Leonardo Medrano Sandonas of the University of Luxembourg; background image from Freepik)

Using ALCF supercomputing resources, a multi-institutional team has introduced a novel approach, called “freedom of design,” that can help accelerate the discovery of new molecules with targeted properties.

Researchers from the University of Luxembourg, Cornell University, and the U.S. Department of Energy’s (DOE) Argonne National Laboratory have introduced a novel “freedom of design” principle in the chemical compound space (CCS) – the unfathomably vast space populated by all possible atomic compositions and their geometries. The team then showed that this principle has important implications for enabling the rational design of molecules with a desired set of properties.

The exploration of the remarkably vast space of molecules and materials with data-driven approaches has inspired countless academic and industrial initiatives to seek out the fundamental relationships that exist between the structural signatures of molecules and their physical and/or chemical properties. While there has been significant progress in this area, a comprehensive understanding of these complex relationships—even in the more manageable sector of CCS spanned by small molecules—was still lacking despite the critical importance and high relevance of such molecules throughout the chemical and pharmaceutical sciences.

“Unraveling complex relationships between molecular structures and properties would not only provide us with the tools needed to explore and characterize the molecular space, but it would also greatly advance our ability to rationally design molecules with a targeted array of physicochemical properties,” says Alexandre Tkatchenko, professor of Theoretical Chemical Physics in the Department of Physics and Materials Science at the University of Luxembourg.

Weak correlations enable “freedom of design”

The team's paper, "'Freedom of Design' in Chemical Compound Space: Towards Rational in Silico Design of Molecules with Targeted Quantum-Mechanical Properties," was recently published in the journal Chemical Science. One of their key findings was that most molecular properties are only weakly correlated and therefore effectively independent.

“While one might view this as a challenge in the field of rational molecular design, we demonstrate that this finding highlights an intrinsic flexibility – or ‘freedom of design’ – that exists in the chemical compound space, wherein there are very few limitations which prevent markedly distinct molecules from sharing multiple important properties,” says Robert DiStasio Jr., professor of Theoretical Chemistry at Cornell University.
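As a rough illustration of what "weakly correlated" means here, the sketch below computes a Pearson correlation coefficient between pairs of property lists. The property names and values are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from the study, and the article does not say which correlation measure the team used.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the mean (unnormalized covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the summed squared deviations (unnormalized std devs).
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


# Hypothetical property values for four molecules (arbitrary units).
property_a = [1.0, 2.0, 3.0, 4.0]
property_b = [2.0, 4.0, 6.0, 8.0]    # rises in lockstep with property_a
property_c = [1.0, -1.0, -1.0, 1.0]  # no linear trend with property_a

r_strong = pearson(property_a, property_b)
r_weak = pearson(property_a, property_c)
print(f"a vs b: {r_strong:.2f}")
print(f"a vs c: {r_weak:.2f}")
```

A coefficient near zero, as for the second pair, means that knowing one property says little about the other, so in principle the two can be tuned largely independently.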

Searching for optimal pathways in chemical space

To explore how this intrinsic flexibility manifests in the molecular design process, which often involves the simultaneous optimization of multiple physicochemical properties, the team used Pareto multi-property optimization to search for molecules with both a large molecular polarizability and a large electronic gap. This design task is relevant to identifying novel molecules for polymeric batteries. The researchers found paths through chemical space consisting of several unexpected molecules connected by structural and/or compositional changes, reflecting the freedom in the rational design and discovery of molecules with targeted property values.
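The idea behind a Pareto search over two properties can be sketched in a few lines. The example below is a minimal illustration, not the team's workflow: the molecule labels and the (polarizability, gap) values are hypothetical, and the brute-force filter simply keeps every candidate that no other candidate beats in both properties at once.

```python
from typing import List, Tuple

# (label, polarizability, electronic gap); hypothetical values in arbitrary units.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b in both properties and strictly better in one."""
    no_worse = a[1] >= b[1] and a[2] >= b[2]
    strictly_better = a[1] > b[1] or a[2] > b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that are not dominated by any other candidate."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


molecules: List[Candidate] = [
    ("mol_A", 60.0, 9.0),
    ("mol_B", 80.0, 7.5),
    ("mol_C", 70.0, 7.0),  # worse than mol_B in both properties
    ("mol_D", 95.0, 5.0),
    ("mol_E", 55.0, 8.5),  # worse than mol_A in both properties
]

front = pareto_front(molecules)
front_names = [name for name, _, _ in front]
print(front_names)
```

Each molecule left on the front represents a different trade-off between the two properties: improving one of them further would require giving up some of the other.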

“A potentially interesting next step would be to use these Pareto-optimal structures in conjunction with powerful machine learning approaches to build reliable multi-objective frameworks for a systematic navigation of hitherto unexplored chemical spaces,” says Tkatchenko.

Implications for the molecular design paradigm

“By demonstrating that 'freedom of design' is a fundamental and emergent property of CCS, our work has a number of important implications in the fields of rational molecular design and computational drug discovery. For one, we hope this work will challenge the chemical sciences community to consider how such intrinsic flexibility can be used to extend the dominant paradigm in the forward molecular design process. We also hope that this work will enable substantive progress towards solving the inverse molecular design problem, in which one seeks to find a molecule (or set of molecules) corresponding to a targeted array of properties,” explains Dr. Leonardo Medrano Sandonas, postdoctoral researcher in the Theoretical Chemical Physics group at the University of Luxembourg.

The combination of the insights gained from this work with advanced machine learning approaches could aid in the development of effective strategies for high-throughput screening of novel molecules tailored to a specific application.

The research team used the high-performance computing resources of the Argonne Leadership Computing Facility (ALCF), a DOE Office of Science user facility.

This article was originally published by the University of Luxembourg.