Molecular design with generative artificial intelligence

Author: Gisbert Schneider

Artificial intelligence is set to revolutionize molecular design

Generative artificial intelligence (AI) supports various fields of scientific research and has already been successfully applied to pharmaceutical and chemical molecular design. In fact, molecular design with AI has become a routine task. During model training, these software tools, predominantly deep artificial neural networks, form an internal representation of molecular structures. This allows for the generation (“design”) of novel molecules that possess properties derived from the training data. In other words, these algorithms learn the syntax of the training molecules and, after model training, use the underlying probabilities to construct new chemical structures following the learned chemical syntax. Chemical language models are among the most notable examples of generative AI in drug design. When these models are integrated with biological activity predictions, chemical synthesis, and biological testing, new bioactive molecules can be swiftly identified and optimized through design-make-test-analyze cycles.1 By designing novel molecules rather than searching for existing ones, generative AI helps manage the vast number of theoretically possible molecules and provides fresh ideas to chemists.2 Most of the AI tools developed by scientists for this purpose are freely accessible to the public.
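As a rough illustration of what it means to learn a chemical syntax and then sample from the learned probabilities, the following Python sketch trains a character-level bigram model on a handful of SMILES strings. It is not one of the deep neural network models discussed here. The toy training set, the function names, and the single-character tokenization (which would split two-letter elements such as Cl or Br) are assumptions made for brevity, and the sampled strings are not guaranteed to be valid molecules.

```python
import random
from collections import defaultdict

# Hypothetical start/end markers; neither symbol occurs in the toy SMILES below.
START, END = "^", "$"


def train_bigram_model(smiles_list):
    """Count how often each character follows another in the training strings."""
    counts = defaultdict(lambda: defaultdict(int))
    for smiles in smiles_list:
        tokens = [START] + list(smiles) + [END]
        for previous, following in zip(tokens, tokens[1:]):
            counts[previous][following] += 1
    return {prev: dict(nxt) for prev, nxt in counts.items()}


def sample_string(model, rng, max_len=40):
    """Build a new string character by character from the learned counts."""
    output = []
    current = START
    while len(output) < max_len:
        choices = model[current]
        symbols = list(choices)
        weights = [choices[s] for s in symbols]
        current = rng.choices(symbols, weights=weights, k=1)[0]
        if current == END:
            break
        output.append(current)
    return "".join(output)


# Toy training set (assumption): a few small molecules written as SMILES.
training = ["CCO", "CCN", "CCC", "CCCO", "CC(=O)O", "CC(C)O"]
model = train_bigram_model(training)

rng = random.Random(0)  # fixed seed so that repeated runs give the same samples
samples = [sample_string(model, rng) for _ in range(5)]
for s in samples:
    print(s)
```

Because the model only knows which character tends to follow which, it can produce strings with unbalanced brackets. Chemical language models based on neural networks capture much longer-range context, and generated strings are usually checked with a cheminformatics toolkit before further use.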

Artificial intelligence facilitates the swift identification of medicinal drug candidates

In pharmaceutical sciences, generative AI is emerging as a cornerstone for medicinal drug discovery, propelled by recent technological advances and the availability of appropriate datasets and software. A particular appeal of certain AI models lies in solving multi-dimensional optimization tasks, specifically designing drug candidates with minimized unwanted side-effects (off-target activities), and efficiently filtering out candidates with a high likelihood of failure or expected liabilities.3 Various computational approaches are now readily accessible. Successful applications span all aspects of drug discovery, from developing new enzyme inhibitors to designing molecules that selectively modulate membrane protein activities and beyond. These innovations are intended as therapeutic agents to combat diseases effectively.
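One simple way to picture such a multi-dimensional selection is a Pareto filter, which keeps only those candidates that no other candidate beats on every objective at once. The Python sketch below is a minimal example under stated assumptions: the candidate names and scores are invented for illustration, the two objectives (predicted on-target and off-target activity) stand in for whatever properties a real project would score, and the approach is not taken from the cited work.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    on_target: float   # hypothetical predicted on-target activity, higher is better
    off_target: float  # hypothetical predicted off-target activity, lower is better


def dominates(a, b):
    """Return True if a is at least as good as b on both objectives and better on one."""
    no_worse = a.on_target >= b.on_target and a.off_target <= b.off_target
    better = a.on_target > b.on_target or a.off_target < b.off_target
    return no_worse and better


def pareto_front(candidates):
    """Keep the candidates that are not dominated by any other candidate."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Invented example scores (assumption), not measured or published data.
candidates = [
    Candidate("mol_A", 0.90, 0.40),
    Candidate("mol_B", 0.80, 0.10),
    Candidate("mol_C", 0.70, 0.30),
    Candidate("mol_D", 0.95, 0.60),
]

front = pareto_front(candidates)
for c in front:
    print(c.name, c.on_target, c.off_target)
```

Here mol_C is dropped because mol_B is both more active on the target and less active off target, while the remaining three candidates represent different trade-offs that a chemist would still have to weigh.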

The chemical synthesis of computer-generated compounds remains challenging

While AI models for drug design are rapidly evolving, thanks to many highly active research teams across the globe, chemistry remains a bottleneck for AI-driven drug discovery. The chemical synthesis of these computer-generated molecules is rarely straightforward, and some of the molecules cannot be obtained at all, or can be obtained only after lengthy synthesis efforts. Recently, deep learning models have been developed to accurately predict the chemical reactivity of drug-like compounds, thereby facilitating the optimization of clinical candidates.4 Current developments focus on automating the design of molecules that can be synthesized with minimal effort, ideally in a fully automated laboratory, to expedite the discovery process and minimize errors.1 Successful applications include the design and synthesis of selective modulators for G protein-coupled receptors, kinases, and transcription factors. Additionally, there are efforts to convert complex, pharmaceutically active natural products into small-molecule mimetics that are easier to synthesize.

Generative AI designs ligands tailored to specific protein structures

The latest feat of AI in drug discovery is the design of new bioactive molecules by starting from a three-dimensional protein structure.5 For a given target protein, the algorithm generates blueprints for potential drug molecules that can either enhance or inhibit the protein’s activity. All the algorithm requires is the surface structure of a protein. Based on this information, it designs molecules (ligands) that bind specifically to the protein. This generative AI can thus propose potential drug molecules without human intervention. To develop this algorithm, an AI model was trained using data from hundreds of thousands of known interactions between synthetic ligand molecules and their corresponding protein structures, the so-called “interactome”. Training on known synthetic ligands helps ensure from the outset, as much as possible, that the generated molecules can be chemically synthesized. Additionally, the algorithm preferentially suggests molecules that interact specifically with the target protein at the desired site, with minimal interaction with other proteins. A first application was the design of potential new diabetes drugs that act by activating a certain transcription factor. The AI immediately designed novel molecules that, like currently available drugs, increase the activity of the transcription factor, but without the lengthy discovery process.

Chemistry-savvy AI technology can be misused

These selected examples highlight the efficiency and versatility of generative AI in drug discovery. Beyond pharmaceutical applications, generative “chemical AI” extends to discovering new materials, catalysts, and macromolecular structures, promising significant advancements in these fields. In the coming years, we can expect numerous successful applications that will enhance the design of innovative chemical compounds.

However, the most critical aspect of molecular design lies in selecting desired candidates. In medicinal drug discovery, the aim is to identify “good” molecules while eliminating “bad” ones early. If selection criteria are inverted, or if potentially harmful or cytotoxic molecular structures are used in model development, the AI may generate new molecules with these undesirable properties. Can these models therefore also be used to deliberately develop new toxic or deadly substances? The answer is unequivocally “yes”. The capability of generative AI to create novel molecules extends beyond beneficial applications like drug discovery to include the generation of substances with harmful properties, depending on the criteria and constraints set during their development.6 These models can potentially be used to develop substances that pose a danger to life on our planet, including chemical and biological weapons. Importantly, this outcome is not the work of a “killer AI”, but rather hinges entirely on the constraints set by the user.

Synthetic biology is poised to become the next major frontier for AI in the life sciences. The potential benefits and risks associated with AI-generated synthetic biosystems and whole organisms are significantly amplified compared to small molecule drugs. We are entering an incredibly exciting yet potentially perilous era in applied AI within the life sciences. This shift goes beyond ethical challenges associated with intervening in living organisms; it involves the creation and modification of entirely new life forms using advanced computational models. Therefore, while generative AI offers tremendous potential for innovation, it also necessitates careful consideration and ethical oversight to ensure responsible use. Efforts led solely by scientists may be insufficient to prevent improper use of this technology. At the same time, government regulation of AI in research could hinder the development of urgently needed medicinal drugs and materials that AI-designed molecules could provide.7 It is the joint responsibility of the individual user, the scientific community, and the authorities to prevent misuse at all costs.

1. Schneider, G. (2018) Automating drug discovery. Nature Reviews Drug Discovery 17, 97–113.

2. Schneider, G. (2019) Mind and machine in drug design. Nature Machine Intelligence 1, 128–130.

3. Allenspach, S., Hiss, J. A. & Schneider, G. (2024) Neural multi-task learning in drug design. Nature Machine Intelligence 6, 124–137.

4. Nippa, D. F., Atz, K., Hohler, R., Müller, A. T., Marx, A., Bartelmus, C., Wuitschik, G., Marzuoli, I., Jost, V., Wolfard, J., Binder, M., Stepan, A. F., Konrad, D. B., Grether, U., Martin, R. E. & Schneider, G. (2024) Enabling late-stage drug diversification by high-throughput experimentation with geometric deep learning. Nature Chemistry 16, 239–248.

5. Atz, K., Cotos Muñoz, L., Isert, C., Håkansson, M., Focht, D., Hilleke, M., Nippa, D. F., Iff, M., Ledergerber, J., Schiebroek, C. C. G., Romeo, V., Hiss, J. A., Merk, D., Schneider, P., Kuhn, B., Grether, U. & Schneider, G. (2024) Prospective deep interactome learning for de novo drug design. Nature Communications 15, 3408.

6. Urbina, F., Lentzos, F., Invernizzi, C. & Ekins, S. (2022) Dual use of artificial intelligence-powered drug discovery. Nature Machine Intelligence 4, 189–191.

7. Callaway, E. (2024) Could AI-designed proteins be weaponized? Scientists lay out safety guidelines. Nature 627, 478.
