Enzyme-based biocatalysis is a method of choice for synthesizing molecules in green chemistry, but it remains complicated by the difficulty of matching the right enzyme to the right reaction. Researchers at IBM are proposing an artificial intelligence model that identifies the best synthetic route to new molecules.
Using artificial intelligence (AI) to synthesize molecules in a less polluting way: that is the proposal made by a team of researchers from IBM. Their work, published in Nature Communications on February 18, builds on RoboRXN, a machine housed deep inside IBM’s Zürich Lab. Controllable from the cloud, this machine is designed to carry out the sequence of operations that an AI, RXN for Chemistry, establishes through the retrosynthesis of a chemical compound. The researchers used models to predict synthetic pathways that include biocatalyzed reactions, i.e. reactions that use enzymes.
RoboRXN is IBM’s “chemical kitchen robot”. On an online platform, the user draws the molecule they wish to synthesize along with the starting precursor, and the system computes the simplest “recipe” to obtain it. The recipe is then carried out in the automated laboratory. © Emilie Dedieu
Enzymes offer enormous advantages in the synthesis of molecules. Not only do enzyme-catalyzed reactions allow easy recovery of the products, but above all they can be carried out in water at room temperature, which greatly reduces waste production and the use of toxic components. In addition, because enzymes are made up entirely of amino acids, they are themselves biodegradable and therefore easy to dispose of at the end of the operation.
Their use for greener chemistry is not new, and it attracts strong interest in recycling. Over the past twenty years, enzymes such as xylanases have become widely used in the bleaching of kraft pulp, making it possible to limit the use of chlorine or bleach in paper production. However, each enzyme is specific to a particular chemical reaction, and their sheer number (an estimated 75,000 or so in the human body alone) makes it difficult to apply them across many industrial fields.
Transfer learning
To solve this problem, the research team trained its new model on a database of enzymatic biocatalysis reactions, which lets it select the right enzyme and the right substrate for a given reaction. The team relied on transfer learning: a model first trained on a broader set of chemical reactions was then trained specifically on biocatalysis data. With this approach, the researchers obtained an accuracy of up to nearly 50% in synthesis prediction and 40% in retrosynthesis.
“The lack of available data to train our model still significantly affects its accuracy,” explains Daniel Probst, first author of the study, in a blog post. “However, a user with access to specific subclasses of enzymatic reactions they would like to work on could use them to refine our model and increase its predictive power.”
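The transfer-learning idea described above can be illustrated with a deliberately tiny sketch. It does not reproduce the study's model or data: the linear model, the two synthetic datasets and every name below are hypothetical stand-ins. It only shows the general pattern of pretraining on a broad dataset, then fine-tuning on a small, related one, and comparing that with training on the small dataset alone.

```python
# Toy illustration of transfer learning (hypothetical data and model,
# not the model from the study).


def mse(data, w, b):
    """Mean squared error of the line y = w*x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(data, w, b, steps, lr=0.1):
    """Full-batch gradient descent on the MSE, starting from (w, b)."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# "Broad" task with many examples (stand-in for general chemical reactions).
broad = [(i / 10, 2 * (i / 10) + 1.0) for i in range(11)]
# "Specialized" task with few examples, related but shifted
# (stand-in for the smaller biocatalysis dataset).
special = [(x, 2 * x + 1.5) for x in (0.2, 0.5, 0.8)]

# 1) Pretrain on the broad data, starting from zero.
w_pre, b_pre = train(broad, 0.0, 0.0, steps=2000)

# 2) Fine-tune: start from the pretrained weights, few steps on the small data.
w_ft, b_ft = train(special, w_pre, b_pre, steps=20)

# Baseline: same small training budget, but starting from zero.
w_sc, b_sc = train(special, 0.0, 0.0, steps=20)

fine_tuned_loss = mse(special, w_ft, b_ft)
scratch_loss = mse(special, w_sc, b_sc)
print(f"fine-tuned loss:   {fine_tuned_loss:.4f}")
print(f"from-scratch loss: {scratch_loss:.4f}")
```

In this toy setting the fine-tuned model ends with a lower error than the one trained from scratch on the same small budget, because it starts from weights that already fit a closely related task. The same logic underlies the suggestion in the quote above: a user with their own specialized reaction data could continue training from an existing model rather than start over.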