In a joint project between the Zelinsky Institute of Organic Chemistry (Russian Academy of Sciences) and Skoltech, a research group led by RAS Academician Valentin Ananikov has developed a machine learning-based search engine for analyzing vast amounts of high-resolution mass spectrometry data. Machine learning makes it possible to explore terabytes of accumulated data without running new experiments. The algorithm accelerates the search for new compounds, reduces costs, and makes research more environmentally friendly. The study was published in Nature Communications.
In a typical laboratory, terabytes of data accumulate over several years, for example from high-resolution mass spectrometry measurements. Because manual analysis is so limited, scientists examine only a small part of this information. Up to 95% of the accumulated data remains unexplored, which means potentially important discoveries are lost. Processing that much information manually would take hundreds of years, whereas new AI-based algorithms can complete the analysis in just a few days.
“Our work is based on an innovative algorithm that combines machine learning with analysis of the signal distribution in mass spectra, which has significantly reduced false positives when identifying chemical compounds. The new search algorithm has successfully verified historical data on the Mizoroki-Heck reaction and revealed not only known chemical transformations but also completely new ones, including a unique cross-coupling process that has not been previously documented in the scientific literature,” commented Valentin Ananikov, the scientific supervisor of the study.
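The article does not describe the algorithm's internals, so the following is only a minimal sketch of how checking a signal distribution could cut false positives. It assumes that "signal distribution" refers to the expected pattern of related peaks for a candidate ion (for example, its isotopic pattern). The function name `pattern_similarity`, the 5 ppm tolerance, and all peak values are hypothetical and chosen for illustration; this is not the authors' method.

```python
import math

def pattern_similarity(theoretical, observed, ppm_tol=5.0):
    """Cosine similarity between an expected peak pattern and a spectrum.

    theoretical, observed: lists of (m/z, intensity) pairs.
    For each expected peak, the closest observed peak within ppm_tol
    is used; an expected peak with no match contributes intensity 0.
    """
    matched = []
    for mz, _ in theoretical:
        window = mz * ppm_tol * 1e-6  # absolute m/z tolerance
        candidates = [
            (abs(o_mz - mz), o_int)
            for o_mz, o_int in observed
            if abs(o_mz - mz) <= window
        ]
        # min() picks the candidate with the smallest m/z distance
        matched.append(min(candidates)[1] if candidates else 0.0)

    expected = [intensity for _, intensity in theoretical]
    dot = sum(e * m for e, m in zip(expected, matched))
    norm = math.sqrt(sum(e * e for e in expected)) * math.sqrt(
        sum(m * m for m in matched)
    )
    return dot / norm if norm else 0.0

# Made-up pattern: two related peaks with a 2:1 intensity ratio.
pattern = [(100.0, 1.0), (101.0, 0.5)]

# Spectrum containing both peaks in the right ratio, plus an unrelated peak.
full_hit = [(100.0001, 200.0), (101.0002, 100.0), (150.0, 999.0)]

# Spectrum containing only the first peak: a mass-only search would accept it.
partial_hit = [(100.0001, 200.0), (150.0, 999.0)]

score_full = pattern_similarity(pattern, full_hit)
score_partial = pattern_similarity(pattern, partial_hit)
```

In this sketch, a spectrum that matches only one peak by mass scores lower than one that reproduces the whole pattern, so a similarity threshold can reject candidates that a mass-only lookup would accept.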
During organic synthesis, chemists select specific experimental conditions to optimize the reaction and obtain the best possible result. After the reaction and sample preparation, the chemical composition of the sample is determined and characterized with an analytical system. High-resolution mass spectrometry is often used for this step because of its fast analysis, high sensitivity, and easy data accumulation. The method is widely used in analytical chemistry, organic and inorganic chemistry, proteomics, metabolomics, materials science, and many other fields.
The search engine opens up new possibilities in chemical research. It can analyze data from different fields of chemistry, which can lead to the discovery of new reactions, catalysts, and mechanisms. Reusing existing data not only accelerates scientific progress but also reduces the cost of experiments and makes science more environmentally friendly.
The study was carried out at the Zelinsky Institute of Organic Chemistry of the Russian Academy of Sciences and at the Skoltech Energy Center.