Elucidation of Molecular Substructures from Nuclear Magnetic Resonance Spectra Using Gradient Boosting

Berman, J., Aperstein, Y., & Yosipof, A. (2024). Elucidation of Molecular Substructures from Nuclear Magnetic Resonance Spectra Using Gradient Boosting. In: Artificial Neural Networks and Machine Learning – ICANN 2024. Eds. Wand, M., Malinovská, K., Schmidhuber, J., & Tetko, I.V., pp. 31-42. Springer. DOI: 10.1007/978-3-031-72359-9_3.

Full text not available from this repository.

Abstract

Elucidating molecular structures from nuclear magnetic resonance (NMR) spectra poses a complex problem in the field of cheminformatics, namely the inverse problem. Typically, the quantitative structure-activity relationship (QSAR) process uses molecular structure features to produce a predictive model for molecular activity. In the case of the inverse problem, features derived from the molecular activity are used to produce a predictive model for the molecular structure. This work demonstrates the use of NMR-derived features to elucidate molecular configuration by learning the intricate structure-spectrum relationships. We propose a machine learning approach using gradient boosting to predict structural motifs directly from NMR data. A dataset of 6,356 compounds that includes both 1H and 13C spectra was collected from the NMRShiftDB2 database. The dataset was pre-processed into matrices capturing chemical shifts, multiplicities, and peak intensities. XGBoost classifiers were trained to correlate these spectroscopic signature matrices with molecular substructures represented as MACCS keys. We evaluated the model performance on the full dataset and on a constrained chemical-space subset. The results indicated that the model’s capacity to associate spectral features with functional groups and other structural elements enables inference of molecular motifs solely from NMR inputs. The proposed model can help automate structure elucidation from NMR spectra and accelerate chemical discovery.
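
The abstract says spectra were pre-processed into matrices capturing chemical shifts, multiplicities, and peak intensities, but the exact layout is not given in this record. The following is a minimal sketch of one possible encoding, not the authors' method: the `Peak` record, the multiplicity vocabulary, the ppm window, the bin count, and the example peaks are all assumptions made for illustration. Each row is a chemical-shift bin, column 0 sums peak intensity, and the remaining columns count peaks per multiplicity.

```python
from collections import namedtuple

# Hypothetical peak record; field names are illustrative, not from the paper.
Peak = namedtuple("Peak", ["shift_ppm", "multiplicity", "intensity"])

# Assumed multiplicity vocabulary: singlet, doublet, triplet, quartet, multiplet.
MULTIPLICITIES = ["s", "d", "t", "q", "m"]


def encode_spectrum(peaks, ppm_min, ppm_max, n_bins):
    """Bin peaks by chemical shift into an n_bins x (1 + len(MULTIPLICITIES)) matrix.

    Column 0 accumulates peak intensity; the remaining columns count the
    peaks of each multiplicity that fall into the bin.
    """
    width = (ppm_max - ppm_min) / n_bins
    matrix = [[0.0] * (1 + len(MULTIPLICITIES)) for _ in range(n_bins)]
    for peak in peaks:
        if not (ppm_min <= peak.shift_ppm <= ppm_max):
            continue  # ignore peaks outside the assumed window
        row = min(int((peak.shift_ppm - ppm_min) / width), n_bins - 1)
        matrix[row][0] += peak.intensity
        if peak.multiplicity in MULTIPLICITIES:
            col = 1 + MULTIPLICITIES.index(peak.multiplicity)
            matrix[row][col] += 1.0
    return matrix


def flatten(matrix):
    """Row-major flattening into a single feature vector for a classifier."""
    return [value for row in matrix for value in row]


# Illustrative 1H peak list (made-up values, not data from the paper).
example_peaks = [
    Peak(1.25, "t", 3.0),
    Peak(3.70, "q", 2.0),
    Peak(7.26, "s", 1.0),
]
features = flatten(
    encode_spectrum(example_peaks, ppm_min=0.0, ppm_max=12.0, n_bins=12)
)
```

Under these assumptions, a flattened vector of this kind could serve as the input row for one binary classifier per MACCS key, for example an `XGBClassifier` from the `xgboost` package. How the authors actually combined the 1H and 13C matrices is not stated here.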

Item Type: Book Section
Research Programs: Advancing Systems Analysis (ASA)
Advancing Systems Analysis (ASA) > Cooperation and Transformative Governance (CAT)
Depositing User: Luke Kirwan
Date Deposited: 18 Sep 2024 14:41
Last Modified: 18 Sep 2024 14:41
URI: https://pure.iiasa.ac.at/19991
