
The METLIN small molecule dataset for machine learning-based retention time prediction

Xavier Domingo-Almenara, Carlos Guijas, Elizabeth Billings, J. Rafael Montenegro-Burke, Winnie Uritboonthai, Aries E. Aisporna, Emily Chen, H. Paul Benton and Gary Siuzdak
Additional contact information
Xavier Domingo-Almenara: The Scripps Research Institute
Carlos Guijas: The Scripps Research Institute
Elizabeth Billings: The Scripps Research Institute
J. Rafael Montenegro-Burke: The Scripps Research Institute
Winnie Uritboonthai: The Scripps Research Institute
Aries E. Aisporna: The Scripps Research Institute
Emily Chen: The Scripps Research Institute
H. Paul Benton: The Scripps Research Institute
Gary Siuzdak: The Scripps Research Institute

Nature Communications, 2019, vol. 10, issue 1, 1-9

Abstract: Machine learning has been extensively applied in small molecule analysis to predict a wide range of molecular properties and processes including mass spectrometry fragmentation or chromatographic retention time. However, current approaches for retention time prediction lack sufficient accuracy due to limited available experimental data. Here we introduce the METLIN small molecule retention time (SMRT) dataset, an experimentally acquired reverse-phase chromatography retention time dataset covering up to 80,038 small molecules. To demonstrate the utility of this dataset, we deployed a deep learning model for retention time prediction applied to small molecule annotation. Results showed that in 70% of the cases, the correct molecular identity was ranked among the top 3 candidates based on their predicted retention time. We anticipate that this dataset will enable the community to apply machine learning or first principles strategies to generate better models for retention time prediction.

Date: 2019

Downloads: (external link)
https://www.nature.com/articles/s41467-019-13680-7 Abstract (text/html)


Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:10:y:2019:i:1:d:10.1038_s41467-019-13680-7

Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/

DOI: 10.1038/s41467-019-13680-7

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-19
Handle: RePEc:nat:natcom:v:10:y:2019:i:1:d:10.1038_s41467-019-13680-7