Ultra-fast and accurate electron ionization mass spectrum matching for compound identification with million-scale in-silico library
Qiong Yang,
Hongchao Ji,
Zhenbo Xu,
Yiming Li,
Pingshan Wang,
Jinyu Sun,
Xiaqiong Fan,
Hailiang Zhang,
Hongmei Lu and
Zhimin Zhang
Additional contact information
Qiong Yang: Central South University
Hongchao Ji: Chinese Academy of Agricultural Sciences
Zhenbo Xu: Central South University
Yiming Li: Central South University
Pingshan Wang: Central South University
Jinyu Sun: Central South University
Xiaqiong Fan: Central South University
Hailiang Zhang: Central South University
Hongmei Lu: Central South University
Zhimin Zhang: Central South University
Nature Communications, 2023, vol. 14, issue 1, 1-11
Abstract:
Spectrum matching is the most common method for compound identification in mass spectrometry (MS). However, its efficiency is limited by the coverage of spectral libraries and by the accuracy and speed of matching. In this study, a million-scale in-silico electron ionization mass spectrum (EI-MS) library is established. Furthermore, an ultra-fast and accurate spectrum matching method (FastEI) is proposed, which substantially improves accuracy using Word2vec spectral embedding and boosts speed using the hierarchical navigable small-world graph (HNSW). It achieves 80.4% recall@10 accuracy (88.3% with a 5 Da mass filter) with a speedup of two orders of magnitude compared with the weighted cosine similarity (WCS) method. When FastEI is applied to identify molecules beyond the NIST 2017 library, it achieves 50% recall@1 accuracy. FastEI is packaged as standalone, user-friendly software for users with limited computational backgrounds. Overall, FastEI combined with a million-scale in-silico library facilitates compound identification as an accurate and ultra-fast tool.
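The abstract names weighted cosine similarity (WCS) as the baseline and recall@k as the accuracy metric, but this record does not define either. The following is a minimal Python sketch of both, under stated assumptions: spectra are represented as dictionaries mapping integer m/z to intensity, each peak is weighted as m/z raised to one power times intensity raised to another, and the default exponents are illustrative values only, not taken from the paper. The function names are hypothetical. This is a brute-force baseline that scores the query against every library entry; it is not the FastEI method, which the abstract describes as using Word2vec embeddings and an HNSW index.

```python
import math


def weighted_cosine(spec_a, spec_b, mz_power=3.0, intensity_power=0.6):
    """Weighted cosine similarity between two spectra.

    Each spectrum is a dict {integer m/z: intensity}. The exponents are
    illustrative assumptions, not values confirmed by the source.
    """
    def weigh(spec):
        # Weight each peak as (m/z ** mz_power) * (intensity ** intensity_power).
        return {mz: (mz ** mz_power) * (inten ** intensity_power)
                for mz, inten in spec.items()}

    wa, wb = weigh(spec_a), weigh(spec_b)
    # Only peaks present in both spectra contribute to the dot product.
    dot = sum(w * wb[mz] for mz, w in wa.items() if mz in wb)
    norm_a = math.sqrt(sum(w * w for w in wa.values()))
    norm_b = math.sqrt(sum(w * w for w in wb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_library(query, library, top_k=10):
    """Return the names of the top_k library spectra most similar to query.

    library is a dict {compound name: spectrum}. Every entry is scored,
    so the cost grows linearly with library size.
    """
    scored = [(weighted_cosine(query, spec), name)
              for name, spec in library.items()]
    # Sort by descending score; break ties by name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


def recall_at_k(queries, library, k):
    """Fraction of queries whose true compound appears in the top k hits.

    queries is a list of (true compound name, query spectrum) pairs.
    """
    hits = sum(1 for true_name, spec in queries
               if true_name in rank_library(spec, library, k))
    return hits / len(queries)
```

Under this definition, a recall@10 of 80.4% would mean that the correct compound appears among the ten highest-ranked library entries for 80.4% of the query spectra. The linear scan in `rank_library` is the cost that an approximate nearest-neighbor index such as HNSW is designed to avoid at million-scale library sizes.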
Date: 2023
Downloads: (external link)
https://www.nature.com/articles/s41467-023-39279-7 Abstract (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-39279-7
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-023-39279-7
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.