Retrosynthesis prediction with an iterative string editing model
Yuqiang Han,
Xiaoyang Xu,
Chang-Yu Hsieh,
Keyan Ding,
Hongxia Xu,
Renjun Xu,
Tingjun Hou,
Qiang Zhang and
Huajun Chen
Additional contact information
Yuqiang Han: Zhejiang University
Xiaoyang Xu: Zhejiang University
Chang-Yu Hsieh: Zhejiang University
Keyan Ding: Zhejiang University
Hongxia Xu: Zhejiang University
Renjun Xu: Zhejiang University
Tingjun Hou: Zhejiang University
Qiang Zhang: Zhejiang University
Huajun Chen: Zhejiang University
Nature Communications, 2024, vol. 15, issue 1, 1-16
Abstract:
Retrosynthesis is a crucial task in drug discovery and organic synthesis, where artificial intelligence (AI) is increasingly employed to expedite the process. However, existing approaches employ token-by-token decoding methods to translate target molecule strings into corresponding precursors, exhibiting unsatisfactory performance and limited diversity. As chemical reactions typically induce local molecular changes, reactants and products often overlap significantly. Inspired by this fact, we propose reframing single-step retrosynthesis prediction as a molecular string editing task, iteratively refining target molecule strings to generate precursor compounds. Our proposed approach involves a fragment-based generative editing model that uses explicit sequence editing operations. Additionally, we design an inference module with reposition sampling and sequence augmentation to enhance both prediction accuracy and diversity. Extensive experiments demonstrate that our model generates high-quality and diverse results, achieving superior performance with a promising top-1 accuracy of 60.8% on the standard benchmark dataset USPTO-50K.
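The overlap between product and reactant strings that motivates the editing view can be illustrated with a small sketch. This is not the authors' model: it uses the standard-library difflib alignment as a stand-in for learned editing operations, a simplified regex tokenizer that is an assumption made here, and a hand-picked example reaction (N-methylacetamide from acetyl chloride and methylamine). The function names are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Simplified SMILES tokenizer (an assumption for illustration): bracket atoms,
# two-letter halogens, two-digit ring closures, then single characters.
TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[()=#\-+\\/.:@~*$]")


def tokenize(smiles):
    """Split a SMILES string into tokens, rejecting characters we cannot parse."""
    tokens = TOKEN_RE.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"untokenizable SMILES: {smiles!r}")
    return tokens


def edit_script(product, reactants):
    """Return (tag, i1, i2, new_tokens) steps that turn product into reactants."""
    src, tgt = tokenize(product), tokenize(reactants)
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    return [(tag, i1, i2, tgt[j1:j2]) for tag, i1, i2, j1, j2 in matcher.get_opcodes()]


def apply_script(product, script):
    """Replay an edit script on the product tokens."""
    src = tokenize(product)
    out = []
    for tag, i1, i2, new_tokens in script:
        # "equal" keeps the product span; other tags emit the new tokens
        # (an empty list for "delete").
        out.extend(src[i1:i2] if tag == "equal" else new_tokens)
    return "".join(out)


def kept_fraction(product, script):
    """Share of product tokens copied unchanged into the reactant string."""
    kept = sum(i2 - i1 for tag, i1, i2, _ in script if tag == "equal")
    return kept / len(tokenize(product))


if __name__ == "__main__":
    product = "CC(=O)NC"        # N-methylacetamide
    reactants = "CC(=O)Cl.CN"   # acetyl chloride + methylamine
    script = edit_script(product, reactants)
    for step in script:
        if step[0] != "equal":
            print(step)  # only the local edits, not the shared backbone
    print(apply_script(product, script) == reactants)
```

Most of the product tokens are carried over unchanged in this example, so only a few edit steps are needed to reach the precursor string. A token-by-token decoder would instead have to regenerate the shared part from scratch.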
Date: 2024
Downloads: (external link)
https://www.nature.com/articles/s41467-024-50617-1 Abstract (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-50617-1
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-024-50617-1
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.