SARST2 high-throughput and resource-efficient protein structure alignment against massive databases
Wei-Cheng Lo,
Arieh Warshel,
Chia-Hua Lo,
Chia Yee Choke,
Yan-Jie Li,
Shih-Chung Yen,
Jyun-Yi Yang and
Shih-Wen Weng
Additional contact information
Wei-Cheng Lo: National Yang Ming Chiao Tung University
Arieh Warshel: University of Southern California
Chia-Hua Lo: National Yang Ming Chiao Tung University
Chia Yee Choke: National Yang Ming Chiao Tung University
Yan-Jie Li: National Yang Ming Chiao Tung University
Shih-Chung Yen: National Yang Ming Chiao Tung University
Jyun-Yi Yang: National Yang Ming Chiao Tung University
Shih-Wen Weng: National Yang Ming Chiao Tung University
Nature Communications, 2025, vol. 16, issue 1, 1-15
Abstract:
The flood of protein structural Big Data is coming. With the belief that biotech researchers deserve powerful analysis engines to overcome the challenge of rapidly increasing computational demands, we are devoted to developing efficient protein structural alignment search algorithms to assist researchers as they push the frontiers of biological sciences and technology. Here, we present SARST2, an algorithm that integrates primary, secondary, and tertiary structural features with evolutionary statistics to perform accurate and rapid alignments. In large-scale benchmarks, SARST2 outperforms state-of-the-art methods in accuracy, while completing AlphaFold Database searches significantly faster and with substantially less memory than BLAST and Foldseek. It employs a filter-and-refine strategy enhanced by machine learning, a diagonal shortcut for word-matching, a weighted contact number-based scoring scheme, and a variable gap penalty based on substitution entropy. SARST2, implemented in Golang as standalone programs available at https://10lab.ceb.nycu.edu.tw/sarst2 and https://github.com/NYCU-10lab/sarst, enables massive database searches using even ordinary personal computers.
Date: 2025
Downloads: (external link)
https://www.nature.com/articles/s41467-025-63757-9 Abstract (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-63757-9
Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/
DOI: 10.1038/s41467-025-63757-9
Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.