Progressive Cactus is a multiple-genome aligner for the thousand-genome era

Joel Armstrong, Glenn Hickey, Mark Diekhans, Ian T. Fiddes, Adam M. Novak, Alden Deran, Qi Fang, Duo Xie, Shaohong Feng, Josefin Stiller, Diane Genereux, Jeremy Johnson, Voichita Dana Marinescu, Jessica Alföldi, Robert S. Harris, Kerstin Lindblad-Toh, David Haussler, Elinor Karlsson, Erich D. Jarvis, Guojie Zhang and Benedict Paten
Additional contact information
Joel Armstrong: UC Santa Cruz Genomics Institute, UC Santa Cruz
Glenn Hickey: UC Santa Cruz Genomics Institute, UC Santa Cruz
Mark Diekhans: UC Santa Cruz Genomics Institute, UC Santa Cruz
Ian T. Fiddes: UC Santa Cruz Genomics Institute, UC Santa Cruz
Adam M. Novak: UC Santa Cruz Genomics Institute, UC Santa Cruz
Alden Deran: UC Santa Cruz Genomics Institute, UC Santa Cruz
Qi Fang: BGI-Shenzhen, Beishan Industrial Zone
Duo Xie: BGI-Shenzhen, Beishan Industrial Zone
Shaohong Feng: BGI-Shenzhen, Beishan Industrial Zone
Josefin Stiller: University of Copenhagen
Diane Genereux: Broad Institute of Harvard and Massachusetts Institute of Technology (MIT)
Jeremy Johnson: Broad Institute of Harvard and Massachusetts Institute of Technology (MIT)
Voichita Dana Marinescu: Uppsala University
Jessica Alföldi: Broad Institute of Harvard and Massachusetts Institute of Technology (MIT)
Robert S. Harris: The Pennsylvania State University
Kerstin Lindblad-Toh: Broad Institute of Harvard and Massachusetts Institute of Technology (MIT)
David Haussler: UC Santa Cruz Genomics Institute, UC Santa Cruz
Elinor Karlsson: Broad Institute of Harvard and Massachusetts Institute of Technology (MIT)
Erich D. Jarvis: Howard Hughes Medical Institute
Guojie Zhang: University of Copenhagen
Benedict Paten: UC Santa Cruz Genomics Institute, UC Santa Cruz

Nature, 2020, vol. 587, issue 7833, 246-251

Abstract: New genome assemblies have been arriving at a rapidly increasing pace, thanks to decreases in sequencing costs and improvements in third-generation sequencing technologies [1–3]. For example, the number of vertebrate genome assemblies currently in the NCBI (National Center for Biotechnology Information) database [4] increased by more than 50% to 1,485 assemblies in the year from July 2018 to July 2019. In addition to this influx of assemblies from different species, new human de novo assemblies [5] are being produced, which enable the analysis of not only small polymorphisms, but also complex, large-scale structural differences between human individuals and haplotypes. This coming era and its unprecedented amount of data offer the opportunity to uncover many insights into genome evolution but also present challenges in how to adapt current analysis methods to meet the increased scale. Cactus [6], a reference-free multiple genome alignment program, has been shown to be highly accurate, but the existing implementation scales poorly with increasing numbers of genomes, and struggles in regions of highly duplicated sequences. Here we describe progressive extensions to Cactus to create Progressive Cactus, which enables the reference-free alignment of tens to thousands of large vertebrate genomes while maintaining high alignment quality. We describe results from an alignment of more than 600 amniote genomes, which is to our knowledge the largest multiple vertebrate genome alignment created so far.

Date: 2020
Citations: View citations in EconPapers (10)

Downloads: (external link)
https://www.nature.com/articles/s41586-020-2871-y Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:nat:nature:v:587:y:2020:i:7833:d:10.1038_s41586-020-2871-y

Ordering information: This journal article can be ordered from
https://www.nature.com/

DOI: 10.1038/s41586-020-2871-y

Nature is currently edited by Magdalena Skipper

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-19
Handle: RePEc:nat:nature:v:587:y:2020:i:7833:d:10.1038_s41586-020-2871-y