EconPapers    
Economics at your fingertips  
 

Wrangling distributed computing for high-throughput environmental science: An introduction to HTCondor

Richard A Erickson, Michael N Fienen, S Grace McCalla, Emily L Weiser, Melvin L Bower, Jonathan M Knudson and Greg Thain

PLOS Computational Biology, 2018, vol. 14, issue 10, 1-8

Abstract: Biologists and environmental scientists now routinely solve computational problems that were unimaginable a generation ago. Examples include processing geospatial data, analyzing -omics data, and running large-scale simulations. Conventional desktop computing cannot handle these tasks when they are large, and high-performance computing is not always available nor the most appropriate solution for all computationally intense problems. High-throughput computing (HTC) is one method for handling computationally intense research. In contrast to high-performance computing, which uses a single "supercomputer," HTC can distribute tasks over many computers (e.g., idle desktop computers, dedicated servers, or cloud-based resources). HTC facilities exist at many academic and government institutes and are relatively easy to create from commodity hardware. Additionally, consortia such as Open Science Grid facilitate HTC, and commercial entities sell cloud-based solutions for researchers who lack HTC at their institution. We provide an introduction to HTC for biologists and environmental scientists. Our examples from biology and the environmental sciences use HTCondor, an open source HTC system.Author summary: Computational biology often requires processing large amounts of data, running many simulations, or other computationally intensive tasks. In this hybrid primer/tutorial, we describe how high-throughput computing (HTC) can be used to solve these problems. First, we present an overview of high-throughput computing. Second, we describe how to break jobs down so that they can run with HTC. Third, we describe how to use HTCondor software as a method for HTC. Fourth, we describe how HTCondor may be applied to other situations and a series of online tutorials.

Date: 2018
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006468 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 06468&type=printable (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1006468

DOI: 10.1371/journal.pcbi.1006468

Access Statistics for this article

More articles in PLOS Computational Biology from Public Library of Science
Bibliographic data for series maintained by ploscompbiol ().

 
Page updated 2025-03-19
Handle: RePEc:plo:pcbi00:1006468