Ten Simple Rules for Writing Dockerfiles for Reproducible Data Science
Daniel Nüst,
Vanessa Sochat,
Ben Marwick,
Stephen Eglen,
Tim Head,
Tony Hirst and
Benjamin Evans
Additional contact information
Daniel Nüst: University of Münster
No fsd7t, OSF Preprints from Center for Open Science
Abstract:
Containers have greatly improved computational science by packaging software and data dependencies. In a scholarly context, the main reasons for using containers are transparency and support for reproducibility; in turn, a workflow's reproducibility can be strongly affected by the choices made when building containers. In many cases, the container image is built from instructions written in a Dockerfile. To support this approach, we present a set of rules to help researchers write understandable Dockerfiles for typical data science workflows. By following the rules in this article, researchers can create containers suitable for sharing with fellow scientists, for inclusion in scholarly communication such as education or scientific papers, and for effective and sustainable personal workflows.
Date: 2020-04-17
New Economics Papers: this item is included in nep-cmp
Downloads: (external link)
https://osf.io/download/5e9964fbf135350557d5a298/
Persistent link: https://EconPapers.repec.org/RePEc:osf:osfxxx:fsd7t
DOI: 10.31219/osf.io/fsd7t