Preprocessing of Public RNA-Sequencing Datasets to Facilitate Downstream Analyses of Human Diseases

Naomi Rapier-Sharman, John Krapohl, Ethan J. Beausoleil, Kennedy T. L. Gifford, Benjamin R. Hinatsu, Curtis S. Hoffmann, Makayla Komer, Tiana M. Scott and Brett E. Pickett
Additional contact information
Naomi Rapier-Sharman: Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT 84602, USA
John Krapohl: Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT 84602, USA
Ethan J. Beausoleil: Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT 84602, USA
Kennedy T. L. Gifford: Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT 84602, USA
Benjamin R. Hinatsu: Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT 84602, USA
Curtis S. Hoffmann: Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT 84602, USA
Makayla Komer: Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT 84602, USA
Tiana M. Scott: Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT 84602, USA
Brett E. Pickett: Department of Microbiology and Molecular Biology, Brigham Young University, Provo, UT 84602, USA

Data, 2021, vol. 6, issue 7, 1-10

Abstract: Publicly available RNA-sequencing (RNA-seq) data are a rich resource for elucidating the mechanisms of human disease; however, preprocessing these data requires considerable bioinformatic expertise and computational infrastructure. Analyzing multiple datasets with a consistent computational workflow increases the accuracy of downstream meta-analyses. This collection of datasets represents the human intracellular transcriptional response to disorders and diseases such as acute lymphoblastic leukemia (ALL), B-cell lymphomas, chronic obstructive pulmonary disease (COPD), colorectal cancer, and lupus erythematosus, as well as infection with pathogens including Borrelia burgdorferi, hantavirus, influenza A virus, Middle East respiratory syndrome coronavirus (MERS-CoV), Streptococcus pneumoniae, respiratory syncytial virus (RSV), severe acute respiratory syndrome coronavirus (SARS-CoV), and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). We calculated the statistically significant differentially expressed genes and Gene Ontology terms for all datasets. In addition, a subset of the datasets includes results from splice variant analyses and intracellular signaling pathway enrichments, as well as read mapping and quantification. All analyses were performed using well-established algorithms and are provided to facilitate future data mining activities and wet lab studies and to accelerate collaboration and discovery.

Keywords: transcriptomics; RNA-sequencing; autoimmune diseases; cancer; pathogens; bacteria; viruses; data preprocessing
JEL-codes: C8 C80 C81 C82 C83
Date: 2021

Downloads: (external link)
https://www.mdpi.com/2306-5729/6/7/75/pdf (application/pdf)
https://www.mdpi.com/2306-5729/6/7/75/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jdataj:v:6:y:2021:i:7:p:75-:d:594584

Data is currently edited by Ms. Cecilia Yang

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jdataj:v:6:y:2021:i:7:p:75-:d:594584