Gene Expression Datasets for Two Versions of the Saccharum spontaneum AP85-441 Genome
Nicolás López-Rozo,
Mauricio Ramirez-Castrillon,
Miguel Romero,
Jorge Finke and
Camilo Rocha
Additional contact information
Nicolás López-Rozo: Department of Electronics and Computer Science, Pontificia Universidad Javeriana, Cali 760031, Colombia
Mauricio Ramirez-Castrillon: OMICAS Program, Pontificia Universidad Javeriana, Cali 760031, Colombia
Miguel Romero: Department of Electronics and Computer Science, Pontificia Universidad Javeriana, Cali 760031, Colombia
Jorge Finke: Department of Electronics and Computer Science, Pontificia Universidad Javeriana, Cali 760031, Colombia
Camilo Rocha: Department of Electronics and Computer Science, Pontificia Universidad Javeriana, Cali 760031, Colombia
Data, 2022, vol. 8, issue 1, 1-9
Abstract:
Sugarcane is a tall grass species with high biomass and sucrose production, and the world’s largest crop by production quantity. Its adaptation to the environment over evolutionary time and its response to anthropogenic breeding have resulted in a complex autopolyploid genome. Few efforts to document this organism’s gene co-expression and annotation have been reported in the literature, and those that are available use different gene identifiers that cannot be easily associated across studies. This data descriptor paper presents a dataset that consolidates expression matrices of two Saccharum spontaneum AP85-441 genome versions, together with an algorithm implemented in Python to obtain this dataset automatically. The data are processed from the allele-level information of the two sources: BLASTn is used bidirectionally to suggest feasible mappings between the two sets of alleles, and a graph-matching optimization algorithm is used to maximize the global identity and uniqueness of genes. Association tables are then used to consolidate the expression values from alleles to genes. The contributed expression matrices comprise 96 experiments and 109,050 and 35,516 from the two genome versions. They can significantly reduce the computational cost of further research on, e.g., sugarcane co-expression network generation, functional annotation prediction, and stress-specific gene identification.
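The consolidation step mentioned in the abstract (from allele-level to gene-level expression through an association table) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `consolidate_to_genes`, the identifiers, the expression values, and the choice to sum allele values per gene are all assumptions made for illustration, since the abstract does not state the aggregation rule.

```python
def consolidate_to_genes(allele_expr, allele_to_gene):
    """Aggregate allele-level expression vectors into gene-level vectors.

    allele_expr: dict mapping allele id -> list of expression values,
                 one value per experiment.
    allele_to_gene: association table mapping allele id -> gene id.
    Assumption: allele values are summed per gene; the source abstract
    does not specify how alleles are combined.
    """
    gene_expr = {}
    for allele, values in allele_expr.items():
        gene = allele_to_gene.get(allele)
        if gene is None:
            continue  # alleles with no association are skipped
        if gene not in gene_expr:
            gene_expr[gene] = list(values)
        else:
            gene_expr[gene] = [a + b for a, b in zip(gene_expr[gene], values)]
    return gene_expr


# Hypothetical identifiers and values, three experiments per allele.
allele_expr = {
    "geneA-1": [1.0, 2.0, 0.5],
    "geneA-2": [0.5, 1.0, 0.5],
    "geneB-1": [3.0, 0.0, 1.0],
    "orphan-1": [9.0, 9.0, 9.0],
}
allele_to_gene = {
    "geneA-1": "geneA",
    "geneA-2": "geneA",
    "geneB-1": "geneB",
}

gene_expr = consolidate_to_genes(allele_expr, allele_to_gene)
```

In this toy input, the two alleles of `geneA` collapse into a single row, and `orphan-1` is dropped because the association table has no entry for it. A mean or maximum could be substituted for the sum if that better matches the intended analysis.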
Keywords: sugarcane; expression matrix; allele expression; graph flow
JEL-codes: C8 C80 C81 C82 C83
Date: 2022
Downloads:
https://www.mdpi.com/2306-5729/8/1/1/pdf (application/pdf)
https://www.mdpi.com/2306-5729/8/1/1/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jdataj:v:8:y:2022:i:1:p:1-:d:1008497
Data is currently edited by Ms. Cecilia Yang
Bibliographic data for series maintained by MDPI Indexing Manager.