Sample size calculation based on generalized linear models for differential expression analysis in RNA-seq data

Li Chung-I and Shyr Yu
Additional contact information
Li Chung-I: Department of Statistics, National Cheng Kung University, Tainan 701, Taiwan, Province of China
Shyr Yu: Center for Quantitative Sciences, Vanderbilt University, 571 Preston Building, Nashville, TN, United States of America

Statistical Applications in Genetics and Molecular Biology, 2016, vol. 15, issue 6, 491-505

Abstract: As RNA-seq rapidly develops and costs continually decrease, the quantity and frequency of samples being sequenced will grow exponentially. With proteomic investigations becoming more multivariate and quantitative, determining a study’s optimal sample size is now a vital step in experimental design. Current methods for calculating a study’s required sample size are mostly based on the hypothesis testing framework, which assumes each gene count can be modeled through Poisson or negative binomial distributions; however, these methods are limited when it comes to accommodating covariates. To address this limitation, we propose an estimating procedure based on the generalized linear model. This easy-to-use method constructs a representative exemplary dataset and estimates the conditional power, all without requiring complicated mathematical approximations or formulas. Even more attractive, the downstream analysis can be performed with current R/Bioconductor packages. To demonstrate the practicability and efficiency of this method, we apply it to three real-world studies and introduce our online calculator, developed to determine the optimal sample size for an RNA-seq study.
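
The general idea of an exemplary dataset can be illustrated with a small sketch. This is not the authors' procedure or their online calculator, and it ignores the multiple-testing adjustment a real RNA-seq design would need. It assumes a single gene, a negative binomial log-linear model with variance mu + phi * mu^2, a balanced design of 2 groups by 2 levels of one binary covariate, and a two-sided Wald test on the group coefficient. The function names (`solve`, `wald_power`, `min_samples_per_cell`) and all parameter values are hypothetical and chosen for illustration only.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column up.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def wald_power(n_per_cell, beta, phi, alpha=0.05):
    """Approximate power for the group effect in an NB log-linear GLM.

    beta = (intercept, group log fold change, covariate effect).
    The exemplary dataset is one row per design cell, carrying the
    expected mean and a weight proportional to the cell's sample size.
    """
    cells = [(1, g, c) for g in (0, 1) for c in (0, 1)]
    p = len(beta)
    info = [[0.0] * p for _ in range(p)]
    for x in cells:
        mu = exp(sum(b * xi for b, xi in zip(beta, x)))
        # GLM working weight for log link and variance mu + phi * mu^2.
        w = n_per_cell * mu / (1.0 + phi * mu)
        for i in range(p):
            for j in range(p):
                info[i][j] += w * x[i] * x[j]
    # Second column of the inverse information gives Var(group coefficient).
    var_group = solve(info, [0.0, 1.0, 0.0])[1]
    delta = abs(beta[1]) / sqrt(var_group)
    nd = NormalDist()
    z = nd.inv_cdf(1.0 - alpha / 2.0)
    return nd.cdf(delta - z) + nd.cdf(-delta - z)


def min_samples_per_cell(beta, phi, target=0.8, alpha=0.05, n_max=10000):
    """Smallest per-cell sample size whose approximate power meets target."""
    for n in range(1, n_max + 1):
        if wald_power(n, beta, phi, alpha) >= target:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical gene: baseline mean 20, two-fold change, dispersion 0.1.
    beta = (log(20.0), log(2.0), 0.5)
    n = min_samples_per_cell(beta, phi=0.1)
    print(n, wald_power(n, beta, phi=0.1))
```

Because the information matrix is built from expected counts rather than simulated ones, the power follows from a single pass over the design cells. Further covariates could be handled by adding columns to the cell rows and extending the unit vector passed to `solve` to match.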

Keywords: generalized linear models; RNA-seq; sample size calculation
Date: 2016

DOI: 10.1515/sagmb-2016-0008

Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf

More articles in Statistical Applications in Genetics and Molecular Biology from De Gruyter
Bibliographic data for series maintained by Peter Golla.

Page updated 2021-06-12
Handle: RePEc:bpj:sagmbi:v:15:y:2016:i:6:p:491-505:n:3