General power and sample size calculations for high-dimensional genomic data

Maarten van Iterson, Mark A. van de Wiel, Judith M. Boer and Renée X. de Menezes
Additional contact information
Maarten van Iterson: Center for Human and Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands
Mark A. van de Wiel: Department of Epidemiology and Biostatistics, VU University Medical Center, Amsterdam, The Netherlands
Judith M. Boer: Center for Human and Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands; Erasmus Medical Centre, Sophia Children’s Hospital, Laboratory of Pediatric Oncology/Hematology, Rotterdam, The Netherlands; Netherlands Bioinformatics Centre, Nijmegen
Renée X. de Menezes: Department of Epidemiology and Biostatistics, VU University Medical Center, Amsterdam, The Netherlands; Netherlands Bioinformatics Centre, Nijmegen

Statistical Applications in Genetics and Molecular Biology, 2013, vol. 12, issue 4, 449-467

Abstract: In the design of microarray or next-generation sequencing experiments it is crucial to choose the appropriate number of biological replicates. Often the number of differentially expressed genes and their effect sizes are small, so too few replicates will lead to insufficient power to detect them. On the other hand, too many replicates lead to unnecessarily high experimental costs. Power and sample size analysis can guide experimentalists in choosing the appropriate number of biological replicates. Several methods for power and sample size analysis have recently been proposed for microarray data. However, most of these are restricted to two-group comparisons and require user-defined effect sizes. Here we propose a pilot-data-based method for power and sample size analysis that can handle more general experimental designs and uses pilot data to obtain estimates of the effect sizes. The method can also handle χ²-distributed test statistics, which enables power and sample size calculations for a much wider class of models, including high-dimensional generalized linear models, which are used, e.g., for RNA-seq data analysis. The performance of the method is evaluated using simulated and experimental data from several microarray and next-generation sequencing experiments. Furthermore, we compare our proposed method for estimating the density of effect sizes from pilot data with a recently proposed method specific to two-group comparisons.
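
To make the trade-off between replicates and power concrete, the following is a minimal sketch in Python. It is not the method proposed in the article: it uses the textbook normal approximation for a two-sided, two-group comparison, and it assumes the standardized effect sizes are already known, whereas the article estimates their density from pilot data. The function names, the example effect sizes, and the unadjusted significance level are all hypothetical choices made for illustration.

```python
from statistics import NormalDist


def two_group_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group z-test.

    Assumes a standardized effect size (difference in means divided by
    the common standard deviation) and equal group sizes.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of rejecting in either tail.
    return z.cdf(shift - crit) + z.cdf(-shift - crit)


def average_power(effect_sizes, n_per_group, alpha=0.05):
    """Mean power over a set of (hypothetical) non-null effect sizes."""
    powers = [two_group_power(d, n_per_group, alpha) for d in effect_sizes]
    return sum(powers) / len(powers)


def smallest_n(effect_sizes, target=0.8, alpha=0.05, max_n=1000):
    """Smallest per-group sample size whose average power reaches target.

    Returns None if the target is not reached by max_n.
    """
    for n in range(2, max_n + 1):
        if average_power(effect_sizes, n, alpha) >= target:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical effect sizes for differentially expressed genes.
    effects = [0.8, 1.0, 1.2, 1.5]
    for n in (5, 10, 20):
        print(n, round(average_power(effects, n), 3))
    print("smallest n per group:", smallest_n(effects))
```

In a genuinely high-dimensional setting the significance level would be adjusted for multiple testing, and the effect sizes would come from an estimated density rather than a fixed list; both are left out here to keep the sketch short.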

Keywords: density of effect-sizes; discrete inverse problem; high-dimensional generalized linear models; non-negative Conjugate Gradients algorithm
Date: 2013

Downloads: (external link)
https://doi.org/10.1515/sagmb-2012-0046 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.


Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:12:y:2013:i:4:p:449-467:n:3

Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html

DOI: 10.1515/sagmb-2012-0046

Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf

Bibliographic data for series maintained by Peter Golla.

 
Page updated 2025-03-19
Handle: RePEc:bpj:sagmbi:v:12:y:2013:i:4:p:449-467:n:3