Statistical Significance Threshold Criteria For Analysis of Microarray Gene Expression Data

Cheng Cheng, Pounds Stanley B., Boyett James M., Pei Deqing, Kuo Mei-Ling and Roussel Martine F.
Additional contact information
Cheng Cheng: Department of Biostatistics, St. Jude Children’s Research Hospital
Pounds Stanley B.: Department of Biostatistics, St. Jude Children’s Research Hospital
Boyett James M.: Department of Biostatistics, St. Jude Children’s Research Hospital
Pei Deqing: Department of Biostatistics, St. Jude Children’s Research Hospital
Kuo Mei-Ling: Department of Genetics and Tumor Cell Biology, St. Jude Children’s Research Hospital
Roussel Martine F.: Department of Genetics and Tumor Cell Biology, St. Jude Children’s Research Hospital

Statistical Applications in Genetics and Molecular Biology, 2004, vol. 3, issue 1, 32

Abstract: The methodological advancement in microarray data analysis on the basis of false discovery rate (FDR) control, such as the q-value plots, allows the investigator to examine the FDR from several perspectives. However, when FDR control at the "customary" levels 0.01, 0.05, or 0.1 does not provide fruitful findings, there is little guidance for making the trade-off between the significance threshold and the FDR level by sound statistical or biological considerations. Thus, meaningful statistical significance criteria that complement the existing FDR methods for large-scale multiple tests are desirable. Three statistical significance criteria, the profile information criterion, the total error proportion, and the guide-gene driven selection, are developed in this research. The first two are general significance threshold criteria for large-scale multiple tests; the profile information criterion is related to the recent theoretical studies of the connection between FDR control and minimax estimation, and the total error proportion is closely related to the asymptotic properties of FDR control in terms of the total error risk. The guide-gene driven selection is an approach to combining statistical significance and the existing biological knowledge of the study at hand. Error properties of these criteria are investigated theoretically and by simulation. The proposed methods are illustrated and compared using an example of genomic screening for novel Arf gene targets. Operating characteristics of q-value and the proposed significance threshold criteria are investigated and compared in a simulation study that employs a model mimicking a gene regulatory pathway. A guideline for using these criteria is provided. Splus/R code is available from the corresponding author upon request.
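
Illustrative note: the trade-off between the FDR level and the number of findings that the abstract describes can be sketched as follows. This is not the authors' code and it implements none of the three proposed criteria. It only computes Benjamini-Hochberg adjusted p-values, used here as a simple stand-in for q-values, and counts discoveries at several FDR levels. The function names and the p-values are hypothetical and chosen for illustration only.

```python
def bh_adjusted(pvalues):
    """Benjamini-Hochberg adjusted p-values (a simple stand-in for q-values)."""
    m = len(pvalues)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values are monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def discoveries_by_level(pvalues, levels=(0.01, 0.05, 0.1, 0.2)):
    """Count tests whose adjusted p-value is at or below each FDR level."""
    adjusted = bh_adjusted(pvalues)
    return {level: sum(1 for q in adjusted if q <= level) for level in levels}


# Hypothetical p-values for ten tests (assumed for illustration only).
pvals = [0.0004, 0.003, 0.012, 0.018, 0.04, 0.2, 0.35, 0.5, 0.7, 0.9]
counts = discoveries_by_level(pvals)
for level, n in counts.items():
    print(f"FDR level {level}: {n} discoveries")
```

In this toy input, relaxing the FDR level beyond the customary values adds few or no discoveries, which is the kind of situation in which the abstract argues that additional threshold criteria are needed.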

Keywords: multiple tests; significance threshold selection; profile information criterion; total error proportion; false discovery rate; q-value; microarray; gene expression
Date: 2004
Citations: View citations in EconPapers (7)

Downloads: (external link)
https://doi.org/10.2202/1544-6115.1064 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.

Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:3:y:2004:i:1:n:36

Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html

DOI: 10.2202/1544-6115.1064

Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf

More articles in Statistical Applications in Genetics and Molecular Biology from De Gruyter
Bibliographic data for series maintained by Peter Golla.

 
Page updated 2025-03-19
Handle: RePEc:bpj:sagmbi:v:3:y:2004:i:1:n:36