A Model Based Background Adjustment for Oligonucleotide Expression Arrays

Zhijin Wu, Rafael Irizarry, Robert Gentleman, Francisco Martinez Murillo and Forrest Spencer
Additional contact information
Zhijin Wu: Johns Hopkins Bloomberg School of Public Health
Rafael Irizarry: Johns Hopkins Bloomberg School of Public Health
Robert Gentleman: Dana-Farber Cancer Institute
Francisco Martinez Murillo: Johns Hopkins Medical Institute
Forrest Spencer: Johns Hopkins Medical Institute

No 1001, Johns Hopkins University Dept. of Biostatistics Working Paper Series from Berkeley Electronic Press

Abstract: High density oligonucleotide expression arrays are widely used in many areas of biomedical research. Affymetrix GeneChip arrays are the most popular. In the Affymetrix system, a fair amount of further pre-processing and data reduction occurs following the image processing step. Statistical procedures developed by academic groups have been successful at improving the default algorithms provided by the Affymetrix system. In this paper we present a solution to one of the pre-processing steps, background adjustment, based on a formal statistical framework. Our solution greatly improves the performance of the technology in various practical applications. Affymetrix GeneChip arrays use short oligonucleotides to probe for genes in an RNA sample. Typically each gene will be represented by 11-20 pairs of oligonucleotide probes. The first component of these pairs is referred to as a perfect match probe and is designed to hybridize only with transcripts from the intended gene (specific hybridization). However, hybridization by other sequences (non-specific hybridization) is unavoidable. Furthermore, hybridization strengths are measured by a scanner that introduces optical noise. Therefore, the observed intensities need to be adjusted to give accurate measurements of specific hybridization. One approach to adjusting is to pair each perfect match probe with a mismatch probe that is designed with the intention of measuring non-specific hybridization. The default adjustment, provided as part of the Affymetrix system, is based on the difference between perfect match and mismatch probe intensities. We have found that this approach can be improved via the use of estimators derived from a statistical model that uses probe sequence information.
The model is based on simple hybridization theory from molecular biology and experiments specifically designed to help develop it. A final step in the pre-processing of these arrays is to combine the 11-20 probe pair intensities, after background adjustment and normalization, for a given gene to define a measure of expression that represents the amount of the corresponding mRNA species. In this paper we illustrate the practical consequences of not adjusting appropriately for the presence of non-specific hybridization and provide a solution based on our background adjustment procedure. Software that computes our adjustment is available as part of the Bioconductor project (http://www.bioconductor.
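
Illustrative sketch (not part of the original record): the abstract describes the default adjustment as the difference between perfect match (PM) and mismatch (MM) intensities, followed by combining the probe pairs of a gene into one expression measure. The Python below is a minimal toy version of that difference-based step only; it is not the authors' model-based estimator and not the Affymetrix or Bioconductor implementation. The function names, the use of the median as the summary, and all intensity values are assumptions made up for illustration.

```python
from statistics import median


def default_adjust(pm, mm):
    """Difference-based adjustment: PM minus MM for each probe pair."""
    if len(pm) != len(mm):
        raise ValueError("pm and mm must have the same length")
    return [p - m for p, m in zip(pm, mm)]


def summarize(adjusted):
    """Combine adjusted probe-level values into one number.

    The median is an assumption chosen for simplicity; the abstract does
    not say which summary is used.
    """
    return median(adjusted)


# Hypothetical intensities for one gene with 11 probe pairs (made-up numbers).
pm = [220.0, 180.0, 950.0, 130.0, 400.0, 310.0, 275.0, 160.0, 520.0, 145.0, 205.0]
mm = [150.0, 210.0, 300.0, 140.0, 120.0, 100.0, 90.0, 175.0, 200.0, 80.0, 95.0]

adjusted = default_adjust(pm, mm)

# Pairs where MM >= PM give non-positive values, which cannot be log-transformed.
non_positive_pairs = sum(1 for a in adjusted if a <= 0)

expression = summarize(adjusted)
```

In this made-up example three of the eleven pairs have a mismatch intensity above the perfect match intensity, so the subtraction yields non-positive values. This is one commonly cited practical weakness of a purely difference-based adjustment, and it is the kind of step a model-based estimator aims to replace.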

Keywords: Affymetrix GeneChip; Empirical Bayes; Sequence Information; GC Content
Date: 2004-07-11
Note: oai:bepress.com:jhubiostat-1001
Citations: 17 (in EconPapers)

Downloads: (external link)
http://www.bepress.com/cgi/viewcontent.cgi?article=1001&context=jhubiostat (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:bep:jhubio:1001

Bibliographic data for series maintained by Christopher F. Baum.

 
Page updated 2025-03-19
Handle: RePEc:bep:jhubio:1001