Accounting for undetected compounds in statistical analyses of mass spectrometry ‘omic studies
Taylor Sandra L.,
Leiserowitz Gary S. and
Kim Kyoungmi
Additional contact information
Taylor Sandra L.: Division of Biostatistics, Department of Public Health Sciences, University of California School of Medicine, Davis, CA, USA
Leiserowitz Gary S.: Division of Gynecologic Oncology, UC Davis Medical Center, Sacramento, CA, USA
Kim Kyoungmi: Division of Biostatistics, Department of Public Health Sciences, University of California School of Medicine, Davis, CA, USA
Statistical Applications in Genetics and Molecular Biology, 2013, vol. 12, issue 6, 703-722
Abstract:
Mass spectrometry is an important high-throughput technique for profiling small molecular compounds in biological samples and is widely used to identify potential diagnostic and prognostic compounds associated with disease. Data generated by mass spectrometry commonly contain many missing values, which arise when a compound is absent from a sample or is present at a concentration below the detection limit. Several strategies are available for statistically analyzing data with missing values. The accelerated failure time (AFT) model assumes that all missing values result from censoring below a detection limit. Under a mixture model, missing values can result from a combination of censoring and the absence of a compound. We compare the power and estimation performance of a mixture model with those of an AFT model. Based on simulated data, we found that the AFT model had greater power to detect differences in means and point-mass proportions between groups. However, the AFT model yielded biased estimates, with the bias increasing as the proportion of observations in the point mass increased, whereas estimates from the mixture model were unbiased except when all missing observations came from censoring. These findings suggest using the AFT model for hypothesis testing and the mixture model for estimation. We demonstrate this approach by applying it to glycomics data from serum samples of women with ovarian cancer and matched controls.
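The difference between the two models can be made concrete by writing out their log-likelihoods. The sketch below is illustrative only and is not the authors' exact specification: it assumes log-intensities are normally distributed, a single known detection limit, and no covariates. The function names and the parameter `tau` (the point-mass proportion, i.e. the probability that a compound is truly absent) are hypothetical labels chosen for this example.

```python
import math


def norm_logpdf(x, mu, sigma):
    """Log-density of a normal distribution."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def aft_loglik(observed, n_missing, mu, sigma, limit):
    """AFT-style log-likelihood (assumed normal on the log-intensity scale).

    Every missing value is treated as left-censored below `limit`.
    """
    ll = sum(norm_logpdf(y, mu, sigma) for y in observed)
    ll += n_missing * math.log(norm_cdf(limit, mu, sigma))
    return ll


def mixture_loglik(observed, n_missing, mu, sigma, limit, tau):
    """Point-mass mixture log-likelihood.

    A missing value is either a true absence (probability tau) or a
    present compound censored below `limit` (probability 1 - tau times
    the normal CDF at the limit).
    """
    ll = sum(math.log(1.0 - tau) + norm_logpdf(y, mu, sigma) for y in observed)
    ll += n_missing * math.log(tau + (1.0 - tau) * norm_cdf(limit, mu, sigma))
    return ll


if __name__ == "__main__":
    # Hypothetical log-intensities; three samples had no detected signal.
    observed = [2.1, 2.8, 3.4, 3.0]
    print(aft_loglik(observed, 3, mu=2.5, sigma=0.8, limit=1.5))
    print(mixture_loglik(observed, 3, mu=2.5, sigma=0.8, limit=1.5, tau=0.3))
```

Under these assumptions, setting `tau` to zero reduces the mixture log-likelihood to the AFT log-likelihood, which is the case where all missing observations come from censoring.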
Keywords: accelerated failure time model; glycomics; mass spectrometry; metabolomics; missing values; point-mass mixture
Date: 2013
Downloads: (external link)
https://doi.org/10.1515/sagmb-2013-0021 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.
Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:12:y:2013:i:6:p:703-722:n:4
Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/sagmb/html
DOI: 10.1515/sagmb-2013-0021
Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf
Bibliographic data for series maintained by Peter Golla.