Identification of Allelic Imbalance with a Statistical Model for Subtle Genomic Mosaicism

Rui Xia, Selina Vattathil and Paul Scheet

PLOS Computational Biology, 2014, vol. 10, issue 8, 1-11

Abstract: Genetic heterogeneity in a mixed sample of tumor and normal DNA can confound characterization of the tumor genome. Numerous computational methods have been proposed to detect aberrations in DNA samples from tumor and normal tissue mixtures. Most of these require tumor purities to be at least 10–15%. Here, we present a statistical model to capture information, contained in the individual's germline haplotypes, about expected patterns in the B allele frequencies from SNP microarrays while fully modeling their magnitude, the first such model for SNP microarray data. Our model consists of a pair of hidden Markov models—one for the germline and one for the tumor genome—which, conditional on the observed array data and patterns of population haplotype variation, have a dependence structure induced by the relative imbalance of an individual's inherited haplotypes. Together, these hidden Markov models offer a powerful approach for dealing with mixtures of DNA where the main component represents the germline, thus suggesting natural applications for the characterization of primary clones when stromal contamination is extremely high, and for identifying lesions in rare subclones of a tumor when tumor purity is sufficient to characterize the primary lesions. Our joint model for germline haplotypes and acquired DNA aberration is flexible, allowing a large number of chromosomal alterations, including balanced and imbalanced losses and gains, copy-neutral loss-of-heterozygosity (LOH) and tetraploidy. We found our model (which we term J-LOH) to be superior for localizing rare aberrations in a simulated 3% mixture sample. 
More generally, our model provides a framework for full integration of the germline and tumor genomes to deal more effectively with missing or uncertain features, and thus extract maximal information from difficult scenarios where existing methods fail.

Author Summary: Allelic imbalance, or a deviation from the expected 1-to-1 ratio of alleles where both were present in the germline, can result when there has been an acquired deletion or duplication of part of a chromosome and is a hallmark of cancer genomes. Tumor genomic profiling studies often involve analysis of samples that contain aberrant tumor cells mixed with normal cells without these acquired mutations. Methods for detecting chromosomal aberrations that result in allelic imbalance within a heterogeneous sample have previously been proposed that use the dispersion of within-sample allele frequencies measured at germline heterozygous positions. Here we demonstrate that combining this information with a measure for the correlation in these dispersions, due to the imbalance of one of the chromosomes, provides the most powerful approach. Our method allows for sensitive identification of short allelic imbalance events (e.g. 10 Mb) contained in as few as 3% of the cells in a heterogeneous mixture. Applications include profiling tumor genomes following surgical resection where there exists high contamination of normal tissue and identifying aberrations in subclones. Our work provides a framework for further development of methods that use observed data and population genetic theory for inference of allelic imbalance.
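
To give a sense of how small the signal is when only 3% of cells carry an aberration, the sketch below computes the expected B allele frequency (BAF) at a germline-heterozygous SNP in a tumor/normal mixture. This is generic mixture arithmetic, not the J-LOH model: the function name `expected_baf`, its parameters, and the three example events are illustrative assumptions, and measurement noise is ignored.

```python
def expected_baf(tumor_fraction, tumor_b, tumor_total, normal_b=1, normal_total=2):
    """Expected BAF at a germline-heterozygous SNP in a two-component mixture.

    Hypothetical helper for illustration only (not from the paper).
    tumor_fraction: fraction of cells carrying the aberration (0..1).
    tumor_b / tumor_total: B-allele copies and total copies in aberrant cells.
    normal_b / normal_total: the same for normal cells (heterozygous diploid).
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be in [0, 1]")
    # Average copy counts over the two cell populations, weighted by abundance.
    b_copies = (1 - tumor_fraction) * normal_b + tumor_fraction * tumor_b
    total_copies = (1 - tumor_fraction) * normal_total + tumor_fraction * tumor_total
    return b_copies / total_copies


# Assumed example events: (B copies, total copies) in the aberrant cells.
events = {
    "hemizygous deletion (B lost)": (0, 1),
    "copy-neutral LOH (B lost)": (0, 2),
    "single-copy gain (B gained)": (2, 3),
}

for name, (b, total) in events.items():
    baf = expected_baf(0.03, b, total)
    print(f"{name}: expected BAF = {baf:.4f}, shift from 0.5 = {baf - 0.5:+.4f}")
```

Under these assumptions, every event shifts the expected BAF by less than 0.02 from the balanced value of 0.5 at a 3% aberrant-cell fraction, which illustrates why the abstract emphasizes pooling information across SNPs along a haplotype rather than relying on any single marker.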

Date: 2014

Downloads: (external link)
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003765 (text/html)
https://journals.plos.org/ploscompbiol/article/fil ... 03765&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pcbi00:1003765

DOI: 10.1371/journal.pcbi.1003765

More articles in PLOS Computational Biology from Public Library of Science
 
Page updated 2025-03-19
Handle: RePEc:plo:pcbi00:1003765