A smoothed EM-algorithm for DNA methylation profiles from sequencing-based methods in cell lines or for a single cell type
Greenwood Celia M.T.,
Abdous Belkacem and
Oualkacha Karim
Additional contact information
Lakhal-Chaieb Lajmi: Département de mathématiques et statistique, Université Laval, Québec, Québec G1V 0A6, Canada
Greenwood Celia M.T.: Lady Davis Research Institute, Montréal, Québec H3T 1E2, Canada; and Departments of Oncology, Epidemiology, Biostatistics and Occupational Health, and Human Genetics, McGill University, Montréal, Québec H3A 1A2, Canada
Ouhourane Mohamed: Département de mathématiques, Université du Québec à Montréal, Québec H2X 3Y7, Canada
Zhao Kaiqiong: Lady Davis Research Institute, Montréal, Québec H3T 1E2, Canada; and Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montréal, Québec H3A 1A2, Canada
Abdous Belkacem: Département de médecine sociale et préventive, Université Laval, Québec, Québec G1V 0A6, Canada
Oualkacha Karim: Département de mathématiques, Université du Québec à Montréal, Montréal, Québec H2X 3Y7, Canada
Statistical Applications in Genetics and Molecular Biology, 2017, vol. 16, issue 5-6, 333-347
We consider the assessment of DNA methylation profiles from sequencing-derived data for a single cell type or for cell lines. We derive a kernel-smoothed EM-algorithm that can analyze an entire chromosome at once, simultaneously correcting for experimental errors arising from either the pre-treatment steps or the sequencing stage and accounting for spatial correlations between DNA methylation profiles at neighbouring CpG sites. The outcomes of the algorithm are then used to (i) call the true methylation status at each CpG site, (ii) provide accurate smoothed estimates of DNA methylation levels, and (iii) detect differentially methylated regions. Simulations show that the proposed methodology outperforms existing analysis methods that either ignore the correlation between DNA methylation profiles at neighbouring CpG sites or do not correct for errors. The proposed inference procedure is illustrated through the analysis of two data sets: a publicly available data set from a cell line of induced pluripotent H9 human embryonic stem cells, and a data set in which methylation measures were obtained for a small genomic region in three different immune cell types separated from whole blood.
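To make the idea of a kernel-smoothed EM-algorithm concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a simplified model that the abstract does not spell out: each CpG site has a binary true methylation status, reads at a site are binomial with one error-adjusted rate per status (`p0`, `p1`), and the probability of being methylated varies smoothly along the chromosome and is re-estimated with a Gaussian kernel. The function name `smoothed_em`, the bandwidth, and the starting values are all hypothetical choices made for illustration.

```python
import numpy as np

def smoothed_em(pos, meth, total, bandwidth=200.0, n_iter=50):
    """Toy kernel-smoothed EM for per-site methylation calls (illustrative only).

    pos   : genomic positions of the CpG sites
    meth  : number of reads reporting "methylated" at each site
    total : total number of reads at each site
    Returns (w, pi, p0, p1): posterior probability that each site is methylated,
    smoothed prior probability of methylation, and the estimated rates of
    "methylated" reads at truly unmethylated (p0) and methylated (p1) sites.
    """
    pos = np.asarray(pos, dtype=float)
    meth = np.asarray(meth, dtype=float)
    total = np.asarray(total, dtype=float)

    # Row-normalised Gaussian kernel weights between all pairs of sites.
    # A dense matrix is fine for a sketch, not for a whole chromosome.
    diff = (pos[:, None] - pos[None, :]) / bandwidth
    kern = np.exp(-0.5 * diff ** 2)
    kern /= kern.sum(axis=1, keepdims=True)

    eps = 1e-6
    pi = np.full(pos.shape, 0.5)   # assumed starting value
    p0, p1 = 0.1, 0.9              # assumed starting values

    for _ in range(n_iter):
        # E-step: posterior probability of "methylated" at each site.
        # The binomial coefficient cancels between the two states.
        log1 = np.log(pi) + meth * np.log(p1) + (total - meth) * np.log(1 - p1)
        log0 = np.log(1 - pi) + meth * np.log(p0) + (total - meth) * np.log(1 - p0)
        w = 1.0 / (1.0 + np.exp(np.clip(log0 - log1, -700, 700)))

        # Smoothed M-step: kernel-weighted average of the posteriors.
        pi = np.clip(kern @ w, eps, 1 - eps)

        # M-step for the read-level rates, pooled over all sites.
        p1 = np.clip((w * meth).sum() / (w * total).sum(), eps, 1 - eps)
        p0 = np.clip(((1 - w) * meth).sum() / ((1 - w) * total).sum(), eps, 1 - eps)

    return w, pi, p0, p1
```

In this sketch, thresholding `w` at 0.5 would give a per-site call, and `pi` plays the role of a smoothed methylation level. The kernel step is where information is borrowed from neighbouring CpG sites.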
Keywords: DNA methylation; EM-algorithm; Kernel smoothing; MethylC-Seq; RRBS
Persistent link: https://EconPapers.repec.org/RePEc:bpj:sagmbi:v:16:y:2017:i:5-6:p:333-347:n:3
Statistical Applications in Genetics and Molecular Biology is currently edited by Michael P. H. Stumpf
More articles in Statistical Applications in Genetics and Molecular Biology from De Gruyter