Double truncation method for controlling local false discovery rate in case of spiky null
Shinjune Kim,
Youngjae Oh,
Johan Lim,
DoHwan Park,
Erin M. Green,
Mark L. Ramos and
Jaesik Jeong
Additional contact information
Shinjune Kim: Inha University Hospital
Youngjae Oh: Korea National Statistics Office
Johan Lim: Seoul National University
DoHwan Park: University of Maryland
Erin M. Green: University of Maryland
Mark L. Ramos: National Institute of Health
Jaesik Jeong: Chonnam National University
Computational Statistics, 2025, vol. 40, issue 2, No 7, 745-766
Abstract:
Many multiple testing procedures that control the false discovery rate have been developed to identify cases (e.g., genes) showing a statistically significant difference between two groups. However, some practical data sets have highly spiky null distributions. Existing methods struggle to control type I error in such cases because of “inflated false positives,” and this problem has not been addressed in the previous literature. Our team recently encountered this issue while analyzing SET4 gene deletion data and proposed modeling the null distribution using a scale mixture normal distribution. However, that approach is of limited use because it relies on strong assumptions about the spiky peak. In this paper, we present a novel multiple testing procedure that can be applied to any type of spiky peak data, including situations with no spiky peak or with one or two spiky peaks. Our approach truncates both the central statistics around 0, which primarily contribute to the null spike, and the two tails, which may be contaminated by alternative distributions. We refer to this as the “double truncation method.” After applying double truncation, we estimate the null density using the doubly truncated maximum likelihood estimator. Using simulated data, we demonstrate numerically that the proposed method effectively controls the false discovery rate at the desired level. Furthermore, we apply our method to two real data sets, namely the SET protein data and the peony data.
Keywords: Doubly truncated maximum likelihood estimator; Local false discovery rate (FDR); Multiple testing; Tail-area FDR; SET protein data; Spiky null
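The double truncation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a normal null N(mu, sigma^2), a symmetric retained region inner < |z| < outer, and hypothetical cutoffs (0.5 and 2.0) chosen only for this example. The function name and the synthetic data are likewise assumptions. The sketch discards the central statistics and the two tails, then fits the null by maximizing the likelihood of a normal density renormalized over the retained region.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def fit_doubly_truncated_normal(z, inner, outer):
    """Fit N(mu, sigma^2) to the statistics with inner < |z| < outer.

    Illustrative only: the normal null, the symmetric cutoffs, and this
    function name are assumptions, not taken from the paper.
    """
    z = np.asarray(z, dtype=float)
    kept = z[(np.abs(z) > inner) & (np.abs(z) < outer)]

    def neg_log_lik(params):
        mu, log_sigma = params
        sigma = np.exp(log_sigma)  # optimize log(sigma) so sigma stays positive
        # Probability that N(mu, sigma^2) falls in the two retained bands.
        mass = (norm.cdf(-inner, mu, sigma) - norm.cdf(-outer, mu, sigma)
                + norm.cdf(outer, mu, sigma) - norm.cdf(inner, mu, sigma))
        mass = max(mass, 1e-300)  # guard against log(0)
        log_lik = norm.logpdf(kept, mu, sigma).sum() - kept.size * np.log(mass)
        return -log_lik

    result = minimize(neg_log_lik, np.array([0.0, 0.0]), method="Nelder-Mead")
    return float(result.x[0]), float(np.exp(result.x[1]))


# Synthetic data (an assumption for illustration): a N(0, 1) null component,
# a narrow spike at 0, and a small group of shifted alternatives.
rng = np.random.default_rng(0)
z = np.concatenate([
    rng.normal(0.0, 1.0, 20000),   # regular null statistics
    rng.normal(0.0, 0.05, 3000),   # spike around 0
    rng.normal(4.0, 1.0, 500),     # alternatives in the right tail
])

mu_hat, sigma_hat = fit_doubly_truncated_normal(z, inner=0.5, outer=2.0)
print(mu_hat, sigma_hat)
```

In this synthetic setting the spike lies entirely inside the removed center and nearly all alternatives lie in the removed tail, so the fit is driven by the regular null component. How the cutoffs should be chosen in practice is not stated in the abstract and is left open here.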
Date: 2025
Downloads: http://link.springer.com/10.1007/s00180-024-01510-4 (abstract, text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:compst:v:40:y:2025:i:2:d:10.1007_s00180-024-01510-4
Ordering information: This journal article can be ordered from
http://www.springer.com/statistics/journal/180/PS2
DOI: 10.1007/s00180-024-01510-4
Computational Statistics is currently edited by Wataru Sakamoto, Ricardo Cao and Jürgen Symanzik
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.