Double-Scan Statistics

Naus, J. I.; Stefanov, V. T.

J. I. Naus: Rutgers University
V. T. Stefanov: The University of Western Australia

Methodology and Computing in Applied Probability, 2002, vol. 4, issue 2, 163-180

Abstract: Researchers frequently scan sequences for unusual clustering of events. Glaz et al. (2001) survey scan statistic tools developed for these analyses. Many of these tools deal with clustering of one type of event. In other applications the researcher scans for clusters of two types of events, A and B. Consider a sequence of D independent and identically distributed trials where each trial has one of four possible outcomes: A^c ∩ B^c, A ∩ B^c, A^c ∩ B, A ∩ B. When the events A and B both occur within d consecutive trials, we say that a two-type d-cluster has occurred (a directional cluster is also defined, which requires that the A event comes at least as early as the B event). Naus and Wartenberg (1997) develop a double scan statistic that counts the number of declumped (a type of non-overlapping) clusters that contain at least one of each of two different types of events. They derived the expectation, the variance, and a Poisson approximation for the distribution of the double scan statistic. The approximation and declumping methods used work well when the events are relatively rare, but not as well when the two types of events occur with high frequency. This paper develops an alternative family of double scan statistics to count the number of non-overlapping two-type d-clusters. These new double scan statistics behave similarly to the Naus-Wartenberg statistic for rare events, but capture other information in the denser event case. Exact and approximate results are derived for the distribution of the new double scan statistics, allowing their use over a wider range of event densities. The double scan statistics are compared for the epidemiologic application in Naus and Wartenberg, and for a molecular biology application involving genome versus genome protein hits.
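To make the cluster definition in the abstract concrete, the following is a minimal Python sketch of one way to count non-overlapping two-type d-clusters in a sequence of trials. It is an illustration only, not the statistic as defined in the paper: the function name `count_two_type_clusters`, the encoding of each trial as an `(a, b)` pair of booleans, and the greedy left-to-right counting rule (count a cluster as soon as an A and a B fall within d consecutive trials, then restart the scan at the next trial) are all assumptions made here.

```python
from typing import Iterable, Optional, Tuple


def count_two_type_clusters(
    trials: Iterable[Tuple[bool, bool]],
    d: int,
    directional: bool = False,
) -> int:
    """Greedy count of non-overlapping two-type d-clusters (illustrative only).

    Each trial is a pair (a, b): a is True if event A occurred on that trial,
    b is True if event B occurred. The counting rule is an assumption of this
    sketch, not taken from the paper.
    """
    if d < 1:
        raise ValueError("d must be a positive integer")

    count = 0
    last_a: Optional[int] = None  # index of most recent A since the last reset
    last_b: Optional[int] = None  # index of most recent B since the last reset

    for i, (a, b) in enumerate(trials):
        if a:
            last_a = i
        if b:
            last_b = i
        if last_a is None or last_b is None:
            continue

        # A and B lie within d consecutive trials when their indices
        # differ by at most d - 1.
        within_window = abs(last_a - last_b) <= d - 1
        # Directional variant: A must come at least as early as B.
        ordered = (last_a <= last_b) if directional else True

        if within_window and ordered:
            count += 1
            # Restart so that counted clusters do not overlap.
            last_a = None
            last_b = None

    return count


# Hypothetical example sequence: (A occurred, B occurred) per trial.
example = [
    (True, False), (False, False), (False, True), (False, False),
    (True, True), (False, True), (True, False),
]
clusters_d3 = count_two_type_clusters(example, d=3)
clusters_d3_directional = count_two_type_clusters(example, d=3, directional=True)
```

A trial on which both A and B occur counts as a cluster for any d, since both events then fall within a single trial. With `directional=True` the last two trials of the example no longer form a cluster, because the B event there precedes the A event.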

Keywords: scan statistic; multiple outcomes; clusters; two-types of events; double-scan
Date: 2002

Downloads:
http://link.springer.com/10.1023/A:1020641624294 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:metcap:v:4:y:2002:i:2:d:10.1023_a:1020641624294

Ordering information: This journal article can be ordered from
https://www.springer.com/journal/11009

DOI: 10.1023/A:1020641624294

Methodology and Computing in Applied Probability is currently edited by Joseph Glaz
