Large enough sample size to rank two groups of data reliably according to their means

Shen, Zhesi; Yang, Liying; Di, Zengru; Wu, Jinshan

Zhesi Shen, Liying Yang, Zengru Di and Jinshan Wu
Additional contact information
Zhesi Shen: Chinese Academy of Sciences
Liying Yang: Chinese Academy of Sciences
Zengru Di: Beijing Normal University
Jinshan Wu: Beijing Normal University

Scientometrics, 2019, vol. 118, issue 2, No 14, 653-671

Abstract: Often we need to compare two sets of data, say X and Y, and we often do so by comparing their means $\mu_{X}$ and $\mu_{Y}$. However, when the two sets overlap heavily (for example, when $\sqrt{\sigma^{2}_{X}+\sigma^{2}_{Y}} \gg \left|\mu_{X}-\mu_{Y}\right|$), ranking the two sets according to their means might not be reliable. In the one-by-one comparison, we take one sample from each set at a time and compare the two samples. In the $K_{X}$-by-$K_{Y}$ comparison, we take $K_{X}$ samples $\left\{x_{1}, x_{2}, \ldots, x_{K_{X}}\right\}$ from one set and $K_{Y}$ samples $\left\{y_{1}, y_{2}, \ldots, y_{K_{Y}}\right\}$ from the other set at a time and compare the averages $\frac{\sum_{j=1}^{K_{X}} x_{j}}{K_{X}}$ and $\frac{\sum_{j=1}^{K_{Y}} y_{j}}{K_{Y}}$. Based on the observation that replacing the former with the latter reduces the overlap and thus improves the reliability, we propose a definition of the minimum representative size $\kappa$ of each set for comparing sets by requiring, roughly speaking, $\sqrt{\sigma^{2}_{K_{X}}+\sigma^{2}_{K_{Y}}} \ll \left|\mu_{X}-\mu_{Y}\right|$. Applied to journal comparison, this minimum representative size $\kappa$ might be used as a complementary index to the journal impact factor (JIF), indicating how reliable it is to compare two journals using their JIFs. More generally, the idea of a minimum representative size can be used whenever two sets of data with overlapping distributions are compared.
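The K-by-K comparison described in the abstract can be illustrated with a small bootstrap sketch. For independent draws from a set with finite variance, the standard deviation of an average of K samples shrinks like 1/sqrt(K), which is why averaging reduces the overlap. The code below is only an illustration under stated assumptions: the function names `win_rate` and `min_size`, the equal sizes K_X = K_Y = k, the doubling search, and the 0.9 win-rate threshold are all hypothetical choices made here, and the paper's actual definition of $\kappa$ is not given in the abstract and may differ.

```python
import random
import statistics


def win_rate(xs, ys, k_x, k_y, trials=2000, rng=None):
    """Fraction of K_X-by-K_Y comparisons in which the average of the
    resampled xs exceeds the average of the resampled ys."""
    if rng is None:
        rng = random.Random(0)
    wins = 0
    for _ in range(trials):
        # Bootstrap step: draw with replacement from each observed set.
        mean_x = statistics.fmean(rng.choices(xs, k=k_x))
        mean_y = statistics.fmean(rng.choices(ys, k=k_y))
        wins += mean_x > mean_y
    return wins / trials


def min_size(xs, ys, threshold=0.9, max_k=512, trials=2000, seed=0):
    """Smallest k among 1, 2, 4, ... for which the set with the larger
    overall mean wins at least `threshold` of the k-by-k comparisons.

    Returns None if no k up to max_k qualifies. The threshold rule is an
    assumption of this sketch, not the paper's definition of kappa.
    """
    # Orient the comparison so that `hi` is the set with the larger mean.
    if statistics.fmean(xs) >= statistics.fmean(ys):
        hi, lo = xs, ys
    else:
        hi, lo = ys, xs
    rng = random.Random(seed)
    k = 1
    while k <= max_k:
        if win_rate(hi, lo, k, k, trials, rng) >= threshold:
            return k
        k *= 2
    return None
```

For two heavily overlapping sets, `win_rate(xs, ys, 1, 1)` stays close to one half, while the same call with a larger k moves toward 1; `min_size(xs, ys)` reports the first doubled k at which the ranking by averages becomes stable under the assumed threshold.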

Keywords: Journal impact factor; Minimum representative size; Bootstrap sampling
Date: 2019

Downloads: (external link)
http://link.springer.com/10.1007/s11192-018-2995-0 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:scient:v:118:y:2019:i:2:d:10.1007_s11192-018-2995-0

Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/11192

DOI: 10.1007/s11192-018-2995-0

Scientometrics is currently edited by Wolfgang Glänzel
