Which similarity measure to use in network analysis: Impact of sample size on phi correlation coefficient and Ochiai index
Pankush Kalgotra, Ramesh Sharda and Andy Luse
International Journal of Information Management, 2020, vol. 55, issue C
Abstract:
Some networks are explicit: members make direct connections (e.g., the Facebook network). Others are implicit (e.g., a co-citation network): an edge between two nodes is inferred using a similarity index. Choosing the right index to infer connections in an implicit (inferred) network is important because conclusions can be biased if the network does not represent the true relationships. In this study, we compared two indexes, the Phi Correlation Coefficient (PCC) and the Ochiai Coefficient (Och), based on their sensitivity to the sample size of the transactions from which the network is inferred. For demonstration, we used an implicit network, called a comorbidity network, developed from the health records of 22.1 million patients. The networks were compared based on their overall topologies and node centralities. Results showed that the network formed using Och was more robust to sample size than the network formed using PCC. The network using Och followed a small-world topology irrespective of sample size, whereas the structure of the network using PCC was inconsistent. Among the node centralities, betweenness centrality was most affected by sample size. Our results lead us to recommend Och over PCC.
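To make the two indexes concrete, here is a minimal sketch using the standard textbook definitions of the phi coefficient and the Ochiai coefficient for a pair of binary attributes (for example, two diagnoses). It is not taken from the article, and the function and parameter names (`phi_coefficient`, `ochiai`, `n_ab`, `n_a`, `n_b`, `n`) are assumptions made for illustration. Under these definitions, phi depends on the total number of records `n`, while Ochiai uses only the two marginal counts and the co-occurrence count.

```python
import math


def phi_coefficient(n_ab: int, n_a: int, n_b: int, n: int) -> float:
    """Phi correlation for two binary attributes A and B.

    n_ab: records containing both A and B
    n_a:  records containing A
    n_b:  records containing B
    n:    total number of records
    """
    denom = math.sqrt(n_a * n_b * (n - n_a) * (n - n_b))
    if denom == 0:
        # Undefined when an attribute is present in none or all records.
        return float("nan")
    return (n * n_ab - n_a * n_b) / denom


def ochiai(n_ab: int, n_a: int, n_b: int) -> float:
    """Ochiai coefficient (cosine similarity of two binary vectors)."""
    denom = math.sqrt(n_a * n_b)
    if denom == 0:
        # Undefined when either attribute never occurs.
        return float("nan")
    return n_ab / denom


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the formulas.
    n_ab, n_a, n_b = 20, 50, 80
    for n in (1_000, 10_000):
        print(n, phi_coefficient(n_ab, n_a, n_b, n), ochiai(n_ab, n_a, n_b))
```

Holding the co-occurrence and marginal counts fixed while changing only `n` is a simplification and not the article's sampling procedure. It only shows that records containing neither attribute enter the phi formula but not the Ochiai formula.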
Keywords: Implicit network; Analytics; Comorbidity network; Sample size; Inferred network; Ochiai coefficient; Phi correlation coefficient
Date: 2020
Downloads:
http://www.sciencedirect.com/science/article/pii/S0268401220314286
Full text for ScienceDirect subscribers only
Persistent link: https://EconPapers.repec.org/RePEc:eee:ininma:v:55:y:2020:i:c:s0268401220314286
DOI: 10.1016/j.ijinfomgt.2020.102229
International Journal of Information Management is currently edited by Yogesh K. Dwivedi
More articles in International Journal of Information Management from Elsevier
Bibliographic data for series maintained by Catherine Liu.