Effect of class imbalance in heterogeneous network embedding: An empirical study

Akash Anil and Sanasam Ranbir Singh

Journal of Informetrics, 2020, vol. 14, issue 2

Abstract: Network science has been extensively explored for solving various bibliometrics tasks such as Co-authorship prediction, Author classification, Author clustering, Author ranking, and Paper ranking. While the majority of past studies exploit homogeneous bibliographic networks (consisting of a single type of node and edge), there has been a recent surge in modeling heterogeneous bibliographic entities and their inter-dependencies using heterogeneous information networks (HIN). Unlike homogeneous bibliographic networks, a bibliographic HIN consists of multi-typed nodes such as Author, Paper, and Venue, together with the corresponding relations. A bibliographic HIN is therefore more complex and captures richer semantics of the underlying bibliographic data, but it also poses more challenges. Since a real-world HIN may have a different number of instances for different node types, class imbalance is ubiquitous. Recent studies discuss class imbalance only briefly and exploit meta-path-based strategies to address the issue. However, no work quantitatively studies the effect of class imbalance on solving real-world bibliometrics tasks. Therefore, this paper first proposes a metric to estimate class imbalance in a HIN and then studies the effects of class imbalance on two bibliometrics tasks, namely (i) Co-authorship prediction and (ii) Author's research area classification, using node features generated by network embedding-based frameworks for the DBLP dataset. The experimental analyses show that class imbalance is an inherent characteristic of bibliographic HINs and that, for better performance on the above-mentioned bibliometrics tasks, bibliographic HINs must consider Author, Paper, and Venue as node types.

Keywords: Heterogeneous information network; Network embedding; Meta-path; Class imbalance
Date: 2020

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S1751157719301051
Full text for ScienceDirect subscribers only

Persistent link: https://EconPapers.repec.org/RePEc:eee:infome:v:14:y:2020:i:2:s1751157719301051

DOI: 10.1016/j.joi.2020.101009

Journal of Informetrics is currently edited by Leo Egghe

Page updated 2025-03-19
Handle: RePEc:eee:infome:v:14:y:2020:i:2:s1751157719301051