EconPapers    
Economics at your fingertips  
 

Ethnicity‐based name partitioning for author name disambiguation using supervised machine learning

Jinseok Kim, Jenna Kim and Jason Owen‐Smith

Journal of the Association for Information Science & Technology, 2021, vol. 72, issue 8, 979-994

Abstract: In several author name disambiguation studies, some ethnic name groups such as East Asian names are reported to be more difficult to disambiguate than others. This implies that disambiguation approaches might be improved if ethnic name groups are distinguished before disambiguation. We explore the potential of ethnic name partitioning by comparing performance of four machine learning algorithms trained and tested on the entire data or specifically on individual name groups. Results show that ethnicity‐based name partitioning can substantially improve disambiguation performance because the individual models are better suited for their respective name group. The improvements occur across all ethnic name groups with different magnitudes. Performance gains in predicting matched name pairs outweigh losses in predicting nonmatched pairs. Feature (e.g., coauthor name) similarities of name pairs vary across ethnic name groups. Such differences may enable the development of ethnicity‐specific feature weights to improve prediction for specific ethic name categories. These findings are observed for three labeled data with a natural distribution of problem sizes as well as one in which all ethnic name groups are controlled for the same sizes of ambiguous names. This study is expected to motive scholars to group author names based on ethnicity prior to disambiguation.

Date: 2021
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (3)

Downloads: (external link)
https://doi.org/10.1002/asi.24459

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:bla:jinfst:v:72:y:2021:i:8:p:979-994

Ordering information: This journal article can be ordered from
http://www.blackwell ... bs.asp?ref=2330-1635

Access Statistics for this article

More articles in Journal of the Association for Information Science & Technology from Association for Information Science & Technology
Bibliographic data for series maintained by Wiley Content Delivery ().

 
Page updated 2025-03-19
Handle: RePEc:bla:jinfst:v:72:y:2021:i:8:p:979-994