A novel methodology to disambiguate organization names: an application to EU Framework Programmes data

Andrea Ancona, Roy Cerqueti and Gianluca Vagnani
Andrea Ancona: Sapienza University of Rome
Roy Cerqueti: Sapienza University of Rome
Gianluca Vagnani: Sapienza University of Rome

Scientometrics, 2023, vol. 128, issue 8, No 11, 4447-4474

Abstract: Collaborative R&D has attracted increasing interest among scholars and policy-makers, and collaboration is now a pivotal determinant of innovation. Reliable data are a necessary condition for obtaining valuable results. Specifically, in a collaborative environment, we must avoid mistaken identities among organizations. In many datasets, however, the same organization appears under several name variants, so its information is split across multiple entities. In this work, we propose a novel methodology to disambiguate organization names. In particular, we combine supervised and unsupervised techniques to design a “hybrid” methodology that is neither fully automated nor completely manual and that is easy to adapt to many different datasets. The flexibility and potential scalability of the methodology make this paper a worthwhile contribution to different research fields. We provide an empirical application of the methodology to the dataset of participants in projects funded by the first three European Framework Programmes. We chose this dataset because it lets us test the quality of our procedure by comparing the refined dataset it returns to a well-recognized benchmark (i.e., the EUPRO database) in terms of the connection structure of the collaborative networks. Our results show the advantages of our approach in terms of both the quality of the obtained dataset and the efficiency of the designed methodology, leaving room for the integration of affiliation hierarchies in the future.
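
The abstract does not say which techniques the authors combine, so the following is only a minimal illustrative sketch of what a "neither fully automated nor completely manual" workflow could look like, not the paper's procedure. It assumes a simple string-similarity score from Python's standard difflib module, a hypothetical list of legal-form tokens, and two hypothetical thresholds: pairs scoring above the higher one are merged automatically, and pairs in the middle band are set aside for a human reviewer.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical legal-form tokens to drop; a real list would be dataset-specific.
LEGAL_FORMS = {"ltd", "gmbh", "spa", "plc", "inc"}

def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop legal-form tokens."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_FORMS)

def triage(names, auto=0.95, review=0.80):
    """Split name pairs into automatic merges and pairs needing manual review.

    The thresholds `auto` and `review` are illustrative assumptions.
    """
    merged, to_review = [], []
    for a, b in combinations(names, 2):
        score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if score >= auto:
            merged.append((a, b))       # confident match: merge without review
        elif score >= review:
            to_review.append((a, b))    # uncertain match: hand to a reviewer
    return merged, to_review

names = [
    "Philips Ltd",
    "PHILIPS LTD.",
    "Sapienza University of Rome",
    "Sapienza Univ. of Rome",
]
merged, to_review = triage(names)
```

Here the two Philips variants normalize to the same string and are merged automatically, while the two Sapienza variants fall in the middle band and are left for manual confirmation. Comparing every pair is quadratic in the number of names, so a larger dataset would need some form of blocking before scoring.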

Keywords: Organization name disambiguation; Hybrid methodology; Institutions; Labels; Collaborative networks; EU Framework Programmes
Date: 2023

Access to the full text of the articles in this series is restricted.

DOI: 10.1007/s11192-023-04746-x

Scientometrics is currently edited by Wolfgang Glänzel

More articles in Scientometrics from Springer, Akadémiai Kiadó
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

Page updated 2024-07-01
Handle: RePEc:spr:scient:v:128:y:2023:i:8:d:10.1007_s11192-023-04746-x