Integrating Large Citation Datasets

Inci Yueksel-Erguen, Ida Litzel and Hanqiu Peng
Additional contact information
Inci Yueksel-Erguen: Zuse Institute Berlin
Ida Litzel: Zuse Institute Berlin
Hanqiu Peng: National University of Singapore

A chapter in Operations Research Proceedings 2024, 2025, pp 46-52 from Springer

Abstract: This paper explores methods for building a comprehensive citation graph using big data techniques to evaluate scientific impact more accurately. Traditional citation metrics have limitations, and this work investigates merging large citation datasets to create a more accurate picture. Big data challenges, such as inconsistent data formats and the lack of unique identifiers, are addressed through deduplication, resulting in a streamlined and reliable merged dataset with over 119 million records and 1.4 billion citations. We demonstrate that merging large citation datasets yields a more accurate citation graph, which supports a more robust evaluation of scientific impact.
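
The abstract does not describe the authors' deduplication procedure, so the following is only a minimal sketch of the general idea, not the chapter's method. It assumes each dataset is a dictionary with hypothetical `records` and `citations` fields, uses the DOI as the matching key when one is present, and otherwise falls back to a normalized title plus publication year. All function and field names are illustrative.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def record_key(record: dict) -> tuple:
    """Assumed matching rule: DOI if present, else normalized title + year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    return ("title", normalize_title(record.get("title", "")), record.get("year"))


def merge_datasets(datasets):
    """Merge datasets into (unique records, deduplicated citation edges)."""
    canonical = {}  # matching key -> canonical integer id
    id_map = {}     # (dataset index, local id) -> canonical integer id
    records = []
    for ds_index, dataset in enumerate(datasets):
        for rec in dataset["records"]:
            key = record_key(rec)
            if key not in canonical:
                canonical[key] = len(records)
                records.append(rec)
            id_map[(ds_index, rec["id"])] = canonical[key]
    edges = set()  # a set drops citations reported by more than one dataset
    for ds_index, dataset in enumerate(datasets):
        for src, dst in dataset["citations"]:
            edges.add((id_map[(ds_index, src)], id_map[(ds_index, dst)]))
    return records, edges


# Two toy datasets with different local id schemes and inconsistent formatting.
a = {
    "records": [
        {"id": "a1", "doi": "10.1/X", "title": "Graph Theory", "year": 2001},
        {"id": "a2", "title": "Big Data: Methods", "year": 2010},
    ],
    "citations": [("a2", "a1")],
}
b = {
    "records": [
        {"id": 7, "doi": "10.1/x", "title": "Graph theory.", "year": 2001},
        {"id": 8, "title": "big data methods", "year": 2010},
        {"id": 9, "title": "Another Paper", "year": 2015},
    ],
    "citations": [(8, 7), (9, 7)],
}
records, edges = merge_datasets([a, b])
print(len(records), sorted(edges))  # → 3 [(1, 0), (2, 0)]
```

In this sketch, a record that carries a DOI in one dataset but not in the other would not be matched. A pipeline at the scale reported in the abstract would need additional matching rules and a distributed implementation, neither of which is shown here.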

Keywords: big data preprocessing; data analytics; citation graphs
Date: 2025

Persistent link: https://EconPapers.repec.org/RePEc:spr:lnopch:978-3-031-92575-7_7

Ordering information: This item can be ordered from
http://www.springer.com/9783031925757

DOI: 10.1007/978-3-031-92575-7_7

More chapters in Lecture Notes in Operations Research from Springer
 
Page updated 2025-08-29
Handle: RePEc:spr:lnopch:978-3-031-92575-7_7