Improving the Quality of Linked Data Using Statistical Distributions

Heiko Paulheim and Christian Bizer
Additional contact information
Heiko Paulheim: Data and Web Science Group, University of Mannheim, Mannheim, Germany
Christian Bizer: Data and Web Science Group, University of Mannheim, Mannheim, Germany

International Journal on Semantic Web and Information Systems (IJSWIS), 2014, vol. 10, issue 2, 63-86

Abstract: Linked Data on the Web is created from structured data sources (such as relational databases), from semi-structured sources (such as Wikipedia), or from unstructured sources (such as text). In the latter two cases, the generated Linked Data will likely be noisy and incomplete. In this paper, we present two algorithms that exploit statistical distributions of properties and types to enhance the quality of incomplete and noisy Linked Data sets: SDType adds missing type statements, and SDValidate identifies faulty statements. Neither algorithm uses external knowledge, i.e., both operate only on the data itself. We evaluate the algorithms on the DBpedia and NELL knowledge bases, showing that they are both accurate and scalable. Both algorithms have been used for building the DBpedia 3.9 release: with SDType, 3.4 million missing type statements have been added, and with SDValidate, 13,000 erroneous RDF statements have been removed from the knowledge base.
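
The abstract gives no algorithmic detail, so the following Python sketch is only an illustration of the general idea of inferring a missing type from the statistical distribution of types per property. It is not the authors' SDType implementation. The function names, the toy triples, the unweighted averaging, and the 0.5 threshold are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def learn_type_distributions(triples, types):
    """Estimate P(type | a resource is the subject of a property).

    Hypothetical helper: counts, for every property, how often its
    typed subjects carry each type.
    """
    counts = defaultdict(Counter)  # property -> Counter of subject types
    totals = Counter()             # property -> number of typed subject usages
    for subject, prop, _ in triples:
        subject_types = types.get(subject)
        if not subject_types:
            continue  # untyped subjects contribute no evidence
        totals[prop] += 1
        for t in subject_types:
            counts[prop][t] += 1
    return {p: {t: c / totals[p] for t, c in counts[p].items()} for p in counts}


def predict_types(resource, triples, distributions, threshold=0.5):
    """Average the per-property type distributions of a resource.

    Assumption: every property gets the same weight, and a type is
    accepted when its averaged score reaches the threshold.
    """
    props = [p for s, p, _ in triples if s == resource and p in distributions]
    if not props:
        return {}
    scores = Counter()
    for p in props:
        for t, prob in distributions[p].items():
            scores[t] += prob / len(props)
    return {t: s for t, s in scores.items() if s >= threshold}


# Toy data, invented for illustration only.
triples = [
    ("Berlin", "population", "3.6M"),
    ("Berlin", "mayor", "X"),
    ("Paris", "population", "2.1M"),
    ("Paris", "mayor", "Y"),
    ("Einstein", "birthPlace", "Ulm"),
    ("Mannheim", "population", "0.3M"),  # Mannheim has no type statement
]
types = {"Berlin": {"City"}, "Paris": {"City"}, "Einstein": {"Person"}}

distributions = learn_type_distributions(triples, types)
predicted = predict_types("Mannheim", triples, distributions)
```

In this toy data every typed subject of `population` is a `City`, so the untyped resource `Mannheim` is assigned `City`. The sketch uses only the data set itself, in line with the abstract's statement that no external knowledge is needed.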

Date: 2014

Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 18/ijswis.2014040104 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:igg:jswis0:v:10:y:2014:i:2:p:63-86

International Journal on Semantic Web and Information Systems (IJSWIS) is published by IGI Global and is currently edited by Brij Gupta.

Page updated 2025-03-19
Handle: RePEc:igg:jswis0:v:10:y:2014:i:2:p:63-86