
An Unsupervised Entity Resolution Framework for English and Arabic Datasets

Abdelkrim Ouhab, Mimoun Malki, Djamel Berrabah and Faouzi Boufares
Additional contact information
Abdelkrim Ouhab: EEDIS Laboratory, Djillali Liabes University, Sidi Bel Abbes, Algeria
Mimoun Malki: LabRI-SBA Laboratory, Ecole Supérieure en Informatique de Sidi Bel Abbes, Sidi Bel Abbes, Algeria
Djamel Berrabah: EEDIS Laboratory, Djillali Liabes University, Sidi Bel Abbes, Algeria
Faouzi Boufares: LIPN Laboratory, Paris 13 University, Villetaneuse, France

International Journal of Strategic Information Technology and Applications (IJSITA), 2017, vol. 8, issue 4, 16-29

Abstract: Entity resolution (ER) is an important step in data integration and in many data mining projects; its goal is to identify records that refer to the same real-world entity. Most existing ER frameworks focus on datasets in Latin-script languages and do not support Arabic. In this article, the authors present an unsupervised ER framework that supports both English and Arabic datasets. Rather than relying on matching rules written by an expert or on manually labeled training examples, the proposed framework automatically generates its own training set. The generated training set is then used to train a classifier, and the learned classification model is used to perform ER. The proposed framework was implemented and tested on three Arabic datasets and four English datasets. Experimental results show that the proposed framework is competitive with supervised approaches and outperforms recently proposed unsupervised approaches in terms of F-measure.
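
The three stages named in the abstract (generate a training set automatically, train a classifier, apply it to all candidate pairs) can be illustrated with a minimal Python sketch. The abstract does not say how the training set is generated, which similarity measure is used, or which classifier is trained, so everything below is an assumption made for illustration: the threshold-based seed selection, the `high` and `low` values, the use of `difflib.SequenceMatcher`, the nearest-centroid classifier, and all function names are hypothetical stand-ins, not the authors' method.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity_vector(a, b):
    """Per-field string similarity in [0, 1] for two records (tuples of strings)."""
    return [SequenceMatcher(None, x, y).ratio() for x, y in zip(a, b)]


def generate_training_set(vectors, high, low):
    """Assumed heuristic: label only the clear-cut pairs and skip the rest."""
    seeds = []
    for vec in vectors:
        mean = sum(vec) / len(vec)
        if mean >= high:
            seeds.append((vec, 1))  # confident match
        elif mean <= low:
            seeds.append((vec, 0))  # confident non-match
    return seeds


def train_centroids(seeds):
    """Stand-in classifier: one mean similarity vector per class."""
    centroids = {}
    for label in (0, 1):
        rows = [vec for vec, y in seeds if y == label]
        if not rows:
            raise ValueError(f"no seed examples for class {label}")
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def classify(vec, centroids):
    """Return the label whose centroid is nearest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((v - c) ** 2 for v, c in zip(vec, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))


def resolve(records, high=0.95, low=0.3):
    """Run all three stages; return index pairs classified as matches."""
    pairs = list(combinations(range(len(records)), 2))
    vectors = [similarity_vector(records[i], records[j]) for i, j in pairs]
    seeds = generate_training_set(vectors, high, low)  # stage 1: no manual labels
    centroids = train_centroids(seeds)                 # stage 2: learn a model
    return [p for p, v in zip(pairs, vectors)          # stage 3: classify every pair
            if classify(v, centroids) == 1]


# Toy data (invented for this sketch): (name, city) records in English and Arabic.
records = [
    ("John Smith", "London"),
    ("John Smith", "London"),
    ("J. Smith", "London"),
    ("محمد علي", "الجزائر"),
    ("محمد علي", "الجزائر"),
    ("Maria Garcia", "Madrid"),
]
matches = resolve(records)
print(matches)
```

In this toy data the exact duplicates act as match seeds and the clearly different records as non-match seeds. The abbreviated record `("J. Smith", "London")` falls between the two thresholds when compared with the `John Smith` records, so those pairs are labeled by the learned model rather than by the seed rule. Comparing all pairs is quadratic in the number of records; a practical system would normally add a blocking step first, which this sketch omits.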

Date: 2017

Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 18/IJSITA.2017100102 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:igg:jsita0:v:8:y:2017:i:4:p:16-29

International Journal of Strategic Information Technology and Applications (IJSITA) is currently edited by Mehdi Khosrow-Pour

More articles in International Journal of Strategic Information Technology and Applications (IJSITA) from IGI Global
 
Page updated 2025-03-19
Handle: RePEc:igg:jsita0:v:8:y:2017:i:4:p:16-29