Distributed Top-K Join Queries Optimizing for RDF Datasets

Jinguang Gu, Hao Dong, Zhao Liu and Fangfang Xu
Additional contact information
Jinguang Gu: College of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China & Hubei Province Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, Wuhan, China
Hao Dong: College of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China & Hubei Province Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, Wuhan, China
Zhao Liu: College of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China & Hubei Province Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, Wuhan, China
Fangfang Xu: College of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China & Hubei Province Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, Wuhan, China

International Journal of Web Services Research (IJWSR), 2017, vol. 14, issue 3, 67-83

Abstract: In recent years, RDF datasets have grown rapidly in scale, and query processing in the traditional centralized environment can no longer meet the growing demand for data queries, especially top-k queries. This paper proposes a novel top-k query method built on the Spark distributed computing system and the HBase distributed storage system. A top-k query plan, STA (Spark Threshold Algorithm), is proposed to reduce the number of join operations on RDF data. A further algorithm, SSJA (Spark Simple Join Algorithm), is presented to reduce sorting-related operations on intermediate data, and a cache mechanism is proposed to speed up SSJA. Experimental results show that SSJA outperforms STA in terms of cost and applicability, and that the cache mechanism significantly improves SSJA's performance.
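
The abstract names two strategies but gives no algorithmic detail, so the following is only a minimal sketch of the general ideas, not the paper's STA or SSJA. It assumes two in-memory score maps joined on a shared key, ranks a joined pair by the sum of its two scores, and uses plain Python rather than Spark or HBase. The names simple_join_top_k and threshold_top_k are hypothetical. The first function joins everything and then selects the k best; the second follows the classic threshold-algorithm idea of scanning score-sorted lists and stopping once no unseen key can beat the current k-th result.

```python
import heapq


def simple_join_top_k(left, right, k):
    """Join two {key: score} maps on key, then keep the k highest summed scores."""
    joined = ((left[key] + right[key], key) for key in left.keys() & right.keys())
    return heapq.nlargest(k, joined)


def threshold_top_k(left, right, k):
    """Threshold-style top-k: scan both lists in descending score order and
    stop as soon as the k-th best joined score reaches the threshold."""
    left_sorted = sorted(left.items(), key=lambda kv: -kv[1])
    right_sorted = sorted(right.items(), key=lambda kv: -kv[1])
    seen = {}  # join key -> summed score, for keys present on both sides
    top = []
    for (lkey, lscore), (rkey, rscore) in zip(left_sorted, right_sorted):
        for key in (lkey, rkey):
            if key not in seen and key in left and key in right:
                seen[key] = left[key] + right[key]
        # No unseen key can score higher than the sum of the current heads.
        threshold = lscore + rscore
        top = heapq.nlargest(k, ((score, key) for key, score in seen.items()))
        if len(top) == k and top[-1][0] >= threshold:
            break
    return top


if __name__ == "__main__":
    # Illustrative data only; integer scores avoid floating-point ties.
    left = {"a": 9, "b": 8, "c": 1, "d": 5}
    right = {"a": 7, "b": 2, "c": 10, "e": 6}
    print(simple_join_top_k(left, right, 2))
    print(threshold_top_k(left, right, 2))
```

On this toy input both functions return the same two pairs, but the threshold variant stops after three of the four sorted positions. Whether early termination pays off in a distributed setting depends on how expensive sorted access and repeated lookups are, which is the kind of trade-off the abstract's comparison of STA and SSJA concerns.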

Date: 2017

Downloads: (external link)
http://services.igi-global.com/resolvedoi/resolve. ... 018/IJWSR.2017070105 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:igg:jwsr00:v:14:y:2017:i:3:p:67-83

International Journal of Web Services Research (IJWSR) is currently edited by Liang-Jie Zhang

More articles in International Journal of Web Services Research (IJWSR) from IGI Global

Handle: RePEc:igg:jwsr00:v:14:y:2017:i:3:p:67-83