JQPro: Join Query Processing in a Distributed System for Big RDF Data Using the Hash-Merge Join Technique
Nahla Mohammed Elzein,
Mazlina Abdul Majid,
Ibrahim Abaker Targio Hashem,
Ashraf Osman Ibrahim,
Anas W. Abulfaraj and
Faisal Binzagr
Additional contact information
Nahla Mohammed Elzein: Faculty of Computer Science, Future University, Khartoum 10553, Sudan
Mazlina Abdul Majid: Faculty of Computing, University Malaysia Pahang, Pekan 26600, Malaysia
Ibrahim Abaker Targio Hashem: Department of Computer Science, College of Computing and Informatics, University of Sharjah, Sharjah 27272, United Arab Emirates
Ashraf Osman Ibrahim: Data Science Programme, Faculty of Computing and Informatics, Universiti Malaysia Sabah, Kota Kinabalu 88400, Malaysia
Anas W. Abulfaraj: Department of Information Systems, King Abdulaziz University, P.O. Box 344, Rabigh 21911, Saudi Arabia
Faisal Binzagr: Department of Computer Science, King Abdulaziz University, P.O. Box 344, Rabigh 21911, Saudi Arabia
Mathematics, 2023, vol. 11, issue 5, 1-20
Abstract:
In the last decade, the volume of semantic data has increased exponentially, with Resource Description Framework (RDF) datasets in RDF repositories exceeding trillions of triples, and these datasets continue to grow. As the number of RDF triples increases, complex multiple RDF queries are becoming a significant demand. Such complex queries sometimes produce many common sub-expressions, either within a single query or across multiple queries running as a batch. It is also difficult to minimize the number of RDF queries and the processing time for a large amount of related data in a typical distributed environment. To address these problems, we introduce a join query processing model for big RDF data, called JQPro. By adopting a MapReduce framework in JQPro, we developed three new algorithms for join query processing of RDF data: hash-join, sort-merge, and enhanced MapReduce-join. In the experiment conducted, the JQPro model outperformed two popular systems, gStore and RDF-3X, with respect to average execution time. The JQPro model was also tested against RDF-3X, RDFox, and PARJs using the LUBM benchmark, and it performed better than these other models. In conclusion, the findings showed that JQPro achieved an 87.77% improvement in terms of execution time and performs better than the selected models.
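The abstract names hash-join and sort-merge as join strategies but gives no detail. As a rough illustration only, the following single-machine Python sketch joins two triple patterns on a shared variable using the textbook forms of both techniques. It is not the authors' JQPro implementation and involves no MapReduce; the toy triples and the helper names `match`, `hash_join`, and `sort_merge_join` are assumptions made for this example.

```python
from collections import defaultdict

# Toy RDF triples (subject, predicate, object); invented for illustration.
TRIPLES = [
    ("alice", "memberOf", "dept1"),
    ("bob", "memberOf", "dept2"),
    ("carol", "memberOf", "dept1"),
    ("dept1", "subOrganizationOf", "univ0"),
    ("dept2", "subOrganizationOf", "univ1"),
]

def match(triples, predicate):
    """Return (subject, object) pairs for the pattern ?s <predicate> ?o."""
    return [(s, o) for s, p, o in triples if p == predicate]

def hash_join(left, right):
    """Join left (?x, ?y) with right (?y, ?z) on ?y using a hash table."""
    table = defaultdict(list)
    for y, z in right:  # build phase over the right input
        table[y].append(z)
    # Probe phase: look up each left row's join key in the table.
    return [(x, y, z) for x, y in left for z in table.get(y, [])]

def sort_merge_join(left, right):
    """Same join, computed by sorting both inputs on ?y and merging."""
    left = sorted(left, key=lambda row: row[1])
    right = sorted(right, key=lambda row: row[0])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        ly, ry = left[i][1], right[j][0]
        if ly < ry:
            i += 1
        elif ly > ry:
            j += 1
        else:
            # Find the run of rows on each side that share this key.
            i_end = i
            while i_end < len(left) and left[i_end][1] == ly:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == ly:
                j_end += 1
            # Emit the cross product of the two runs.
            for x, y in left[i:i_end]:
                for _, z in right[j:j_end]:
                    out.append((x, y, z))
            i, j = i_end, j_end
    return out

members = match(TRIPLES, "memberOf")
orgs = match(TRIPLES, "subOrganizationOf")
hash_result = sorted(hash_join(members, orgs))
merge_result = sorted(sort_merge_join(members, orgs))
```

Both functions answer the same two-pattern query (`?x memberOf ?y . ?y subOrganizationOf ?z`) and return the same rows. The hash variant trades memory for a single pass over each input, while the sort-merge variant pays for sorting but needs no lookup table.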
Keywords: semantic web; distributed computing; RDF; big data; SPARKSQL
JEL-codes: C
Date: 2023
Downloads:
https://www.mdpi.com/2227-7390/11/5/1275/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/5/1275/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:5:p:1275-:d:1089376
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.