Combining Graph Exploration and Fragmentation for Scalable RDF Query Processing

Khelil, Abdallah; Mesmoudi, Amin; Galicia, Jorge; Bellatreche, Ladjel; Hacid, Mohand-Saïd; Coquery, Emmanuel

Additional contact information
Abdallah Khelil: LIAS/ISAE-ENSMA
Amin Mesmoudi: LIAS/University of Poitiers
Jorge Galicia: LIAS/ISAE-ENSMA
Ladjel Bellatreche: LIAS/ISAE-ENSMA
Mohand-Saïd Hacid: LIRIS/University of Lyon
Emmanuel Coquery: LIRIS/University of Lyon

Information Systems Frontiers, 2021, vol. 23, issue 1, No 11, 165-183

Abstract: The flexibility offered by the Resource Description Framework (RDF) has made it a very popular standard for representing data with an undefined or variable schema using the concept of triples. Its success has resulted in many large-scale multidisciplinary datasets that have prompted the development of efficient RDF processing systems. Current approaches fall into two groups: the first adopts the relational model and stores the triples in tables, and the second creates data structures that model RDF data as a graph. The strategies of the first group scale more easily since they apply optimization strategies from the relational model, such as indexing and fragmentation. However, these approaches incur significant overhead when dealing with the complex queries (e.g. compound SPARQL graphs involving filters) that are common in existing applications. On the other hand, graph-based systems that use more complex data structures fail to manage main memory efficiently and do not scale on hardware with limited resources. In this paper, we propose a novel approach to performing queries (Basic Graph Patterns, Wildcards, Aggregations and Sorting) on RDF data. We propose to combine RDF graph exploration with physical fragmentation of triples. In this work, we describe our graph-based storage and query evaluation models. Then, we detail the architecture of our system and explain at length the strategy, based on the Volcano execution model, used to manage main memory at query runtime. We conducted extensive experiments on synthetic and real datasets to evaluate the efficiency of our proposal. We compared our performance with a relational-based approach (Virtuoso), a graph-based approach (gStore) and an intensive-indexing approach (RDF-3X). According to our evaluation, our system offers the best compromise between efficient query processing and scalability.

Keywords: RDF; Graph exploration; Fragmentation; Scalability; Performance
Date: 2021

Downloads: (external link)
http://link.springer.com/10.1007/s10796-020-09998-z Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:infosf:v:23:y:2021:i:1:d:10.1007_s10796-020-09998-z

Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10796

DOI: 10.1007/s10796-020-09998-z

Information Systems Frontiers is currently edited by Ram Ramesh and Raghav Rao

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.