
Megale: A Metadata-Driven Graph-Based System for Data Lake Exploration

Doulkifli Boukraa, Meriem Bouraoui, Chaima Grine and Racha Ouahab
Additional contact information
Doulkifli Boukraa: LaRIA Laboratory, Faculty of Exact Sciences and Computer Science, University of Jijel, Jijel 18000, Algeria
Meriem Bouraoui: Faculty of Exact Sciences and Computer Science, University of Jijel, Jijel 18000, Algeria
Chaima Grine: Faculty of Exact Sciences and Computer Science, University of Jijel, Jijel 18000, Algeria
Racha Ouahab: Faculty of Exact Sciences and Computer Science, University of Jijel, Jijel 18000, Algeria

International Journal of Information Technology & Decision Making (IJITDM), 2025, vol. 24, issue 01, 259-295

Abstract: Data lakes are storage repositories that hold large amounts of data (big data) in their native format, whether structured, semi-structured or unstructured. Data lakes are open to a wide range of use cases, such as carrying out advanced analytics and extracting knowledge patterns. However, simply dumping data into a data lake leads only to a data swamp. To prevent such a situation, enterprises can adopt best practices, one of which is managing data lake metadata. A growing body of research has focused on proposing metadata systems and models for data lakes, with a special interest in model genericity. However, existing models fail to cover all aspects of a data lake because of their static modeling approach. Moreover, they do not fully cover features essential for effective metadata management, namely governance, visibility and uniform treatment of data lake concepts. In this paper, we propose a dynamic modeling approach to provide these features, based on two main constructs: data lake concept and data lake relationship. We showcase our approach with Megale, a graph-based metadata system for NoSQL data lake exploration. We present a proof-of-concept implementation of Megale and show its effectiveness and efficiency in exploring the data lake.

Keywords: Data lake; metadata; model; schema; NoSQL
Date: 2025

Downloads: (external link)
http://www.worldscientific.com/doi/abs/10.1142/S0219622024500135
Access to full text is restricted to subscribers

Persistent link: https://EconPapers.repec.org/RePEc:wsi:ijitdm:v:24:y:2025:i:01:n:s0219622024500135

DOI: 10.1142/S0219622024500135

International Journal of Information Technology & Decision Making (IJITDM) is currently edited by Yong Shi

More articles in International Journal of Information Technology & Decision Making (IJITDM) from World Scientific Publishing Co. Pte. Ltd.
Bibliographic data for series maintained by Tai Tone Lim.

 
Page updated 2025-04-05
Handle: RePEc:wsi:ijitdm:v:24:y:2025:i:01:n:s0219622024500135