A Framework for Current and New Data Quality Dimensions: An Overview
Russell Miller,
Harvey Whelan,
Michael Chrubasik,
David Whittaker,
Paul Duncan and
João Gregório
Additional contact information
Russell Miller: Informatics, Data Science Department, National Physical Laboratory, Glasgow G1 1RD, UK
Harvey Whelan: Informatics, Data Science Department, National Physical Laboratory, Glasgow G1 1RD, UK
Michael Chrubasik: Informatics, Data Science Department, National Physical Laboratory, Glasgow G1 1RD, UK
David Whittaker: Informatics, Data Science Department, National Physical Laboratory, Glasgow G1 1RD, UK
Paul Duncan: Informatics, Data Science Department, National Physical Laboratory, Glasgow G1 1RD, UK
João Gregório: Informatics, Data Science Department, National Physical Laboratory, Glasgow G1 1RD, UK
Data, 2024, vol. 9, issue 12, 1-26
Abstract:
This paper presents a comprehensive exploration of data quality terminology, revealing a significant lack of standardisation in the field. The goal of this work was to conduct a comparative analysis of data quality terminology across different domains and structure it into a hierarchical data model. We propose a novel approach for aggregating the disparate terms used to describe the multiple facets of data quality under common umbrella terms, with a focus on the ISO 25012 standard. We introduce four additional data quality dimensions: governance, usefulness, quantity, and semantics. These dimensions enhance specificity, complement the framework established by the ISO 25012 standard, and contribute to a broad understanding of data quality aspects. The ISO 25012 standard, a general standard for managing data quality in information systems, offers a foundation for the development of our proposed Data Quality Data Model because digital systems are prevalent across a multitude of domains. In contrast, frameworks such as ALCOA+, which were originally developed for specific regulated industries, can be applied more broadly but may not always be generalisable. Ultimately, the model we propose aggregates and classifies data quality terminology, facilitating seamless communication of data quality between different domains when collaboration is required to tackle cross-domain projects or challenges. By establishing this hierarchical model, we aim to improve the understanding and implementation of data quality practices, thereby addressing critical issues in various domains.
Keywords: data quality; data model; data quality dimensions; data traceability; confidence in data; data metrology; data uncertainty; data structures; big data; IoT
JEL-codes: C8 C80 C81 C82 C83
Date: 2024
Downloads:
https://www.mdpi.com/2306-5729/9/12/151/pdf (application/pdf)
https://www.mdpi.com/2306-5729/9/12/151/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jdataj:v:9:y:2024:i:12:p:151-:d:1546270
Data is currently edited by Ms. Cecilia Yang