Data Protection and Multi-Database Data-Driven Models
Lili Jiang and
Vicenç Torra ()
Additional contact information
Lili Jiang: Department of Computing Science, Umeå University, SE-90187 Umeå, Sweden
Vicenç Torra: Department of Computing Science, Umeå University, SE-90187 Umeå, Sweden
Future Internet, 2023, vol. 15, issue 3, 1-13
Abstract:
Anonymization and data masking have effects on data-driven models. Different anonymization methods have been developed to provide a good trade-off between privacy guarantees and data utility. Nevertheless, the effects of data protection (e.g., data microaggregation and noise addition) on data integration and on data-driven models (e.g., machine learning models) built from these data are not known. In this paper, we study how data protection affects data integration, and the corresponding effects on the results of machine learning models built from the outcome of the data integration process. The experimental results show that the levels of protection that prevent proper database integration do not affect machine learning models that learn from the integrated database to the same degree. Concretely, our preliminary analysis and experiments show that data protection techniques have a lower level of impact on data integration than on machine learning models.
Keywords: anonymization; masking; data protection; data integration (search for similar items in EconPapers)
JEL-codes: O3 (search for similar items in EconPapers)
Date: 2023
References: View complete reference list from CitEc
Citations:
Downloads: (external link)
https://www.mdpi.com/1999-5903/15/3/93/pdf (application/pdf)
https://www.mdpi.com/1999-5903/15/3/93/ (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:gam:jftint:v:15:y:2023:i:3:p:93-:d:1082188
Access Statistics for this article
Future Internet is currently edited by Ms. Grace You
More articles in Future Internet from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().