Automatic classification of data-warehouse-data for information lifecycle management using machine learning techniques
Sebastian Büsch,
Volker Nissen () and
Arndt Wünscher
Additional contact information
Sebastian Büsch: Ilmenau University of Technology
Volker Nissen: Ilmenau University of Technology
Arndt Wünscher: Ilmenau University of Technology
Information Systems Frontiers, 2017, vol. 19, issue 5, No 9, 1085-1099
Abstract:
Abstract The aim of Information Lifecycle Management (ILM) is to govern data throughout its lifecycle as efficiently as possible and effectively from technical points of view. A core aspect is the question, where the data should be stored, since different costs and access times are entailed. For this purpose data have to be classified, which presently is either done manually in an elaborate way, or with recourse to only a few data attributes, in particular access frequency. In the context of Data-Warehouse-Systems this article introduces an automated and therefore speedy and cost-effective data classification for ILM. Machine learning techniques, in particular an artificial neural network (multilayer perceptron), a support vector machine and a decision tree approach are compared on an SAP-based real-world data set from the automotive industry. This data classification considers a large number of data attributes and thus attains similar results akin to human experts. In this comparison of machine learning techniques, besides the accuracy of classification, also the types of misclassification that appear, are included, since this is important in ILM.
Keywords: Information lifecycle management; Machine learning; Computational intelligence; Artificial neural net; Multilayer perceptron; Automatic classification; Data warehouse; Business intelligence (search for similar items in EconPapers)
Date: 2017
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (1)
Downloads: (external link)
http://link.springer.com/10.1007/s10796-016-9680-8 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:spr:infosf:v:19:y:2017:i:5:d:10.1007_s10796-016-9680-8
Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10796
DOI: 10.1007/s10796-016-9680-8
Access Statistics for this article
Information Systems Frontiers is currently edited by Ram Ramesh and Raghav Rao
More articles in Information Systems Frontiers from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().