Data Preprocessing
Christo El Morr,
Manar Jammal,
Hossam Ali-Hassan and
Walid El-Hallak
Additional contact information
Christo El Morr: York University
Manar Jammal: York University
Hossam Ali-Hassan: York University, Glendon Campus
Walid El-Hallak: Ontario Health
Chapter 4 in Machine Learning for Practical Decision Making, 2022, pp 117-163 from Springer
Abstract:
Preprocessing, also known as data preparation, is the practice of cleaning, altering, and reorganizing raw data prior to processing and analysis [1]. It is an important step that usually entails reformatting, adjusting, and integrating datasets to improve the information contained within them. Although data preprocessing can be onerous, it is a necessary precondition for putting data into context and reducing the possibility of bias [2]. An Aberdeen Group study describes data preprocessing as any activity undertaken to improve the quality, usability, accessibility, and portability of data [3]. In a poll published in Forbes, data scientists reported spending 60% of their time on data preprocessing (Fig. 4.1).
Date: 2022
Persistent link: https://EconPapers.repec.org/RePEc:spr:isochp:978-3-031-16990-8_4
Ordering information: This item can be ordered from
http://www.springer.com/9783031169908
DOI: 10.1007/978-3-031-16990-8_4
Series: International Series in Operations Research & Management Science, Springer
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.