Classification
Vladimir Shikhman (vladimir.shikhman@mathematik.tu-chemnitz.de) and
David Müller (david.mueller@mathematik.tu-chemnitz.de)
Additional contact information
Vladimir Shikhman: Chemnitz University of Technology
David Müller: Chemnitz University of Technology
Chapter 4 in Mathematical Foundations of Big Data Analytics, 2021, pp 63-85 from Springer
Abstract:
Classification is a process by which new objects, events, people, or experiences are assigned to a class on the basis of characteristics shared by members of the same class and features distinguishing the members of one class from those of another. In the context of data science, it is often necessary to categorize new, unlabeled information based on its relevance to known, labeled data. Typical applications include the credit investigation of a potential client in the presence of current or previous clients with disclosed financial history. Another important application is analytical quality control. Here, a decision has to be made whether a patient is likely to be virus-infected by comparing the patient's own test results with those of other patients. In this chapter, we use linear classifiers to assign a newcomer to a particular class. This assignment depends on whether the corresponding performance of the newcomer exceeds a certain bound. Three types of linear classifiers are discussed. First, we introduce the statistically motivated Fisher's discriminant. It maximizes the sample variance between the classes and minimizes the variance of data within the classes. The computation of Fisher's discriminant leads to a nicely structured eigenvalue problem. Second, the celebrated support-vector machine is studied. It is geometrically motivated and maximizes the margin between two classes. The detection of an optimal separating hyperplane is based on convex duality. Third, the naïve Bayes classifier is derived. Rooted in the application of Bayes' theorem, it is of probabilistic origin. Namely, the Bernoulli probabilities of an assignment to one or another class, conditioned on the observed data, are compared.
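As an illustration of the first classifier mentioned in the abstract, the following is a minimal Python sketch of a two-class Fisher's discriminant in two dimensions. It is not taken from the chapter. It relies on the standard fact that, for two classes, the eigenvalue problem reduces to a direction proportional to S_w^{-1}(m1 - m0), where S_w is the within-class scatter matrix and m0, m1 are the class means. The sample data, the function names, and the choice of the bound as the midpoint of the projected class means are all assumptions made for illustration.

```python
# Hypothetical toy data: two labeled classes of 2-D points (not from the chapter).
CLASS0 = [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (3.0, 2.0)]
CLASS1 = [(6.0, 6.0), (7.0, 5.0), (7.0, 7.0), (8.0, 6.0)]


def dot(a, b):
    """Inner product of two 2-D vectors."""
    return a[0] * b[0] + a[1] * b[1]


def mean(points):
    """Component-wise sample mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]


def scatter(points, m):
    """2x2 scatter matrix: sum of (x - m)(x - m)^T over the points."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = (p[0] - m[0], p[1] - m[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_discriminant(class0, class1):
    """Return (w, bound) for the linear rule: assign class 1 if w.x > bound."""
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    # Within-class scatter S_w = S_0 + S_1.
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    diff = (m1[0] - m0[0], m1[1] - m0[1])
    # w = S_w^{-1} (m1 - m0), using the explicit inverse of a 2x2 matrix.
    w = [
        (sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
        (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det,
    ]
    # Assumption: the bound is the midpoint of the projected class means.
    bound = 0.5 * (dot(w, m0) + dot(w, m1))
    return w, bound


def classify(x, w, bound):
    """Assign a newcomer x to class 1 if its score w.x exceeds the bound."""
    return 1 if dot(w, x) > bound else 0


w, bound = fisher_discriminant(CLASS0, CLASS1)
print(classify((2.5, 2.5), w, bound))  # newcomer close to CLASS0
print(classify((6.5, 6.0), w, bound))  # newcomer close to CLASS1
```

The score w.x plays the role of the newcomer's "performance" in the abstract, and the class is decided by whether that score exceeds the bound. Other choices of bound, for example ones weighted by class sizes, are equally possible.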
Date: 2021
Persistent link: https://EconPapers.repec.org/RePEc:spr:sprchp:978-3-662-62521-7_4
Ordering information: This item can be ordered from
http://www.springer.com/9783662625217
DOI: 10.1007/978-3-662-62521-7_4