Classification error in multiclass discrimination from Markov data

Sören Christensen, Albrecht Irle and Lars Willert
Additional contact information
Sören Christensen: Chalmers University of Technology and Göteborg University
Albrecht Irle: University of Kiel
Lars Willert: University of Kiel

Statistical Inference for Stochastic Processes, 2016, vol. 19, issue 3, No 3, 336 pages

Abstract: As a model for an on-line classification setting we consider a stochastic process $(X_{-n},Y_{-n})_{n}$, the present time-point being denoted by 0, with observables $\ldots, X_{-n}, X_{-n+1}, \ldots, X_{-1}, X_0$ from which the pattern $Y_0$ is to be inferred. In this classification setting, in addition to the present observation $X_0$, a number $l$ of preceding observations may be used for classification, thus taking into account a possible dependence structure as it occurs, e.g., in the ongoing classification of handwritten characters. We treat the question of how the performance of classifiers is improved by using such additional information. For our analysis, a hidden Markov model is used. Letting $R_l$ denote the minimal risk of misclassification using $l$ preceding observations, we show that the difference $\sup_k |R_l - R_{l+k}|$ decreases exponentially fast as $l$ increases. This suggests that a small $l$ might already lead to a noticeable improvement. To pursue this point we look at the use of past observations for kernel classification rules. Our practical findings in simulated hidden Markov models and in the classification of handwritten characters indicate that using $l=1$, i.e. just the last preceding observation in addition to $X_0$, can lead to a substantial reduction of the risk of misclassification. So, in the presence of stochastic dependencies, we advocate using $X_{-1}, X_0$ for inferring the pattern $Y_0$ instead of only $X_0$, as one would in the independent situation.

Keywords: Optimal classification; Asymptotic risk; Hidden Markov model
Date: 2016

Downloads (external link): http://link.springer.com/10.1007/s11203-015-9129-6 (abstract, text/html)
Access to the full text of the articles in this series is restricted.


Persistent link: https://EconPapers.repec.org/RePEc:spr:sistpr:v:19:y:2016:i:3:d:10.1007_s11203-015-9129-6

Ordering information: This journal article can be ordered from
http://www.springer. ... ty/journal/11203/PS2

DOI: 10.1007/s11203-015-9129-6


Statistical Inference for Stochastic Processes is currently edited by Denis Bosq, Yury A. Kutoyants and Marc Hallin

More articles in Statistical Inference for Stochastic Processes from Springer
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Handle: RePEc:spr:sistpr:v:19:y:2016:i:3:d:10.1007_s11203-015-9129-6