LINEAR DISCRIMINANT RULES for HIGH-DIMENSIONAL CORRELATED DATA: ASYMPTOTIC and FINITE SAMPLE RESULTS
Pedro Duarte Silva
Additional contact information
Pedro Duarte Silva: Faculdade de Economia e Gestão, Universidade Católica Portuguesa - Porto
No 9, Working Papers de Gestão (Management Working Papers) from Católica Porto Business School, Universidade Católica Portuguesa
Abstract:
A new class of linear discrimination rules, designed for problems with many correlated variables, is proposed. This proposal tries to incorporate the most important patterns revealed by the empirical correlations and accurately approximate the optimal Bayes rule as the number of variables increases. In order to achieve this goal, the new rules rely on covariance matrix estimates derived from Gaussian factor models with small intrinsic dimensionality. Asymptotic results, based on an analysis that allows the number of variables to grow faster than the number of observations, show that the worst possible expected error rate of the proposed rules converges to the error of the optimal Bayes rule when the postulated model is true, and to a slightly larger constant when this model is a close approximation to the data generating process. Simulation results suggest that, in the data conditions they were designed for, the new rules can clearly outperform both Fisher's and naive linear discriminant rules.
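As a rough illustration of the idea in the abstract, the sketch below builds a two-group linear rule from a low-dimensional factor-model covariance estimate (loadings plus a diagonal of specific variances). It is not the estimator from the paper: the loadings are taken from the leading principal components of the pooled within-group data as a simple stand-in, equal prior probabilities are assumed, and the names fit_factor_lda, woodbury_solve and predict, the simulated data, and the choice q=1 are all hypothetical.

```python
import numpy as np

def woodbury_solve(B, psi, d):
    """Return (B B^T + diag(psi))^{-1} d using only a q x q linear solve."""
    Dinv_d = d / psi
    Dinv_B = B / psi[:, None]
    small = np.eye(B.shape[1]) + B.T @ Dinv_B
    return Dinv_d - Dinv_B @ np.linalg.solve(small, B.T @ Dinv_d)

def fit_factor_lda(X0, X1, q=1, eps=1e-6):
    """Fit a linear rule with a q-factor covariance estimate (illustrative stand-in)."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    Xc = np.vstack([X0 - mu0, X1 - mu1])          # within-group centering
    dof = Xc.shape[0] - 2
    # SVD of the centered data avoids forming the p x p sample covariance.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    B = Vt[:q].T * (s[:q] / np.sqrt(dof))         # p x q loadings
    sample_var = (Xc ** 2).sum(axis=0) / dof
    # Specific variances: what the q factors leave unexplained, floored at eps.
    psi = np.maximum(sample_var - (B ** 2).sum(axis=1), eps)
    w = woodbury_solve(B, psi, mu1 - mu0)         # discriminant direction
    return w, (mu0 + mu1) / 2.0

def predict(w, midpoint, X):
    """Assign group 1 when the linear score is positive (equal priors assumed)."""
    return ((X - midpoint) @ w > 0).astype(int)

# Simulated example: more variables (p) than training observations (2 * n).
rng = np.random.default_rng(0)
p, n = 50, 20
loadings = rng.normal(size=p)                     # one common factor

def sample(mean, size):
    f = rng.normal(size=(size, 1))
    return mean + f * loadings + rng.normal(size=(size, p))

X0, X1 = sample(np.zeros(p), n), sample(np.full(p, 2.0), n)
w, midpoint = fit_factor_lda(X0, X1, q=1)

X_test = np.vstack([sample(np.zeros(p), 200), sample(np.full(p, 2.0), 200)])
y_test = np.repeat([0, 1], 200)
accuracy = (predict(w, midpoint, X_test) == y_test).mean()
print(f"test accuracy: {accuracy:.3f}")
```

The Woodbury identity keeps the cost of applying the inverse covariance linear in the number of variables for a fixed number of factors, which is one reason a factor structure is convenient when the variables outnumber the observations.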
Keywords: Discriminant Analysis; High Dimensionality; Expected Misclassification Rate; Min-Max Regret
Pages: 17 pages
Date: 2009-05
New Economics Papers: this item is included in nep-ecm
Downloads: (external link)
http://www.feg.porto.ucp.pt/docentes/repec/WP/0920 ... riminant%20rules.pdf First version (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:cap:mpaper:092009
Bibliographic data for series maintained by Ricardo Goncalves.