Nonparametric Multivariate Density Estimation: Case Study of Cauchy Mixture Model
Tomas Ruzgas,
Mantas Lukauskas and
Gedmantas Čepkauskas
Additional contact information
Tomas Ruzgas: Department of Applied Mathematics, Faculty of Mathematics and Natural Sciences, Kaunas University of Technology, 44249 Kaunas, Lithuania
Mantas Lukauskas: Department of Applied Mathematics, Faculty of Mathematics and Natural Sciences, Kaunas University of Technology, 44249 Kaunas, Lithuania
Gedmantas Čepkauskas: Department of Applied Mathematics, Faculty of Mathematics and Natural Sciences, Kaunas University of Technology, 44249 Kaunas, Lithuania
Mathematics, 2021, vol. 9, issue 21, 1-22
Abstract:
Estimation of probability density functions (pdf) is an essential part of statistical modelling. Heteroskedasticity and outliers both make data analysis harder, and the Cauchy mixture model helps to cover both problems. This paper studies five different types of nonparametric multivariate density estimation techniques, algorithmically and empirically, without assuming that the data originate from any known parametric family of distributions. The inversion formula method is applied with a noise cluster included in the general mixture model. The effectiveness of the method is demonstrated through a simulation study. The relationship between estimation accuracy and the complexity of multidimensional Cauchy mixture models (CMM) is analyzed using the Monte Carlo method. For larger dimensions (d ~ 5) and small samples (n ~ 50), the adaptive kernel method is recommended. For samples of size n ~ 100, the modified inversion formula (MIDE) is recommended. For larger samples, semi-parametric kernel estimation is preferable when the distributions overlap, and the modified inversion method is preferable when the distributions are more isolated. With respect to the mean absolute percentage error, semi-parametric kernel estimation is recommended when the sample has overlapping distributions. In smaller dimensions (d = 2), the semi-parametric kernel method (SKDE) is recommended for samples with overlapping distributions, and the modified inversion formula (MIDE) for isolated distributions. The results of the inversion formula algorithm improve significantly when the noise cluster is included.
Keywords: Cauchy mixture model; nonparametric density estimation; density estimation algorithms; adapted kernel density estimate; logspline estimation
JEL-codes: C
Date: 2021
Downloads:
https://www.mdpi.com/2227-7390/9/21/2717/pdf (application/pdf)
https://www.mdpi.com/2227-7390/9/21/2717/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:9:y:2021:i:21:p:2717-:d:665127
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.