Quasar Identification Using Multivariate Probability Density Estimated from Nonparametric Conditional Probabilities
Jenny Farmer, Eve Allen and Donald J. Jacobs
Additional contact information
Jenny Farmer: Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
Eve Allen: Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
Donald J. Jacobs: Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
Mathematics, 2022, vol. 11, issue 1, 1-19
Abstract:
Nonparametric estimation for a probability density function that describes multivariate data has typically been addressed by kernel density estimation (KDE). A novel density estimator recently developed by Farmer and Jacobs offers an alternative high-throughput automated approach to univariate nonparametric density estimation based on maximum entropy and order statistics, improving accuracy over univariate KDE. This article presents an extension of the single variable case to multiple variables. The univariate estimator is used to recursively calculate a product array of one-dimensional conditional probabilities. In combination with interpolation methods, a complete joint probability density estimate is generated for multiple variables. Good accuracy and speed performance in synthetic data are demonstrated by a numerical study using known distributions over a range of sample sizes from 100 to 10^6 for two to six variables. Performance in terms of speed and accuracy is compared to KDE. The multivariate density estimate developed here tends to perform better as the number of samples and/or variables increases. As an example application, measurements are analyzed over five filters of photometric data from the Sloan Digital Sky Survey Data Release 17. The multivariate estimation is used to form the basis for a binary classifier that distinguishes quasars from galaxies and stars with up to 94% accuracy.
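The idea of building a joint density from one-dimensional conditional densities can be illustrated with a minimal two-variable sketch of the chain rule p(x, y) = p(x) p(y | x). This is not the authors' implementation: a plain normalized histogram stands in for the maximum-entropy univariate estimator, no interpolation is performed, and the names `hist_density` and `joint_density_2d` are hypothetical helpers introduced only for illustration.

```python
import random


def hist_density(samples, lo, width, bins):
    """Stand-in univariate estimator (assumption): a normalized histogram.

    Returns one density value per bin; the values integrate to 1 over the bins.
    """
    counts = [0] * bins
    for s in samples:
        # Clamp so that values on the boundaries land in a valid bin.
        k = min(max(int((s - lo) / width), 0), bins - 1)
        counts[k] += 1
    n = len(samples)
    return [c / (n * width) for c in counts]


def joint_density_2d(xs, ys, bins=10):
    """Estimate p(x, y) on a grid as p(x) * p(y | x in slice).

    Assumes xs and ys each span a non-degenerate range (max > min).
    """
    x_lo, y_lo = min(xs), min(ys)
    x_width = (max(xs) - x_lo) / bins
    y_width = (max(ys) - y_lo) / bins

    # Marginal density of the first variable.
    p_x = hist_density(xs, x_lo, x_width, bins)

    # Group the second variable by the slice of x it falls into.
    slices = [[] for _ in range(bins)]
    for x, y in zip(xs, ys):
        k = min(max(int((x - x_lo) / x_width), 0), bins - 1)
        slices[k].append(y)

    grid = []
    for k in range(bins):
        if slices[k]:
            # One-dimensional conditional density within this slice of x.
            p_y_given_x = hist_density(slices[k], y_lo, y_width, bins)
        else:
            p_y_given_x = [0.0] * bins
        grid.append([p_x[k] * q for q in p_y_given_x])
    return grid, x_width, y_width


if __name__ == "__main__":
    random.seed(0)
    xs = [random.gauss(0.0, 1.0) for _ in range(2000)]
    ys = [0.5 * x + random.gauss(0.0, 1.0) for x in xs]
    grid, xw, yw = joint_density_2d(xs, ys, bins=8)
    # The grid should integrate to approximately 1.
    print(sum(sum(row) for row in grid) * xw * yw)
```

With more than two variables, the same pattern would be applied recursively, conditioning each new variable on slices of the previous ones. With a histogram as the stand-in, the product reduces to an ordinary two-dimensional histogram, so the sketch shows only the factorization and not the accuracy gains reported in the abstract.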
Keywords: nonparametric multivariate density estimation; conditional probability product array; maximum entropy method; ordered statistics; probability density based binary classification; ROC; SDSS-DR17; quasar; galaxies; stars
JEL-codes: C
Date: 2022
Downloads:
https://www.mdpi.com/2227-7390/11/1/155/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/1/155/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2022:i:1:p:155-:d:1017989
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.