Principal Component Analysis in Financial Data Science
Stefana Janicijevic,
Vule Mizdrakovic and
Maja Kljajic
A chapter in Advances in Principal Component Analysis from IntechOpen
Abstract:
Numerous methods exist aimed at examining patterns in structured and unstructured financial data. Applications of these methods include fraud detection, risk management, credit allocation, assessment of the risk of default, customer analytics, trading prediction, and many others, creating a broad field of research named Financial data science. A problem within the field that remains significantly under-researched, yet very important, is that of differentiating between the three major types of business activities--merchandising, manufacturing, and service based on the structured data available in financial reports. It can be argued that, due to the inherent idiosyncrasies of the three types of business activities, methods for assessment of the risk of default, methods for credit allocation, and methods for fraud detection would all see an improved performance if reliable information on the percentage of entities' business activities allocated to the three major activities would be available. To this end, in this paper, we propose a clustering procedure that relies on Principal Component Analysis (PCA) for dimensionality reduction and feature selection. The procedure is presented using a large empirical data set comprising complete financial reports for various business entities operating in the Republic in Serbia, that pertain to the reporting period 2019.
Keywords: data science; principal component analysis; random forest algorithm; financial data; financial reporting (search for similar items in EconPapers)
JEL-codes: C10 (search for similar items in EconPapers)
References: Add references at CitEc
Citations:
Downloads: (external link)
https://www.intechopen.com/chapters/80983 (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:ito:pchaps:257186
DOI: 10.5772/intechopen.102928
Access Statistics for this chapter
More chapters in Chapters from IntechOpen
Bibliographic data for series maintained by Slobodan Momcilovic ().