EconPapers    
Economics at your fingertips  
 

Interpolation of non-random missing values in financial statements’ big data using CatBoost

Shouji Fujimoto (), Takayuki Mizuno () and Atushi Ishikawa ()
Additional contact information
Shouji Fujimoto: Kanazawa Gakuin University
Takayuki Mizuno: National Institute of Informatics
Atushi Ishikawa: Kanazawa Gakuin University

Journal of Computational Social Science, 2022, vol. 5, issue 2, No 7, 1301 pages

Abstract: Abstract Financial statements’ big data have the characteristics of “Incompleteness” and “Nonrepresentative”. In this paper, employing the world’s largest commercial database on finance, ORBIS, we first find that the rate of missing data varies depending on the country, the type and size of financial items, and the year. Using information on missing data, we interpolate non-random missing financial variables from the previous- and/or next-year values of the same financial item, the values of other financial items, and the conditions of missing values determined by CatBoost. Because the distribution of financial values obeys Zipf’s law in the large-scale range and mean and variance diverge, we employ an inverse hyperbolic function to convert the value of a financial item as a target variable. We introduce two types of missing interpolation models according to the two types of situations involving missing objective variables. After verifying the accuracies and stabilities of these models, we describe the properties of firm-scale variables in which non-random missing values are interpolated. In the final stage of this work, we combine these two models. From our observations, we confirm that the range in which Zipf’s law is established becomes wider than before interpolation.

Keywords: Interpolation; Non-random missing; CatBoost; Big data; Firm financials; Machine learning (search for similar items in EconPapers)
Date: 2022
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (2)

Downloads: (external link)
http://link.springer.com/10.1007/s42001-022-00165-9 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:spr:jcsosc:v:5:y:2022:i:2:d:10.1007_s42001-022-00165-9

Ordering information: This journal article can be ordered from
http://www.springer. ... iences/journal/42001

DOI: 10.1007/s42001-022-00165-9

Access Statistics for this article

Journal of Computational Social Science is currently edited by Takashi Kamihigashi

More articles in Journal of Computational Social Science from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().

 
Page updated 2025-03-20
Handle: RePEc:spr:jcsosc:v:5:y:2022:i:2:d:10.1007_s42001-022-00165-9