Corporate fraud detection based on linguistic readability vector: Application to financial companies in China

Yi Zhang, Tianxiang Liu and Weiping Li

International Review of Financial Analysis, 2024, vol. 95, issue PB

Abstract: Existing research on corporate fraud identification mainly builds models from text data disclosed by companies. However, semantic information is lost when that text is vectorized with natural language processing methods. Drawing on the linguistic features of Chinese text, we construct new readability indices at the character, word, sentence, and paragraph levels and combine them to define, for the first time, a linguistic readability vector for Chinese text. Our sample consists of A-share financial-industry companies listed on the Shanghai and Shenzhen stock exchanges from 2005 to 2019. We use the natural language processing method Word2Vec to vectorize the management's discussion and analysis (MD&A) sections of their annual reports, then build fraud identification models with machine learning algorithms, using the readability vector to supplement the MD&A vectors with semantic information. The empirical results show that all three types of machine learning models perform better once the readability vector is added. The support vector machine improves the most, with gains of 31.17%, 2.56%, 26.33%, and 2.45% in accuracy, recall, F1-score, and AUC, respectively. The approach thus enriches the semantic interpretation of Chinese annual reports and improves the empirical effectiveness of fraud identification models.
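The abstract does not define the four readability indices, so the following is only a minimal sketch of the general idea: compute one number per linguistic level and append the resulting vector to a document embedding. The four proxies used here (distinct-character ratio, mean characters per word, mean characters per sentence, mean sentences per paragraph) and the function names `readability_vector` and `augment` are illustrative assumptions, not the authors' definitions. Word segmentation is assumed to come from an external Chinese segmenter, and the three-number embedding stands in for a real Word2Vec document vector.

```python
import re

# Hypothetical proxies for the four readability levels; the paper's actual
# index definitions are not given in the abstract.
SENTENCE_END = re.compile(r"[。！？!?]+")


def split_sentences(paragraph):
    """Split a paragraph on Chinese or Western sentence-final punctuation."""
    return [s.strip() for s in SENTENCE_END.split(paragraph) if s.strip()]


def readability_vector(paragraphs, tokenized_sentences):
    """Return [character, word, sentence, paragraph] level proxies.

    paragraphs: list of raw paragraph strings.
    tokenized_sentences: list of word lists, assumed to be produced by an
    external Chinese word segmenter (not included here).
    """
    sentences = [s for p in paragraphs for s in split_sentences(p)]
    chars = [c for s in sentences for c in s if not c.isspace()]
    words = [w for sent in tokenized_sentences for w in sent]

    char_level = len(set(chars)) / len(chars)             # distinct-character ratio
    word_level = sum(len(w) for w in words) / len(words)  # mean characters per word
    sent_level = len(chars) / len(sentences)              # mean characters per sentence
    para_level = len(sentences) / len(paragraphs)         # mean sentences per paragraph
    return [char_level, word_level, sent_level, para_level]


def augment(embedding, readability):
    """Concatenate a document embedding with the readability vector."""
    return list(embedding) + list(readability)


# Toy MD&A excerpt: two paragraphs, three sentences, pre-segmented words.
paragraphs = ["公司业绩稳定。风险可控。", "未来发展良好。"]
tokens = [["公司", "业绩", "稳定"], ["风险", "可控"], ["未来", "发展", "良好"]]

vec = readability_vector(paragraphs, tokens)
# Toy 3-dimensional stand-in for a Word2Vec document vector.
features = augment([0.1, -0.2, 0.3], vec)
```

The concatenated `features` list is the kind of input a downstream classifier such as a support vector machine would receive; in practice the readability components would likely need scaling so they do not dominate or vanish next to the embedding dimensions.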

Keywords: Text readability vector; Machine learning; Word2vec; Fraud identification; Auditing
Date: 2024
Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S1057521924003375
Full text for ScienceDirect subscribers only

Persistent link: https://EconPapers.repec.org/RePEc:eee:finana:v:95:y:2024:i:pb:s1057521924003375

DOI: 10.1016/j.irfa.2024.103405

International Review of Financial Analysis is currently edited by B.M. Lucey

Bibliographic data for series maintained by Catherine Liu.

 
Page updated 2025-03-19
Handle: RePEc:eee:finana:v:95:y:2024:i:pb:s1057521924003375