Deep Learning application for fraud detection in financial statements

Craja, Patricia; Kim, Alisa; Lessmann, Stefan

No 2020-007, IRTG 1792 Discussion Papers from Humboldt University of Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series"

Abstract: Financial statement fraud is an area of significant consternation for potential investors, auditing companies, and state regulators. Intelligent systems facilitate detecting financial statement fraud and assist the decision-making of relevant stakeholders. Previous research detected instances in which financial statements have been fraudulently misrepresented in managerial comments. The paper aims to investigate whether it is possible to develop an enhanced system for detecting financial fraud through the combination of information sourced from financial ratios and managerial comments within corporate annual reports. We employ a hierarchical attention network (HAN) with a long short-term memory (LSTM) encoder to extract text features from the Management Discussion and Analysis (MD&A) section of annual reports. The model is designed to offer two distinct features. First, it reflects the structured hierarchy of documents, which previous models were unable to capture. Second, the model embodies two different attention mechanisms at the word and sentence level, which allows content to be differentiated in terms of its importance in the process of constructing the document representation. As a result of its architecture, the model captures both content and context of managerial comments, which serve as supplementary predictors to financial ratios in the detection of fraudulent reporting. Additionally, the model provides interpretable indicators denoted as “red-flag” sentences, which assist stakeholders in their process of determining whether further investigation of a specific annual report is required. Empirical results demonstrate that textual features of MD&A sections extracted by HAN yield promising classification results and substantially reinforce financial ratios.
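The two-level attention described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes word vectors are already available, omits the LSTM encoder entirely, and replaces the learned attention layers with a plain dot product against a context vector. All names (`attention_pool`, `encode_document`, `red_flag_sentence`, `combine_features`) and all numbers are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors: Sequence[Vector], context: Vector) -> Tuple[Vector, List[float]]:
    """Score each vector against a context vector (dot product, a
    simplifying assumption) and return the attention-weighted sum
    together with the attention weights."""
    scores = [sum(a * b for a, b in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def encode_document(
    document: Sequence[Sequence[Vector]],
    word_context: Vector,
    sentence_context: Vector,
) -> Tuple[Vector, List[float]]:
    """Word-level attention builds one vector per sentence;
    sentence-level attention builds the document vector."""
    sentence_vectors = [attention_pool(s, word_context)[0] for s in document]
    return attention_pool(sentence_vectors, sentence_context)


def red_flag_sentence(sentence_weights: Sequence[float]) -> int:
    """Index of the sentence that received the highest attention weight."""
    return max(range(len(sentence_weights)), key=lambda i: sentence_weights[i])


def combine_features(doc_vector: Sequence[float], financial_ratios: Sequence[float]) -> List[float]:
    """Concatenate text features with financial ratios for a downstream classifier."""
    return list(doc_vector) + list(financial_ratios)


# Toy document: two sentences, two words each, 2-dimensional word vectors.
document = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 2.0], [3.0, 1.0]],
]
doc_vector, sentence_weights = encode_document(
    document, word_context=[1.0, 0.0], sentence_context=[0.5, 0.5]
)
features = combine_features(doc_vector, financial_ratios=[0.12, 1.8])
flagged = red_flag_sentence(sentence_weights)
```

In this sketch the sentence-level weights play the role of the interpretable indicator: the sentence with the largest weight is the one surfaced for review, while the concatenated `features` vector stands in for the combined text and ratio input to a classifier.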

Keywords: fraud detection; financial statements; deep learning; text analytics
JEL-codes: C00
Date: 2020
New Economics Papers: this item is included in nep-acc, nep-big and nep-fmk
Citations: 12 (EconPapers)

Downloads:
https://www.econstor.eu/bitstream/10419/230813/1/irtg1792dp2020-007.pdf (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:zbw:irtgdp:2020007

Bibliographic data for series maintained by ZBW - Leibniz Information Centre for Economics.