Data mining journal entries for fraud detection: An exploratory study
Roger S. Debreceny and Glen L. Gray
International Journal of Accounting Information Systems, 2010, vol. 11, issue 3, 157-181
Abstract:
Fraud detection has become a critical component of financial audits, and audit standards have heightened emphasis on journal entries as part of fraud detection. This paper canvasses perspectives on applying data mining techniques to journal entries. In the past, the impediment to researching journal entry data mining has been getting access to journal entry data sets, which may explain why the published research in this area is a null set. For this project, we had access to journal entry data sets for 29 different organizations. Our initial exploratory test of the data sets had interesting preliminary findings. (1) For all 29 entities, the distribution of first digits of journal dollar amounts differed from that expected by Benford's Law. (2) Regarding last digits, unlike first digits, which are expected to have a logarithmic distribution, the last digits would be expected to have a uniform distribution. Our test found that the distribution was not uniform for many of the entities. In fact, eight entities had one number whose frequency was three times more than expected. (3) We compared the number of accounts related to the top five most frequently occurring combinations of the last three digits. Four entities had a very high occurrence of the most frequent three-digit combinations that involved only a small set of accounts, one entity had a low occurrence of the most frequent three-digit combination that involved a large set of accounts, and 24 had a low occurrence of the most frequent three-digit combinations that involved a small set of accounts. In general, the first four entities would probably pose the highest risk of fraud, because this pattern could indicate that a fraudster is covering up or falsifying a particular class of transactions. In the future, we will apply more data mining techniques to discover other patterns and relationships in the data sets. We also want to seed the data sets with fraud indicators (e.g., pairs of accounts that would not be expected in a journal entry) and compare the sensitivity of the different data mining techniques in finding these seeded indicators.
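The three digit tests named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which test statistic the authors used, so the chi-square goodness-of-fit statistic here is an assumption, as are the function names, the choice to represent amounts as integer cents, and the synthetic sample amounts. None of this is taken from the paper.

```python
import math
from collections import Counter

# 5% critical values of the chi-square distribution:
# 8 degrees of freedom for nine first digits, 9 for ten last digits.
CRITICAL_FIRST_DIGIT = 15.507
CRITICAL_LAST_DIGIT = 16.919


def benford_expected(d):
    """Benford's Law probability that the leading digit equals d (1..9)."""
    return math.log10(1 + 1 / d)


def chi_square(observed, expected):
    """Pearson chi-square statistic for matching lists of counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def first_digit_chi2(cents):
    """Compare leading digits of non-zero amounts with Benford's Law."""
    values = [abs(c) for c in cents if c != 0]
    counts = Counter(int(str(v)[0]) for v in values)
    n = len(values)
    observed = [counts.get(d, 0) for d in range(1, 10)]
    expected = [n * benford_expected(d) for d in range(1, 10)]
    return chi_square(observed, expected)


def last_digit_chi2(cents):
    """Compare last digits of non-zero amounts with a uniform distribution."""
    values = [abs(c) for c in cents if c != 0]
    counts = Counter(v % 10 for v in values)
    n = len(values)
    observed = [counts.get(d, 0) for d in range(10)]
    expected = [n / 10] * 10
    return chi_square(observed, expected)


def top_last_three(cents, k=5):
    """Return the k most frequent last-three-digit combinations with counts."""
    counts = Counter(abs(c) % 1000 for c in cents if c != 0)
    return counts.most_common(k)


if __name__ == "__main__":
    # Synthetic amounts in cents, for illustration only (not the paper's data).
    sample = [123450, 99999, 250000, 4999, 18765, 99999, 310000, 7250, 99999]
    print("first-digit chi2:", round(first_digit_chi2(sample), 3))
    print("last-digit chi2:", round(last_digit_chi2(sample), 3))
    print("top last-three-digit combinations:", top_last_three(sample))
```

A statistic above the corresponding critical value would suggest that the digit distribution departs from the expected one at the 5% level. With very large journal entry files the chi-square statistic flags even tiny deviations, so a different measure may be preferable in practice.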
Keywords: Fraud; Journal entries; Data mining; Auditing; Accounting information systems; XBRL GL
Date: 2010
Citations: 19 (in EconPapers)
Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S1467089510000540
Full text for ScienceDirect subscribers only
Persistent link: https://EconPapers.repec.org/RePEc:eee:ijoais:v:11:y:2010:i:3:p:157-181
DOI: 10.1016/j.accinf.2010.08.001
International Journal of Accounting Information Systems is currently edited by S.V. Grabski
Bibliographic data for series maintained by Catherine Liu.