Accounting fraud detection using contextual language learning

Open Access
Authors
Publication date 06-2024
Journal International Journal of Accounting Information Systems
Article number 100682
Volume | Issue number 53
Number of pages 17
Organisations
  • Faculty of Economics and Business (FEB) - Amsterdam Business School Research Institute (ABS-RI)
Abstract
Accounting fraud is a widespread problem that causes significant damage in the economic market. Detection and investigation of fraudulent firms require a large amount of time, money, and effort for corporate monitors and regulators. In this study, we explore how textual contents from financial reports help in detecting accounting fraud. Pre-trained contextual language learning models, such as BERT, have significantly advanced natural language processing in recent years. We fine-tune the BERT model on Management Discussion and Analysis (MD&A) sections of annual 10-K reports from the Securities and Exchange Commission (SEC) database. Our final model outperforms the textual benchmark model and the quantitative benchmark model from the previous literature by 15% and 12%, respectively. Further, our model identifies five times more fraudulent firm-year observations than the textual benchmark by investigating the same number of firms, and three times more than the quantitative benchmark. Optimizing this investigation process, where more fraudulent observations are detected in the same size of the investigation sample, would be of great economic significance for regulators, investors, financial analysts, and auditors.

Document type Article
Note With supplementary file
Language English
Published at https://doi.org/10.1016/j.accinf.2024.100682
Downloads
1-s2.0-S1467089524000150-main (Final published version)
Supplementary materials
Permalink to this page
Back