Information Extraction from Invoices: A Graph Neural Network Approach for Datasets with High Layout Variety

Felix Krieger, Paul Drews, Burkhardt Funk and Till Wobbe
Additional contact information
Felix Krieger: Leuphana Universität
Paul Drews: Leuphana Universität
Burkhardt Funk: Leuphana Universität
Till Wobbe: EY, GSA Assurance Research and Development

A chapter in Innovation Through Information Systems, 2021, pp 5-20 from Springer

Abstract: Extracting information from invoices is a highly structured, recurrent task in auditing. Automating this task would yield efficiency improvements, while simultaneously improving audit quality. The challenge for this endeavor is to account for the text layout on invoices and the high variety of layouts across different issuers. Recent research has proposed graphs to structurally represent the layout on invoices and to apply graph convolutional networks to extract the information pieces of interest. However, the effectiveness of graph-based approaches has so far been shown only on datasets with a low variety of invoice layouts. In this paper, we introduce a graph-based approach to information extraction from invoices and apply it to a dataset of invoices from multiple vendors. We show that our proposed model extracts the specified key items from a highly diverse set of invoices with a macro $F_1$ score of 0.8753.

Keywords: Graph attention networks; Unstructured data; Audit digitization; Graph-based machine learning
Date: 2021

Persistent link: https://EconPapers.repec.org/RePEc:spr:lnichp:978-3-030-86797-3_1

Ordering information: This item can be ordered from
http://www.springer.com/9783030867973

DOI: 10.1007/978-3-030-86797-3_1

More chapters in Lecture Notes in Information Systems and Organization from Springer
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-04-01
Handle: RePEc:spr:lnichp:978-3-030-86797-3_1