
Semantic Indexing of 19th-Century Greek Literature Using 21st-Century Linguistic Resources

Dimitris Dimitriadis, Sofia Zapounidou and Grigorios Tsoumakas
Additional contact information
Dimitris Dimitriadis: School of Informatics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
Sofia Zapounidou: Library and Information Centre, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
Grigorios Tsoumakas: School of Informatics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece

Sustainability, 2021, vol. 13, issue 16, 1-16

Abstract: Manually classifying works of literature with genre/form concepts is time-consuming and requires domain expertise. Automated systems based on language understanding can help humans complete this work faster and more consistently. To this end, we present a case study on the automatic classification of 19th-century Greek literature books. The main challenges are the limited number of literature books and resources from that period and the quality of the source text. We propose an automated classification system based on the Bidirectional Encoder Representations from Transformers (BERT) model, trained on books from the 20th and 21st centuries. We also address BERT’s constraint on the maximum input sequence length by using the TextRank algorithm to construct representative sentences or phrases from each book. The results show that BERT trained on recent literature books correctly classifies most of the 19th-century books despite the disparity between the two collections. Additionally, the TextRank algorithm improves the performance of BERT.
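
The abstract does not specify how TextRank output is fitted to BERT's input limit, so the following is only a minimal sketch of the general idea, not the authors' implementation. It ranks sentences with a simplified TextRank and keeps the highest-ranked ones that fit a length budget. The function names (`split_sentences`, `similarity`, `textrank`, `select_representative`), the naive sentence splitter, the use of whitespace-separated words as a stand-in for BERT subword tokens, and the default budget of 512 are all assumptions made for illustration.

```python
import math
import re


def split_sentences(text):
    """Naive splitter (assumption): break after ., !, ? or the Greek question mark ';'."""
    parts = re.split(r"(?<=[.!?;])\s+", text.strip())
    return [p for p in parts if p]


def similarity(a, b):
    """Word overlap normalized by log lengths, computed on sets of unique words."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # avoids log(1) + log(1) == 0 in the denominator
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by weighted PageRank over the sentence-similarity graph."""
    n = len(sentences)
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping)
            + damping
            * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def select_representative(text, max_tokens=512):
    """Keep top-ranked sentences within the budget, restored to document order."""
    sentences = split_sentences(text)
    scores = textrank(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen, used = [], 0
    for i in ranked:
        length = len(sentences[i].split())  # word count as a rough token proxy
        if used + length <= max_tokens:
            chosen.append(i)
            used += length
    return " ".join(sentences[i] for i in sorted(chosen))
```

In practice the budget would be checked against the tokenizer of the BERT model actually used, since subword tokenization usually yields more tokens than whitespace splitting, and some positions are reserved for special tokens.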

Keywords: semantic indexing; text classification; Greek literature; TextRank; BERT
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2021

Downloads:
https://www.mdpi.com/2071-1050/13/16/8878/pdf (application/pdf)
https://www.mdpi.com/2071-1050/13/16/8878/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:13:y:2021:i:16:p:8878-:d:610776

Sustainability is currently edited by Ms. Alexandra Wu

More articles in Sustainability from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jsusta:v:13:y:2021:i:16:p:8878-:d:610776