EconPapers    
Economics at your fingertips  
 

Part of Speech Tagging Using Part of Speech Sequence Graph

Pejman Gholami-Dastgerdi () and Mohammad-Reza Feizi-Derakhshi ()
Additional contact information
Pejman Gholami-Dastgerdi: University of Tabriz
Mohammad-Reza Feizi-Derakhshi: University of Tabriz

Annals of Data Science, 2023, vol. 10, issue 5, No 7, 1328 pages

Abstract: Abstract Part of speech tagging is one of the most fundamental needs of intelligent text processing, which is assigning the most appropriate grammatical category to each word on the text. Hence, provision of a tagger with high accuracy for the Persian language is the major priority of this article. Numerous other methods of POS tagging have already been presented in a way that each one has been applied in taggers to achieve high performance and accuracy. Statistical methods known as a primary technique and one of the most important issues in POS tagging systems is identifying unknown words. This paper investigates all tags that the Maximum Likelihood Estimation method assigns the words existing in the text (including known and unknown) by proposing a graph-based method and correcting them. To do so, a graph is created from the training corpus including the part of speech sequence in the sentences. Then, sentences tagged with Maximum Likelihood Estimation will be corrected by traversing the graph. It should be noted that different methods have been proposed, implemented, and evaluated for tagging using graphs. Next, by investigating pros and cons, a method is proposed which tags the unknown words with the accuracy of 86.84% and the known words with the accuracy of 97.54%. In conclusion, the overall accuracy of the method is calculated as 96.78%, which is an improvement in comparison to the Maximum Likelihood Estimation method and consequently, the graph method shows an acceptable performance in part of speech tagging and is more reliable.

Keywords: Part of speech tagging; Natural language processing; MLE tagger; Data science; Text processing (search for similar items in EconPapers)
Date: 2023
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
http://link.springer.com/10.1007/s40745-021-00359-4 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:spr:aodasc:v:10:y:2023:i:5:d:10.1007_s40745-021-00359-4

Ordering information: This journal article can be ordered from
https://www.springer ... gement/journal/40745

DOI: 10.1007/s40745-021-00359-4

Access Statistics for this article

Annals of Data Science is currently edited by Yong Shi

More articles in Annals of Data Science from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().

 
Page updated 2025-03-20
Handle: RePEc:spr:aodasc:v:10:y:2023:i:5:d:10.1007_s40745-021-00359-4