Numerical Simulation of Ambiguity Resolution in Multiple Information Streams Based on Network Machine Translation
Lei Wang and Qun Ai
Complexity, 2020, vol. 2020, 1-10
Abstract:
Polysemy is widespread in natural language and makes it difficult for machines to process text, so word sense disambiguation is a key problem in natural language processing. This paper introduces the statistical learning methods commonly used for word sense disambiguation. Using the naive Bayes machine learning method and a feature vector set extracted and constructed with the Dice coefficient method, it implements a semantics-based word sense disambiguation model. Comparative experiments show that the proposed method outperforms known systems. The paper also proposes an unsupervised method for resolving word segmentation ambiguity in professional fields. The method does not rely on professional domain knowledge or a training corpus; it uses only the frequency, mutual information, and boundary entropy of strings in the test corpus to resolve segmentation ambiguity. The experimental results show that all three evaluation criteria can resolve word segmentation ambiguity in professional fields and improve segmentation quality. Among them, segmentation using mutual information gives the best results, and its performance is stable.
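To illustrate the three statistics named in the abstract (frequency, mutual information, and boundary entropy), here is a minimal Python sketch. It is not the authors' implementation. The function names, the toy corpus, the use of corpus length as a common normalizer, and the rule "keep together the pair with the higher mutual information" are all assumptions made for this example.

```python
import math
from collections import Counter

def ngram_counts(text, max_n=4):
    """Frequency of every character n-gram of length 1..max_n in text."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts

def mutual_information(left, right, counts, total):
    """Pointwise mutual information (in bits) of two adjacent strings.

    Assumption: every probability is estimated as count / total, where
    total is the corpus length in characters (a common simplification).
    """
    joint = counts[left + right]
    if joint == 0 or counts[left] == 0 or counts[right] == 0:
        return float("-inf")
    return math.log2(joint * total / (counts[left] * counts[right]))

def boundary_entropy(s, text, side="right"):
    """Entropy (in bits) of the characters adjacent to occurrences of s."""
    neighbours = Counter()
    start = text.find(s)
    while start != -1:
        idx = start + len(s) if side == "right" else start - 1
        if 0 <= idx < len(text):
            neighbours[text[idx]] += 1
        start = text.find(s, start + 1)
    n = sum(neighbours.values())
    return -sum(c / n * math.log2(c / n) for c in neighbours.values()) if n else 0.0

def resolve_overlap(a, b, c, counts, total):
    """Split the overlapping string a+b+c as (a+b, c) or (a, b+c).

    Assumed rule: keep together the pair with the higher mutual information.
    """
    if mutual_information(a, b, counts, total) >= mutual_information(b, c, counts, total):
        return (a + b, c)
    return (a, b + c)

# Toy test corpus (hypothetical): "xy" recurs, "yz" occurs only once.
corpus = "xyaxybxyzczdz"
counts = ngram_counts(corpus)
print(resolve_overlap("x", "y", "z", counts, len(corpus)))
print(boundary_entropy("xy", corpus))
```

All statistics here come from the test corpus itself, with no dictionary or training data, which matches the unsupervised setting the abstract describes. A higher boundary entropy for a string means more varied neighbouring characters, which is commonly read as evidence that the string ends at a word boundary.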
Date: 2020
Downloads: (external link)
http://downloads.hindawi.com/journals/8503/2020/7278085.pdf (application/pdf)
http://downloads.hindawi.com/journals/8503/2020/7278085.xml (text/xml)
Persistent link: https://EconPapers.repec.org/RePEc:hin:complx:7278085
DOI: 10.1155/2020/7278085