Web News Data Extraction Technology Based on Text Keywords
Kun Zhang and Abd E.I.-Baset Hassanien
Complexity, 2021, vol. 2021, 1-11
Abstract:
To shorten the time users spend querying news on the Internet, this paper designs a web news data extraction technique that obtains the main information of a news item by extracting keywords from its text. First, the TF-IDF, TextRank, and LDA keyword extraction algorithms are analyzed to understand the keyword extraction process, and the TF-IDF algorithm is optimized using Zipf's law. Then, drawing on the idea of model fusion, five schemes based on waterfall fusion and parallel combination fusion are designed and evaluated experimentally. The experiments indicate that the designed technique performs well on web news data extraction. News keyword extraction has broad application prospects and can provide a basis for research on news key phrases, news summarization, and related areas.
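As a rough illustration of the baseline TF-IDF keyword scoring named in the abstract, the sketch below ranks the terms of one document against a tiny corpus. It is not the paper's implementation: the function name `tfidf_keywords`, the whitespace tokenizer, the sample headlines, and the plain `tf * log(N / df)` weighting are all assumptions made for this example, and it includes neither the Zipf's-law optimization nor the fusion schemes, whose details the abstract does not give.

```python
import math
from collections import Counter


def tfidf_keywords(docs, doc_index, top_k=3):
    """Return the top_k (term, score) pairs for docs[doc_index].

    Hypothetical helper for illustration only; tokenization is a plain
    lowercase whitespace split with no stop-word filtering.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    tokens = tokenized[doc_index]
    tf = Counter(tokens)

    # Classic weighting: relative term frequency times log inverse
    # document frequency.
    scores = {
        term: (count / len(tokens)) * math.log(n_docs / df[term])
        for term, count in tf.items()
    }

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Made-up sample headlines, not data from the paper.
docs = [
    "stock market falls as investors sell shares",
    "football team wins the league title",
    "market rally lifts shares of tech companies",
]
keywords = tfidf_keywords(docs, 0, top_k=3)
```

Terms that also occur in other documents (here `market` and `shares`) receive a lower inverse document frequency and fall below terms unique to the document. A practical system would add stop-word removal and a language-appropriate tokenizer before scoring.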
Date: 2021
Downloads:
http://downloads.hindawi.com/journals/complexity/2021/5529447.pdf (application/pdf)
http://downloads.hindawi.com/journals/complexity/2021/5529447.xml (application/xml)
Persistent link: https://EconPapers.repec.org/RePEc:hin:complx:5529447
DOI: 10.1155/2021/5529447