Detection of research trends from bibliographical data
Hidenao Abe and Shusaku Tsumoto
International Journal of Data Mining, Modelling and Management, 2012, vol. 4, issue 3, 255-266
Abstract:
In this paper, we propose a method for detecting temporal linear trends of technical terms based on importance indices. In recent years, electronic documents have been published hourly, daily, monthly, annually, or irregularly, depending on their purpose. Although the purpose of each set of documents does not change, the roles of terms and the relationships among them change over time. In text mining, importance indices of terms, such as simple frequency, document frequency, and TF-IDF, play a key role in finding valuable patterns in documents in a cross-sectional manner. To detect such temporal changes, we combine an automatic term extraction method, importance indices of the extracted terms, and trend identification based on linear regression analysis. With this approach implemented, our method detected emergent and subsiding linear trends of the extracted terms in a corpus from a research domain. By comparing this method with an existing burst detection method, we discuss the linear trends of terms, including several burst words.
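The pipeline described in the abstract (an importance index computed per period, followed by a linear regression over time) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy corpus, the use of document frequency as the index, and the simple slope threshold are all assumptions made for illustration. Term extraction is assumed to have happened already, so each document is represented as a set of terms.

```python
def document_frequency(term, docs):
    """Number of documents in one period that contain the term."""
    return sum(1 for doc in docs if term in doc)


def linear_trend(values):
    """Least-squares slope and intercept of values against periods 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two periods to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def classify(values, threshold=0.0):
    """Label a series by the sign of its slope (threshold is an assumption)."""
    slope, _ = linear_trend(values)
    if slope > threshold:
        return "emergent"
    if slope < -threshold:
        return "subsiding"
    return "flat"


# Hypothetical toy corpus: three periods, each a list of documents,
# each document a set of already-extracted terms.
periods = [
    [{"text mining", "burst detection"}, {"burst detection"}, {"burst detection"}],
    [{"text mining"}, {"burst detection"}, {"text mining"}],
    [{"text mining"}, {"text mining"}, {"text mining", "burst detection"}],
]

for term in ("text mining", "burst detection"):
    series = [document_frequency(term, docs) for docs in periods]
    print(term, series, classify(series))
```

Any other importance index, such as simple frequency or TF-IDF, could be substituted for `document_frequency` without changing the regression step. A fuller treatment would presumably also test the fitted slope for statistical significance rather than rely on its sign alone.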
Keywords: text mining; trend detection; term frequency; inverse document frequency; TF-IDF; Jaccard; matching coefficient; linear regression; technical terms; term extraction; burst detection.
Date: 2012
Downloads: (external link)
http://www.inderscience.com/link.php?id=48107 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:ids:ijdmmm:v:4:y:2012:i:3:p:255-266
More articles in International Journal of Data Mining, Modelling and Management from Inderscience Enterprises Ltd
Bibliographic data for series maintained by Sarah Parker.