Citation mining: Integrating text mining and bibliometrics for research user profiling
Ronald N. Kostoff,
J. Antonio del Río,
James A. Humenik,
Esther Ofilia García and
Ana María Ramírez
Journal of the American Society for Information Science and Technology, 2001, vol. 52, issue 13, 1148-1156
Abstract:
Identifying the users and impact of research is important for research performers, managers, evaluators, and sponsors. It is important to know whether the audience reached is the audience desired. It is useful to understand the technical characteristics of the other research/development/applications impacted by the originating research, and to understand other characteristics (names, organizations, countries) of the users impacted by the research. Because of the many indirect pathways through which fundamental research can impact applications, identifying the user audience and the research impacts can be very complex and time consuming. The purpose of this article is to describe a novel approach for identifying the pathways through which research can impact other research, technology development, and applications, and to identify the technical and infrastructure characteristics of the user population. A novel literature‐based approach was developed to identify the user community and its characteristics. The research performed is characterized by one or more articles accessed by the Science Citation Index (SCI) database, beccause the SCI's citation‐based structure enables the capability to perform citation studies easily. The user community is characterized by the articles in the SCI that cite the original research articles, and that cite the succeeding generations of these articles as well. Text mining is performed on the citing articles to identify the technical areas impacted by the research, the relationships among these technical areas, and relationships among the technical areas and the infrastructure (authors, journals, organizations). A key component of text mining, concept clustering, was used to provide both a taxonomy of the citing articles' technical themes and further technical insights based on theme relationships arising from the grouping process. Bibliometrics is performed on the citing articles to profile the user characteristics. Citation Mining, this integration of citation bibliometrics and text mining, is applied to the 307 first generation citing articles of a fundamental physics article on the dynamics of vibrating sand‐piles. Most of the 307 citing articles were basic research whose main themes were aligned with those of the cited article. However, about 20% of the citing articles were research or development in other disciplines, or development within the same discipline. The text mining alone identified the intradiscipline applications and extradiscipline impacts and applications; this was confirmed by detailed reading of the 307 abstracts. The combination of citation bibliometrics and text mining provides a synergy unavailable with each approach taken independently. Furthermore, text mining is a REQUIREMENT for a feasible comprehensive research impact determination. The integrated multigeneration citation analysis required for broad research impact determination of highly cited articles will produce thousands or tens or hundreds of thousands of citing article Abstracts. Text mining allows the impacts of research on advanced development categories and/or extradiscipline categories to be obtained without having to read all these citing article Abstracts. The multifield bibliometrics provide multiple documented perspectives on the users of the research, and indicate whether the documented audience reached is the desired target audience.
Date: 2001
References: Add references at CitEc
Citations: View citations in EconPapers (15)
Downloads: (external link)
https://doi.org/10.1002/asi.1181
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:bla:jamist:v:52:y:2001:i:13:p:1148-1156
Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1532-2890
Access Statistics for this article
More articles in Journal of the American Society for Information Science and Technology from Association for Information Science & Technology
Bibliographic data for series maintained by Wiley Content Delivery ().