Discovering discoveries: Identifying biomedical discoveries using citation contexts
Hung Tseng and
Journal of Informetrics, 2017, vol. 11, issue 1, 46-62
A procedure for identifying discoveries in the biomedical sciences is described that makes use of citation context information, or more precisely citing sentences, drawn from the PubMed Central database. The procedure focuses on the use of specific terms in the citing sentences and the joint appearance of cited references. After a manual screening process to remove non-discoveries, a list of over 100 discoveries and their associated articles is compiled and characterized by subject matter and by type of discovery. The phenomenon of multiple discovery is shown to play an important role. The onset and timing of recognition of the articles are studied by comparing the number of citing sentences with and without discovery terms; the comparison shows both early onset and delays in recognition. A comparative analysis of the vocabularies of the discovery and non-discovery sentences reveals the types of words and concepts that scientists associate with discoveries. A machine learning application is used to efficiently extend the list. Implications of the findings for understanding the nature and justification of scientific discoveries are discussed.
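The comparison of citing sentences with and without discovery terms can be illustrated with a minimal sketch. It is not the authors' implementation: the term pattern, the function names, and the sample sentences are all hypothetical, and the abstract does not list the actual discovery terms used.

```python
import re
from collections import Counter

# Hypothetical stand-in for the paper's discovery vocabulary:
# any word beginning with "discover" (discovery, discovered, ...).
DISCOVERY_PATTERN = re.compile(r"\bdiscover\w*", re.IGNORECASE)


def has_discovery_term(sentence):
    """Return True if the citing sentence contains a discovery term."""
    return DISCOVERY_PATTERN.search(sentence) is not None


def count_by_year(citances):
    """Tally citing sentences per year, split by presence of a discovery term.

    citances: iterable of (year, sentence) pairs that cite one article.
    Returns a dict mapping year -> Counter with keys 'discovery' and 'other'.
    """
    counts = {}
    for year, sentence in citances:
        label = "discovery" if has_discovery_term(sentence) else "other"
        counts.setdefault(year, Counter())[label] += 1
    return counts


# Invented sample data, for illustration only.
example = [
    (2005, "The discovery of X [12] changed the field."),
    (2005, "We used the method of [12]."),
    (2008, "X was first discovered in yeast [12]."),
]
result = count_by_year(example)
```

Plotting the two per-year counts side by side for a given article would give a rough view of when discovery language begins to appear in its citing sentences.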
Keywords: Discovery; Biomedicine; Citation contexts; Citances; Machine learning; Pubmed central
Persistent link: https://EconPapers.repec.org/RePEc:eee:infome:v:11:y:2017:i:1:p:46-62
Journal of Informetrics is currently edited by Leo Egghe