Manual versus machine: An evaluation of the performance of the Medical Text Indexer (MTI) at classifying different document types by disease area

Duncan A.Q. Moore, Ohid Yaqub and Bhaven N. Sampat

No b75fr, SocArXiv from Center for Open Science

Abstract: The Medical Subject Headings (MeSH) thesaurus, a controlled vocabulary, is increasingly used by those who study research and innovation. While classification was once entirely manual, human indexers are now assisted by algorithmic suggestions in an effort to automate some of the indexing process. A version of this algorithm, the Medical Text Indexer (MTI), has been made available, allowing for classification of arbitrary text into MeSH categories. Potentially, this opens up other document classes to MeSH assignment for research and innovation studies. However, it remains unclear whether the MTI, a tool designed to categorise publications for indexing purposes, can be reliably extended to other document classes. To assess the MTI’s performance for different classes of documents, we collected text from grant descriptions, patent claims, and drug indications, and compared the MTI’s categorisation to that of a qualified human classifier. We also tested whether MTI performance varied with text length or score thresholding. Our results suggest that researchers can proceed with confidence that the MTI reliably captures the diseases contained in a text (recall), and that its scoring can be used to guard against false diseases in its outputs (precision).
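
The evaluation described in the abstract (comparing scored machine suggestions against a human classifier, with and without a score threshold) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the disease labels, the scores, and the threshold values below are all hypothetical, chosen only to show how raising a threshold can trade recall for precision.

```python
from typing import Dict, Set, Tuple


def apply_threshold(scored: Dict[str, float], threshold: float) -> Set[str]:
    """Keep only the suggested labels whose score is at or above the threshold."""
    return {label for label, score in scored.items() if score >= threshold}


def precision_recall(predicted: Set[str], reference: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) of predicted labels against a reference label set."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical machine suggestions for one text, with made-up scores.
scored_suggestions = {
    "Diabetes Mellitus": 900.0,
    "Hypertension": 400.0,
    "Neoplasms": 50.0,
}
# Hypothetical labels assigned by a human classifier to the same text.
human_labels = {"Diabetes Mellitus", "Hypertension"}

for threshold in (0.0, 100.0, 500.0):
    kept = apply_threshold(scored_suggestions, threshold)
    precision, recall = precision_recall(kept, human_labels)
    print(f"threshold={threshold:6.1f}  precision={precision:.2f}  recall={recall:.2f}")
```

In this toy example, a low threshold keeps every suggestion (full recall, one false disease), while a high threshold removes the false disease but also drops a correct one.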

Date: 2023-02-25
New Economics Papers: this item is included in nep-cmp and nep-hea

Downloads: (external link)
https://osf.io/download/63f8bbcbbbc5e5027ff804a9/

Persistent link: https://EconPapers.repec.org/RePEc:osf:socarx:b75fr

DOI: 10.31219/osf.io/b75fr

Page updated 2025-03-19
Handle: RePEc:osf:socarx:b75fr