ADE Eval: An Evaluation of Text Processing Systems for Adverse Event Extraction from Drug Labels for Pharmacovigilance
Samuel Bayer,
Cheryl Clark,
Oanh Dang,
John Aberdeen,
Sonja Brajovic,
Kimberley Swank,
Lynette Hirschman and
Robert Ball
Author affiliations:
Samuel Bayer: The MITRE Corporation
Cheryl Clark: The MITRE Corporation
Oanh Dang: US Food and Drug Administration
John Aberdeen: The MITRE Corporation
Sonja Brajovic: US Food and Drug Administration
Kimberley Swank: US Food and Drug Administration
Lynette Hirschman: The MITRE Corporation
Robert Ball: US Food and Drug Administration
Drug Safety, 2021, vol. 44, issue 1, No 9, 83-94
Abstract:
Introduction: The US FDA is interested in a tool that would enable pharmacovigilance safety evaluators to automate the identification of adverse drug events (ADEs) mentioned in FDA prescribing information. The MITRE Corporation (MITRE) and the FDA organized a shared task, Adverse Drug Event Evaluation (ADE Eval), to determine whether the performance of algorithms currently used for natural language processing (NLP) might be good enough for real-world use.
Objective: ADE Eval was conducted to evaluate a range of NLP techniques for identifying ADEs mentioned in publicly available FDA-approved drug labels (package inserts). It was designed specifically to reflect pharmacovigilance practices within the FDA and to model possible pharmacovigilance use cases.
Methods: Pharmacovigilance-specific annotation guidelines and annotated corpora were created. Two metrics modeled the experiences of FDA safety evaluators: one measured the ability of an algorithm to identify the correct Medical Dictionary for Regulatory Activities (MedDRA®) terms for the text in the annotated corpora; the other assessed the quality of the evidence extracted from the corpora to support the selected MedDRA® term by measuring the portion of annotated text an algorithm correctly identified. A third metric assessed the cost of correcting system output for subsequent training (an averaged, weighted F1-measure for mention finding).
Results: In total, 13 teams submitted 23 runs: the top MedDRA® coding F1-measure was 0.79, the top quality score was 0.96, and the top mention-finding F1-measure was 0.89.
Conclusion: While NLP techniques do not perform at levels that would allow them to be used without intervention, it is now worthwhile to explore making NLP outputs available in human pharmacovigilance workflows.
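The abstract reports results as F1-measures over assigned terms. As background, here is a minimal sketch of how a set-based F1-measure for term assignment might be computed; the gold and predicted term sets below are hypothetical illustrations, not data or code from ADE Eval:

```python
# Sketch of an F1-measure for term-set assignment (hypothetical example,
# not the ADE Eval scoring implementation).

def f1_score(gold: set, predicted: set) -> float:
    """Harmonic mean of precision and recall over two term sets."""
    if not gold or not predicted:
        return 0.0
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical MedDRA-style terms for one drug label section.
gold = {"Nausea", "Headache", "Dizziness", "Rash"}
predicted = {"Nausea", "Headache", "Fatigue"}

print(round(f1_score(gold, predicted), 3))  # precision 2/3, recall 1/2
```

A score of 1.0 would mean the system's terms exactly match the annotated gold standard; the shared task additionally averaged and weighted such scores across mentions and runs.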
Date: 2021
Downloads:
http://link.springer.com/10.1007/s40264-020-00996-3 (abstract, text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:drugsa:v:44:y:2021:i:1:d:10.1007_s40264-020-00996-3
DOI: 10.1007/s40264-020-00996-3