Augmenting Product Defect Surveillance Through Web Crawling and Machine Learning in Singapore

Pei San Ang, Desmond Chun Hwee Teo, Sreemanee Raaj Dorajoo, Mukundaram Prem Kumar, Yi Hao Chan, Chih Tzer Choong, Doris Sock Tin Phuah, Dorothy Hooi Myn Tan, Filina Meixuan Tan, Huilin Huang, Maggie Siok Hwee Tan, Michelle Sau Yuen Ng and Jalene Wang Woon Poh
Additional contact information
Pei San Ang: Health Sciences Authority
Desmond Chun Hwee Teo: Health Sciences Authority
Sreemanee Raaj Dorajoo: Health Sciences Authority
Mukundaram Prem Kumar: Health Sciences Authority
Yi Hao Chan: Health Sciences Authority
Chih Tzer Choong: Health Sciences Authority
Doris Sock Tin Phuah: Health Sciences Authority
Dorothy Hooi Myn Tan: Health Sciences Authority
Filina Meixuan Tan: Health Sciences Authority
Huilin Huang: Health Sciences Authority
Maggie Siok Hwee Tan: Health Sciences Authority
Michelle Sau Yuen Ng: Health Sciences Authority
Jalene Wang Woon Poh: Health Sciences Authority

Drug Safety, 2021, vol. 44, issue 9, No 3, 939-948

Abstract:

Introduction: Substandard medicines are medicines that fail to meet their quality standards and/or specifications. Substandard medicines can lead to serious safety issues affecting public health. With the increasing number of pharmaceuticals and the complexity of the pharmaceutical manufacturing supply chain, monitoring for substandard medicines via manual environmental scanning can be laborious and time consuming.

Methods: A web crawler was developed to automatically detect and extract alerts on substandard medicines published on the Internet by regulatory agencies. The crawled data were labelled as related to substandard medicines or not. An expert-derived keyword-based classification algorithm was compared against machine learning algorithms to identify substandard medicine alerts on two validation datasets (n = 4920 and n = 2458) from a later time period than training data. Models were comparatively assessed for recall, precision and their F1 scores (harmonic mean of precision and recall).

Results: The web crawler routinely extracted alerts from the 46 web pages belonging to nine regulatory agencies. From October 2019 to May 2020, 12,156 unique alerts were crawled of which 7378 (60.7%) alerts were set aside for validation and contained 1160 substandard medicine alerts (15.7%). An ensemble approach of combining machine learning and keywords achieved the best recall (94% and 97%), precision (85% and 80%) and F1 scores (89% and 88%) on temporal validation.

Conclusions: Combining robust web crawler programmes with rigorously tested filtering algorithms based on machine learning and keyword models can automate and expand horizon scanning capabilities for issues relating to substandard medicines.
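The abstract defines the F1 score as the harmonic mean of precision and recall. The short Python sketch below recomputes F1 from the rounded precision and recall figures quoted for the ensemble approach, and checks the validation-set arithmetic. It is an illustration only: the function and variable names are assumptions made here and do not come from the paper, and because the inputs are already rounded, the results can only be expected to agree with the published values to about one percentage point.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (recall, precision) pairs quoted in the abstract for the two validation sets.
reported = [(0.94, 0.85), (0.97, 0.80)]
f1_values = [round(f1_score(precision, recall), 2) for recall, precision in reported]

# Validation-set sizes and counts quoted in the abstract.
n_validation = 4920 + 2458
validation_share = round(100 * n_validation / 12156, 1)   # % of all crawled alerts
substandard_share = round(100 * 1160 / n_validation, 1)   # % of validation alerts

print(f1_values, n_validation, validation_share, substandard_share)
```

Under these assumptions the recomputed values match the reported F1 scores (89% and 88%), the validation total (7378), and the two percentages (60.7% and 15.7%).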

Date: 2021

Downloads: (external link)
http://link.springer.com/10.1007/s40264-021-01084-w Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:drugsa:v:44:y:2021:i:9:d:10.1007_s40264-021-01084-w

Ordering information: This journal article can be ordered from
http://www.springer.com/adis/journal/40264

DOI: 10.1007/s40264-021-01084-w

Drug Safety is currently edited by Nitin Joshi

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:drugsa:v:44:y:2021:i:9:d:10.1007_s40264-021-01084-w