Effective Natural Language Processing Algorithms for Early Alerts of Gout Flares from Chief Complaints

Oliveira, Lucas Lopes; Jiang, Xiaorui; Babu, Aryalakshmi Nellippillipathil; Karajagi, Poonam; Daneshkhah, Alireza

Lucas Lopes Oliveira, Xiaorui Jiang, Aryalakshmi Nellippillipathil Babu, Poonam Karajagi and Alireza Daneshkhah
Additional contact information
Lucas Lopes Oliveira: School of Computing, Mathematics and Data Sciences, Coventry University, Coventry CV1 5FB, UK
Xiaorui Jiang: Centre for Computational Sciences and Mathematical Modelling, Coventry University, Coventry CV1 2TT, UK
Aryalakshmi Nellippillipathil Babu: School of Computing, Mathematics and Data Sciences, Coventry University, Coventry CV1 5FB, UK
Poonam Karajagi: School of Computing, Mathematics and Data Sciences, Coventry University, Coventry CV1 5FB, UK
Alireza Daneshkhah: School of Computing, Mathematics and Data Sciences, Coventry University, Coventry CV1 5FB, UK

Forecasting, 2024, vol. 6, issue 1, 1-15

Abstract: Early identification of acute gout is crucial: it enables healthcare professionals to implement targeted interventions for rapid pain relief and to prevent disease progression, improving long-term joint function. In this study, we comprehensively explored the potential for early detection of gout flares (GFs) based on nurses’ chief complaint notes in the Emergency Department (ED). To address the challenge of identifying GFs prospectively during an ED visit, where documentation is typically minimal, our research focused on employing alternative Natural Language Processing (NLP) techniques to enhance detection accuracy. We investigated GF detection algorithms using both sparse representations from traditional NLP methods and dense encodings from medical domain-specific Large Language Models (LLMs), distinguishing between generative and discriminative models. Three methods were used to alleviate severe data imbalance: oversampling, class weights, and focal loss. Extensive empirical studies were performed on the Gout Emergency Department Chief Complaint Corpora. Sparse text representations such as tf-idf produced strong performance, achieving F1 scores higher than 0.75. The best deep learning models were RoBERTa-large-PM-M3-Voc and BioGPT, which achieved the highest F1 scores on the 2019 dataset (0.8) and the 2020 dataset (0.85), respectively. We concluded that although discriminative LLMs performed better than generative LLMs on this classification task, using generative models as feature extractors and a support vector machine as the classifier yielded promising results comparable to those obtained with discriminative models.
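
Illustrative sketch (not from the article): two of the three imbalance remedies named in the abstract, class weights and focal loss, can be written in a few lines. The abstract does not state the formulas or hyperparameters the authors used, so everything below is an assumption: the function names are hypothetical, the focal loss follows the commonly used binary form -alpha_t * (1 - p_t)^gamma * log(p_t), the defaults gamma=2.0 and alpha=0.25 are widely used values rather than the paper's settings, and the class-weight rule is the common "balanced" heuristic n_samples / (n_classes * class_count).

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency.

    Uses the common 'balanced' heuristic (an assumption, not the paper's
    stated rule): weight_c = n_samples / (n_classes * count_c).
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for a single example.

    p: predicted probability of the positive class (e.g. gout flare).
    y: true label, 0 or 1.
    gamma: focusing parameter; gamma = 0 reduces to weighted cross-entropy.
    alpha: weight on the positive class; (1 - alpha) goes to the negative class.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# Toy labels with a 3:1 imbalance (hypothetical data, for illustration only).
weights = balanced_class_weights([0, 0, 0, 1])

# A confidently correct positive contributes far less loss than a missed one.
easy = binary_focal_loss(0.9, 1)
hard = binary_focal_loss(0.1, 1)
```

The factor (1 - p_t)^gamma shrinks the loss of examples the model already classifies well, so training signal concentrates on hard cases, which in a heavily imbalanced corpus tend to come from the rare class. Class weights act more bluntly by scaling every example of a class by the same constant.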

Keywords: gout flare; chief complaint; natural language processing; deep learning; large language models
JEL-codes: A1 B4 C0 C1 C2 C3 C4 C5 C8 M0 Q2 Q3 Q4
Date: 2024

Downloads: (external link)
https://www.mdpi.com/2571-9394/6/1/13/pdf (application/pdf)
https://www.mdpi.com/2571-9394/6/1/13/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jforec:v:6:y:2024:i:1:p:13-238:d:1354410

Forecasting is currently edited by Ms. Joss Chen

Bibliographic data for series maintained by MDPI Indexing Manager.