Natural Language Processing to Extract Information from Portuguese-Language Medical Records

Naila Camila da Rocha, Abner Macola Pacheco Barbosa, Yaron Oliveira Schnr, Juliana Machado-Rugolo, Luis Gustavo Modelli de Andrade, José Eduardo Corrente and Liciana Vaz de Arruda Silveira
Additional contact information
Naila Camila da Rocha: Department of Biostatistics, Institute of Biosciences, Universidade Estadual Paulista (UNESP), Botucatu 18618-970, Brazil
Abner Macola Pacheco Barbosa: Medical School, Universidade Estadual Paulista (UNESP), Botucatu 18618-970, Brazil
Yaron Oliveira Schnr: Medical School, Universidade Estadual Paulista (UNESP), Botucatu 18618-970, Brazil
Juliana Machado-Rugolo: Health Technology Assessment Center (Clinical Hospital of the Botucatu Medical School), Botucatu 18618-970, Brazil
Luis Gustavo Modelli de Andrade: Medical School, Universidade Estadual Paulista (UNESP), Botucatu 18618-970, Brazil
José Eduardo Corrente: Research Support Office, Fundação para o Desenvolvimento Médico e Hospitalar (FAMESP), Botucatu 18618-687, Brazil
Liciana Vaz de Arruda Silveira: Department of Biostatistics, Institute of Biosciences, Universidade Estadual Paulista (UNESP), Botucatu 18618-970, Brazil

Data, 2022, vol. 8, issue 1, 1-15

Abstract: Studies that use medical records are often impeded because the information is presented in narrative fields. However, recent studies have used artificial intelligence to extract and process secondary health data from electronic medical records. The aim of this study was to develop a neural network that uses data from unstructured medical records to capture information regarding symptoms, diagnoses, medications, conditions, exams, and treatment. Data from 30,000 medical records of patients hospitalized in the Clinical Hospital of the Botucatu Medical School (HCFMB), São Paulo, Brazil, were obtained, creating a corpus of 1200 clinical texts. A natural language processing algorithm was used for text extraction and convolutional neural networks for pattern recognition, and the model was evaluated with goodness-of-fit indices. The results showed good accuracy, considering the complexity of the model, with an F-score of 63.9% and a precision of 72.7%. The patient condition class reached a precision of 90.3% and the medication class reached 87.5%. The proposed neural network will facilitate the detection of relationships between diseases and symptoms, as well as prevalence and incidence, in addition to the identification of clinical conditions, disease evolution, and the effects of prescribed medications.
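
As a rough sanity check on the overall scores quoted in the abstract, the recall they imply can be back-calculated. This is a minimal sketch that assumes the reported F-score is the balanced F1 (the harmonic mean of precision and recall), which the abstract does not state; the function names are illustrative and do not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def implied_recall(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, giving R = F1 * P / (2P - F1).

    Assumption: the F-score is the balanced F1, not a weighted F-beta.
    """
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("no valid recall exists for this precision/F1 pair")
    return f1 * precision / denominator


# Overall scores quoted in the abstract, as fractions.
reported_precision = 0.727
reported_f1 = 0.639

recall = implied_recall(reported_precision, reported_f1)
print(f"implied recall: {recall:.3f}")
```

Under that assumption the implied recall is about 57%, so the model's precision would be noticeably higher than its recall. If the paper's F-score is averaged per class or uses a different weighting, this figure does not apply.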

Keywords: medical records; named entity recognition; neural networks
JEL-codes: C8 C80 C81 C82 C83
Date: 2022

Downloads: (external link)
https://www.mdpi.com/2306-5729/8/1/11/pdf (application/pdf)
https://www.mdpi.com/2306-5729/8/1/11/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jdataj:v:8:y:2022:i:1:p:11-:d:1018660

Data is currently edited by Ms. Cecilia Yang

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jdataj:v:8:y:2022:i:1:p:11-:d:1018660