Visual Question Answer System for Skeletal Image Using Radiology Images in the Healthcare Domain Based on Visual and Textual Feature Extraction Techniques
Jinesh Melvin Y.I.,
Mukesh Shrimali and
Sushopti Gawade
Additional contact information
Jinesh Melvin Y.I.: Pacific Academy of Higher Education and Research University
Mukesh Shrimali: Pacific Academy of Higher Education and Research University
Sushopti Gawade: Pacific Academy of Higher Education and Research University
Annals of Data Science, 2025, vol. 12, issue 3, No 7, 969-990
Abstract:
The medical imaging query–response system is among the most challenging concepts in the medical field. It requires significant effort to organize and comprehend the various representations of the human body, and the system must be validated by users in the healthcare industry. With the aid of various images, including MRI scans, CT scans, ultrasounds, X-rays, PET-CT scans, and more, it may be possible to identify human health issues, and such a system is expected to encourage patient participation and support clinical decision-making. Technically, a VQA system in the healthcare domain is more complicated than in the general domain because many of the features it relies on are poorly matched to medical images and questions. These challenges stem from the datasets, approaches, and models used for both the visual and textual aspects, which can make it harder for clinical assistance to provide relevant answers. The proposed system analyzes current models and diagnoses their shortcomings in order to improve medical visual question answering on recent datasets. The models compared were convolutional neural networks (CNN), deep belief networks (DBN), recurrent neural networks (RNN), long short-term memory networks (LSTM), and bidirectional long short-term memory (BiLSTM). To assess the effectiveness of each model, the following measures are used: Classification Accuracy, F-Classification, F-Measure, C-False Negative Rate (FNR), C-Positive Predictive Value, C-Precision, C-Recall, C-Sensitivity, and C-True Positive Rate (CTPR). With the objective of improving performance on any dataset, with accuracy and measures for both visual and textual features, so that the right answers are produced for given questions, the proposed system helps determine how well the existing models perform and generates new models using the B12 FASTER Recurrent Neural Network (RNN) and Kai-Bi-LSTM. With questions and appropriate answers, the suggested model assists in extracting the features of imported images and text.
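The classification measures named in the abstract (accuracy, precision/positive predictive value, recall/sensitivity/true positive rate, false negative rate, and F-measure) can all be derived from a binary confusion matrix. The sketch below is illustrative only — the function name and inputs are assumptions, not the paper's actual evaluation code:

```python
def classification_measures(tp, fp, fn, tn):
    """Compute per-class measures from confusion-matrix counts.

    tp/fp/fn/tn are the true-positive, false-positive,
    false-negative, and true-negative counts for one class.
    """
    precision = tp / (tp + fp)           # positive predictive value
    recall = tp / (tp + fn)              # sensitivity / true positive rate
    fnr = fn / (fn + tp)                 # false negative rate = 1 - recall
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f_measure = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "fnr": fnr,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }

# Hypothetical counts for one answer class:
measures = classification_measures(tp=80, fp=10, fn=20, tn=90)
```

In a multi-class VQA setting these counts would be computed one-vs-rest per answer class and then averaged, which is presumably what the "C-" prefixed (per-class) measures in the abstract refer to.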
Keywords: VQA; ImageCLEF; Image feature extraction; Textual feature extraction; Preprocessing; Word extracting; Classification
Date: 2025
Downloads: http://link.springer.com/10.1007/s40745-024-00553-0 (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:spr:aodasc:v:12:y:2025:i:3:d:10.1007_s40745-024-00553-0
DOI: 10.1007/s40745-024-00553-0
Annals of Data Science is currently edited by Yong Shi