Visual Question Answer System for Skeletal Image Using Radiology Images in the Healthcare Domain Based on Visual and Textual Feature Extraction Techniques
Jinesh Melvin Y.I.,
Mukesh Shrimali and
Sushopti Gawade
Additional contact information
Jinesh Melvin Y.I.: Pacific Academy of Higher Education and Research University
Mukesh Shrimali: Pacific Academy of Higher Education and Research University
Sushopti Gawade: Pacific Academy of Higher Education and Research University
Annals of Data Science, 2025, vol. 12, issue 3, No 7, 969-990
Abstract:
The medical imaging query–response system is among the most challenging concepts in the medical field. It requires significant effort to organize and comprehend the various representations of the human body, and the system must be validated by users in the healthcare industry. With the aid of various images, including MRI scans, CT scans, ultrasounds, X-rays, PET-CT scans, and more, it may be possible to identify human health issues, and such a system is expected to encourage patient participation and support clinical decision-making. Technically, a VQA system in the healthcare domain is more complicated than in the general domain because many of the features it relies on are poorly matched to medical images and questions. These challenges stem from the datasets, approaches, and models used for both the visual and textual aspects, which can make it harder for clinical assistance to provide relevant answers. The proposed system analyzes current models and diagnoses their shortcomings in order to improve medical visual question answering on recent datasets. The models compared were convolutional neural networks (CNN), deep belief networks (DBN), recurrent neural networks (RNN), long short-term memory networks (LSTM), and bidirectional long short-term memory (BiLSTM). To assess the effectiveness of each model, the following measures are used: Classification Accuracy, F-Classification, F-Measure, C-False Negative Rate (FNR), C-Positive Predictive Value, C-Precision, C-Recall, C-Sensitivity, and C-True Positive Rate (CTPR). With the objective of improving performance on any dataset, with accuracy and measures for both visual and textual features, so that the right answers are produced for given questions, the proposed system helps determine how well the existing models perform and generates new models using the B12 FASTER Recurrent Neural Network (RNN) and Kai-Bi-LSTM. With questions and appropriate answers, the suggested model assists in extracting the features of imported images and text.
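The classification measures named in the abstract (accuracy, precision/positive predictive value, recall/sensitivity/true positive rate, false negative rate, and F-measure) can all be derived from a binary confusion matrix. The sketch below is illustrative only — the function name and inputs are assumptions, not the paper's actual evaluation code:

```python
def classification_measures(tp, fp, fn, tn):
    """Compute per-class measures from confusion-matrix counts.

    tp/fp/fn/tn are the true-positive, false-positive,
    false-negative, and true-negative counts for one class.
    """
    precision = tp / (tp + fp)           # positive predictive value
    recall = tp / (tp + fn)              # sensitivity / true positive rate
    fnr = fn / (fn + tp)                 # false negative rate = 1 - recall
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f_measure = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "fnr": fnr,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }

# Hypothetical counts for one answer class:
measures = classification_measures(tp=80, fp=10, fn=20, tn=90)
```

In a multi-class VQA setting these counts would be computed one-vs-rest per answer class and then averaged, which is presumably what the "C-" prefixed (per-class) measures in the abstract refer to.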
Keywords: VQA; ImageCLEF; Image feature extraction; Textual feature extraction; Preprocessing; Word extracting; Classification
Date: 2025
Downloads: http://link.springer.com/10.1007/s40745-024-00553-0 (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:spr:aodasc:v:12:y:2025:i:3:d:10.1007_s40745-024-00553-0
DOI: 10.1007/s40745-024-00553-0
Annals of Data Science is currently edited by Yong Shi