A Comparative Evaluation between Convolutional Neural Networks and Vision Transformers for COVID-19 Detection
Saad I. Nafisah,
Ghulam Muhammad,
M. Shamim Hossain and
Salman A. AlQahtani
Additional contact information
Saad I. Nafisah: Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia
Ghulam Muhammad: Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia
M. Shamim Hossain: Department of Software Engineering, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia
Salman A. AlQahtani: Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia
Mathematics, 2023, vol. 11, issue 6, 1-20
Abstract:
Early illness detection enables medical professionals to deliver the best care and increases the likelihood of a full recovery. In this work, we show that computer-aided diagnosis (CAD) systems can use the chest X-ray (CXR) medical imaging modality to identify respiratory system disorders. At present, COVID-19 is the most prominent such illness. We propose a system based on explainable artificial intelligence that detects COVID-19 from CXR images using several cutting-edge convolutional neural network (CNN) models as well as Vision Transformer (ViT) models. The proposed system also visualizes the infected areas of the CXR images, giving doctors and other medical professionals a second opinion to support their decisions. The system preprocesses the images by segmenting the region of interest with a U-Net model and applying rotation augmentation. CNNs operate on pixel arrays, whereas ViTs divide the image into visual tokens; one objective is therefore to compare their performance in COVID-19 detection. The experiments use a publicly available dataset (COVID-QU-Ex). The experimental results show that the performances of the CNN-based models and the ViT-based models are comparable. The best accuracy, 99.82%, was obtained by the EfficientNetB7 (CNN-based) model, followed by SegFormer (ViT-based). In addition, the segmentation and augmentation enhanced the performance.
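The abstract contrasts the two input representations: a CNN consumes the full pixel array, while a ViT first splits the image into fixed-size patches ("visual tokens"). The following pure-Python sketch (an illustration, not code from the paper; the function name and the 224x224 image with 16x16 patches are assumptions matching the common ViT-Base configuration) shows what that tokenization step amounts to:

```python
# Illustrative sketch (not from the paper): ViT-style patch tokenization.
# A CNN consumes the full pixel array; a ViT first splits the image into
# non-overlapping fixed-size patches, each of which is later flattened
# and linearly embedded as one input token.

def to_patches(image, patch=16):
    """Split a 2-D image (list of rows) into non-overlapping patch x patch tokens."""
    h, w = len(image), len(image[0])
    assert h % patch == 0 and w % patch == 0, "image must divide evenly into patches"
    tokens = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            tokens.append([row[left:left + patch] for row in image[top:top + patch]])
    return tokens

# Example: a 224x224 grayscale CXR image with 16x16 patches yields
# (224 / 16) ** 2 = 196 visual tokens.
img = [[0] * 224 for _ in range(224)]
tokens = to_patches(img)
print(len(tokens))  # 196
```

The sequence of 196 tokens is what the transformer's self-attention operates on, in contrast to the CNN's sliding convolutional filters over the raw pixel grid.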
Keywords: COVID-19; chest X-ray; convolutional neural network; vision transformer; artificial intelligence
JEL-codes: C
Date: 2023
Citations: 1
Downloads:
https://www.mdpi.com/2227-7390/11/6/1489/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/6/1489/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:6:p:1489-:d:1100898
Mathematics is currently edited by Ms. Emma He