OCTNet: A Modified Multi-Scale Attention Feature Fusion Network with InceptionV3 for Retinal OCT Image Classification

Khalil, Irshad; Mehmood, Asif; Kim, Hyunchul; Kim, Jungsuk

OCTNet: A Modified Multi-Scale Attention Feature Fusion Network with InceptionV3 for Retinal OCT Image Classification

Irshad Khalil, Asif Mehmood, Hyunchul Kim () and Jungsuk Kim ()
Additional contact information
Irshad Khalil: Department of Biomedical Engineering, College of IT Convergence, Gachon University, 1342 Seongnamdaero, Sujeong-gu, Seongnam-si 13120, Republic of Korea
Asif Mehmood: Department of Biomedical Engineering, College of IT Convergence, Gachon University, 1342 Seongnamdaero, Sujeong-gu, Seongnam-si 13120, Republic of Korea
Hyunchul Kim: School of Information, University of California, 102 South Hall 4600, Berkeley, CA 94720, USA
Jungsuk Kim: Department of Biomedical Engineering, College of IT Convergence, Gachon University, 1342 Seongnamdaero, Sujeong-gu, Seongnam-si 13120, Republic of Korea

Mathematics, 2024, vol. 12, issue 19, 1-19

Abstract: Classification and identification of eye diseases using Optical Coherence Tomography (OCT) has been a challenging task and a trending research area in recent years. Accurate classification and detection of different diseases are crucial for effective care management and improving vision outcomes. Current detection methods fall into two main categories: traditional methods and deep learning-based approaches. Traditional approaches rely on machine learning for feature extraction, while deep learning methods utilize data-driven classification model training. In recent years, Deep Learning (DL) and Machine Learning (ML) algorithms have become essential tools, particularly in medical image classification, and are widely used to classify and identify various diseases. However, due to the high spatial similarities in OCT images, accurate classification remains a challenging task. In this paper, we introduce a novel model called “OCTNet” that integrates a deep learning model combining InceptionV3 with a modified multi-scale attention-based spatial attention block to enhance model performance. OCTNet employs an InceptionV3 backbone with a fusion of dual attention modules to construct the proposed architecture. The InceptionV3 model generates rich features from images, capturing both local and global aspects, which are then enhanced by utilizing the modified multi-scale spatial attention block, resulting in a significantly improved feature map. To evaluate the model’s performance, we utilized two state-of-the-art (SOTA) datasets that include images of normal cases, Choroidal Neovascularization (CNV), Drusen, and Diabetic Macular Edema (DME). Through experimentation and simulation, the proposed OCTNet improves the classification accuracy of the InceptionV3 model by 1.3%, yielding higher accuracy than other SOTA models. We also performed an ablation study to demonstrate the effectiveness of the proposed method. The model achieved an overall average accuracy of 99.50% and 99.65% with two different OCT datasets.

Keywords: OCTNet; retinal OCT; Choroidal Neovascularization; diabetic macular edema; Drusen; retinal disease classification; feature fusion; multi-scale attention mechanism (search for similar items in EconPapers)
JEL-codes: C (search for similar items in EconPapers)
Date: 2024
References: View complete reference list from CitEc
Citations:

Downloads: (external link)
https://www.mdpi.com/2227-7390/12/19/3003/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/19/3003/ (text/html)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:19:p:3003-:d:1486628

Access Statistics for this article

Mathematics is currently edited by Ms. Emma He

More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().