FocusGate-Net: A dual-attention guided MLP-convolution hybrid network for accurate and efficient medical image segmentation

Kılıç, Şafak

PLOS ONE, 2025, vol. 20, issue 9, 1-23

Abstract: Although recent advances in CNNs and Transformers have significantly improved medical image segmentation, these models often struggle to balance segmentation accuracy, inference speed, and architectural simplicity. Lightweight MLP-based methods have emerged as a promising alternative, but they frequently fail to capture fine-grained spatial context, leading to suboptimal boundary localization. To address this issue, a hybrid architecture can be introduced that integrates the computational efficiency of MLPs with the spatial feature extraction strengths of convolutional or transformer-based modules. This design aims to deliver high segmentation accuracy while preserving low latency and minimal architectural complexity, thereby enhancing applicability in real-time clinical settings. This paper introduces FocusGate-Net, a novel hybrid architecture combining shifted token MLP blocks, convolutional feature extractors, and dual-attention mechanisms for robust medical image segmentation. Our approach leverages the spatial dependency modeling capabilities of MLP architectures while enhancing feature selectivity through the Convolutional Block Attention Module (CBAM) and Attention Gate (AG) mechanisms. We evaluate FocusGate-Net on three diverse medical image datasets: ISIC2018 for skin lesion segmentation, PH2 for dermatoscopic images, and Kvasir-SEG for polyp segmentation. Comprehensive ablation studies verify the contribution of each architectural component, demonstrating the effectiveness of our hybrid design. When benchmarked against state-of-the-art models such as UNet, UNet++, and ResUNet, FocusGate-Net achieves superior performance, with a Dice coefficient of 92.47% and an IoU of 86.36% on ISIC2018. Furthermore, our model demonstrates exceptional cross-dataset generalization capability, achieving Dice scores of 97.25% on PH2 and 94.83% on Kvasir-SEG. These results highlight the potential of MLP-based hybrid architectures with attention mechanisms for improving medical image segmentation accuracy while maintaining computational efficiency suitable for clinical deployment.
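
For readers unfamiliar with the two metrics quoted in the abstract, the sketch below shows how the Dice coefficient and IoU are commonly computed for a pair of binary masks. It is a minimal illustration and not the paper's evaluation code: the function name `dice_and_iou`, the smoothing constant `eps`, and the toy masks are assumptions, and the abstract does not say how predictions were thresholded or how scores were averaged across images.

```python
import numpy as np


def dice_and_iou(pred, target, eps=1e-7):
    """Return (Dice, IoU) for two binary masks of the same shape.

    `eps` is an assumed smoothing constant that avoids division by zero
    when both masks are empty; the paper's value, if any, is not known.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    intersection = np.logical_and(pred, target).sum()
    pred_sum = pred.sum()
    target_sum = target.sum()
    union = pred_sum + target_sum - intersection

    # Dice = 2|A ∩ B| / (|A| + |B|), IoU = |A ∩ B| / |A ∪ B|
    dice = (2.0 * intersection + eps) / (pred_sum + target_sum + eps)
    iou = (intersection + eps) / (union + eps)
    return float(dice), float(iou)


# Toy 4x4 masks (illustrative only): 4 predicted pixels, 6 true pixels,
# all 4 predicted pixels lie inside the true region.
pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
target = np.array([[1, 1, 1, 0],
                   [1, 1, 1, 0],
                   [0, 0, 0, 0],
                   [0, 0, 0, 0]])

dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.4f}, IoU = {iou:.4f}")
```

For a single mask pair the two metrics are tied by IoU = Dice / (2 - Dice), so Dice is never lower than IoU. Scores averaged over a whole dataset, such as those reported in the abstract, need not satisfy that identity exactly.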

Date: 2025

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0331896 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 31896&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0331896

DOI: 10.1371/journal.pone.0331896

More articles in PLOS ONE from Public Library of Science