DMET: Dynamic Mask-Enhanced Transformer for Generalizable Deep Image Denoising

Zhu, Tong; Li, Anqi; Wang, Yuan-Gen; Su, Wenkang; Jiang, Donghua

Tong Zhu, Anqi Li, Yuan-Gen Wang, Wenkang Su and Donghua Jiang
Additional contact information
Tong Zhu: School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou 510006, China
Anqi Li: School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou 510006, China
Yuan-Gen Wang: School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou 510006, China
Wenkang Su: School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou 510006, China
Donghua Jiang: School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 511400, China

Mathematics, 2025, vol. 13, issue 13, 1-16

Abstract: Imaging devices inevitably introduce various types of noise during image acquisition and transmission, so image denoising remains a crucial challenge in computer vision. Deep learning, especially recent Transformer-based architectures, has demonstrated remarkable performance on image denoising tasks. However, because it is data-driven, deep learning can easily overfit the training data, which limits its generalization ability. To address this issue, we present a novel Dynamic Mask-Enhanced Transformer (DMET) that improves the generalization capacity of denoising networks. Specifically, a texture-guided adaptive masking mechanism is introduced to simulate the noise that may occur in practical applications. We then apply a masked hierarchical attention block, which combines shifted-window multi-head self-attention with channel attention, to mitigate information loss and leverage global statistics. Additionally, an attention mask is applied during training to reduce discrepancies between training and testing. Extensive experiments demonstrate that our approach achieves better generalization performance than state-of-the-art deep learning models and can be directly applied to real-world scenarios.
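
The abstract does not say how texture steers the mask, so the following is only a minimal sketch of the general idea of texture-guided masking, not the authors' method. The functions `texture_map` and `texture_guided_mask`, the parameters `base_ratio` and `strength`, and the rule that textured pixels are dropped less often than flat ones are all assumptions made for illustration.

```python
import random


def texture_map(img):
    """Score each pixel by its mean absolute difference to its 4-neighbours."""
    h, w = len(img), len(img[0])
    tex = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            diffs = [
                abs(img[y][x] - img[ny][nx])
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < h and 0 <= nx < w
            ]
            tex[y][x] = sum(diffs) / len(diffs) if diffs else 0.0
    return tex


def texture_guided_mask(img, base_ratio=0.8, strength=0.5, seed=0):
    """Randomly zero out pixels; return (masked_img, mask), where 1 = dropped.

    Assumption (not from the paper): the drop probability falls linearly
    from base_ratio in flat regions to base_ratio * (1 - strength) at the
    most textured pixel, so fine detail is masked less often.
    """
    rng = random.Random(seed)
    tex = texture_map(img)
    t_max = max(max(row) for row in tex) or 1.0  # avoid dividing by zero
    masked, mask = [], []
    for img_row, tex_row in zip(img, tex):
        out_row, mask_row = [], []
        for value, t in zip(img_row, tex_row):
            p_drop = base_ratio * (1.0 - strength * t / t_max)
            dropped = rng.random() < p_drop
            mask_row.append(1 if dropped else 0)
            out_row.append(0.0 if dropped else value)
        masked.append(out_row)
        mask.append(mask_row)
    return masked, mask


if __name__ == "__main__":
    # Toy grayscale image in [0, 1]: flat left half, high-contrast right half.
    image = [
        [0.5, 0.5, 0.1, 0.9],
        [0.5, 0.5, 0.9, 0.1],
    ]
    masked_image, drop_mask = texture_guided_mask(image)
    for row in drop_mask:
        print(row)
```

In a masked-training setup, a denoising network would receive `masked_image` as input and be trained to reconstruct the clean image, which is the sense in which masking stands in for unseen noise. A real implementation would operate on tensors and image patches or tokens rather than nested lists.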

Keywords: image denoising; masked training; channel attention
JEL-codes: C
Date: 2025

Downloads: (external link)
https://www.mdpi.com/2227-7390/13/13/2167/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/13/2167/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:13:p:2167-:d:1693444

Mathematics is currently edited by Ms. Emma He