Mask-Space Optimized Transformer for Semantic Segmentation of Lithium Battery Surface Defect Images
Daozong Sun, Jiasi Chen, Peiwen Wu, Yucheng Pan, Hongsheng Zhong, Zihao Deng and Xiuyun Xue
Additional contact information
Daozong Sun: College of Electronic Engineering (College of AI), South China Agricultural University, Guangzhou 510642, China
Jiasi Chen: College of Electronic Engineering (College of AI), South China Agricultural University, Guangzhou 510642, China
Peiwen Wu: College of Electronic Engineering (College of AI), South China Agricultural University, Guangzhou 510642, China
Yucheng Pan: College of Electronic Engineering (College of AI), South China Agricultural University, Guangzhou 510642, China
Hongsheng Zhong: College of Electronic Engineering (College of AI), South China Agricultural University, Guangzhou 510642, China
Zihao Deng: College of Electronic Engineering (College of AI), South China Agricultural University, Guangzhou 510642, China
Xiuyun Xue: College of Electronic Engineering (College of AI), South China Agricultural University, Guangzhou 510642, China
Mathematics, 2024, vol. 12, issue 22, 1-23
Abstract:
The segmentation of surface defects in lithium batteries is crucial for enhancing the overall quality of the production process. However, the severe foreground–background imbalance in surface images of lithium batteries, along with the irregular shapes and random distribution of foreground regions, poses significant challenges for defect segmentation. Based on these observations, this paper focuses on the separation of foreground and background in surface defect images of lithium batteries and proposes a novel Mask Space Optimization Transformer (MSOFormer) for semantic segmentation of these images. Specifically, the Mask Boundary Loss (MBL) module in our model provides more efficient supervision during training to enhance the accuracy of the mask computation within the mask attention mechanism, thereby improving the model’s performance in separating foreground and background. Additionally, the Dynamic Spatial Query (DSQ) module allocates spatial information of the image to each query, enhancing the model’s sensitivity to the positions of small foreground targets in various scenes. The Efficient Pixel Decoder (EPD) ensures deformable receptive fields for irregularly shaped foregrounds while further improving the model’s performance and efficiency. Experimental results demonstrate that our method outperforms other state-of-the-art methods in terms of mean Intersection over Union (mIoU). Specifically, our approach achieves an mIoU of 84.18% on the lithium battery surface defect test set and 85.53% and 87.05% mIoUs on two publicly available defect test sets with similar defect characteristics to lithium batteries.
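For readers unfamiliar with the metric reported in the abstract, the following is a minimal sketch of how mean Intersection over Union (mIoU) is commonly computed from a predicted label map and a ground-truth label map. It is not taken from the paper: the function name `mean_iou`, the two-class toy arrays, and the choice to skip classes absent from both maps are assumptions made for illustration, and the authors' evaluation code may differ in such details.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes present in pred or target (illustrative helper)."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    cm = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    # Assumption: classes absent from both maps are left out of the mean.
    valid = union > 0
    return float((intersection[valid] / union[valid]).mean())


# Toy 2x4 label maps: class 0 = background, class 1 = defect (foreground).
target = np.array([[0, 0, 0, 0],
                   [0, 0, 1, 1]])
pred = np.array([[0, 0, 0, 0],
                 [0, 1, 1, 0]])

pixel_accuracy = float((pred == target).mean())
miou = mean_iou(pred, target, num_classes=2)
print(f"pixel accuracy = {pixel_accuracy:.4f}, mIoU = {miou:.4f}")
```

In this toy case 6 of 8 pixels are labeled correctly, yet the defect class has an IoU of only 1/3, so the mIoU (11/21) is well below the pixel accuracy. This shows why mIoU is a more informative measure than pixel accuracy when the foreground is small relative to the background.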
Keywords: deep learning; surface defect detection; mask classification; attention mechanism; transformer
JEL-codes: C
Date: 2024
Downloads:
https://www.mdpi.com/2227-7390/12/22/3627/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/22/3627/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:22:p:3627-:d:1525418
Mathematics is currently edited by Ms. Emma He