RT-DETR-MCDAF: Multimodal Fusion of Visible Light and Near-Infrared Images for Citrus Surface Defect Detection in the Compound Domain
Jingxi Luo, Zhanwei Yang, Ying Cao, Tao Wen and Dapeng Li
Additional contact information
Jingxi Luo: Bangor College China, Central South University of Forestry and Technology, Changsha 410004, China
Zhanwei Yang: College of Mechanical and Intelligent Manufacturing, Central South University of Forestry and Technology, Changsha 410004, China
Ying Cao: College of Mechanical and Intelligent Manufacturing, Central South University of Forestry and Technology, Changsha 410004, China
Tao Wen: College of Mechanical and Intelligent Manufacturing, Central South University of Forestry and Technology, Changsha 410004, China
Dapeng Li: College of Mechanical and Intelligent Manufacturing, Central South University of Forestry and Technology, Changsha 410004, China
Agriculture, 2025, vol. 15, issue 6, 1-21
Abstract:
The accurate detection of citrus surface defects is essential for automated citrus sorting and thus for the commercialization of the citrus industry. However, previous studies have focused only on single-modal defect detection using either visible-light (RGB) or near-infrared (NIR) images, without considering feature fusion between the two modalities. This study proposes an RGB-NIR multimodal fusion method that extracts and integrates key features from both modalities to improve defect detection performance. First, an RGB-NIR multimodal dataset containing four types of citrus surface defects (cankers, pests, melanoses, and cracks) was constructed. Second, a Multimodal Compound Domain Attention Fusion (MCDAF) module was developed for multimodal channel fusion. Finally, MCDAF was integrated into the feature extraction network of the Real-Time DEtection TRansformer (RT-DETR). The experimental results showed that RT-DETR-MCDAF achieved Precision, Recall, mAP@0.5, and mAP@0.5:0.95 values of 0.914, 0.919, 0.90, and 0.937, respectively, with an average detection performance of 0.598. Compared with RT-DETR-RGB&NIR, which uses simple channel concatenation fusion, RT-DETR-MCDAF improved these values by 1.3%, 1.7%, 1%, 1.5%, and 1.7%, respectively. Overall, the proposed model outperformed traditional channel fusion methods and state-of-the-art single-modal models, offering new insights for commercial citrus sorting.
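To make the "simple channel concatenation fusion" baseline concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation and it is not the MCDAF module: the function names `concat_channels` and `channel_gate`, the single-channel NIR layout, and the parameter-free sigmoid gate are all assumptions chosen for illustration. The first function stacks a registered RGB image and an NIR image into one four-channel input; the second shows, in the simplest possible form, what it means to reweight channels instead of treating them equally.

```python
import math


def concat_channels(rgb, nir):
    """Stack a 3-channel RGB image and a 1-channel NIR image into 4 channels.

    Assumption: both inputs are H x W nested lists, each RGB pixel is
    [r, g, b], each NIR pixel is a single float, and the two images are
    already spatially registered.
    """
    if len(rgb) != len(nir) or any(len(r) != len(n) for r, n in zip(rgb, nir)):
        raise ValueError("RGB and NIR images must share the same height and width")
    return [[list(px) + [ir] for px, ir in zip(rgb_row, nir_row)]
            for rgb_row, nir_row in zip(rgb, nir)]


def channel_gate(fused):
    """Scale each channel by the sigmoid of its global mean.

    Hypothetical toy gate with no learned parameters; it only illustrates
    channel reweighting and does not reproduce MCDAF.
    """
    height, width, channels = len(fused), len(fused[0]), len(fused[0][0])
    means = [
        sum(fused[y][x][c] for y in range(height) for x in range(width)) / (height * width)
        for c in range(channels)
    ]
    gates = [1.0 / (1.0 + math.exp(-m)) for m in means]
    return [[[v * g for v, g in zip(px, gates)] for px in row] for row in fused]


# Tiny 2 x 2 example with made-up intensities in [0, 1].
rgb = [[[0.2, 0.4, 0.6], [0.1, 0.3, 0.5]],
       [[0.9, 0.8, 0.7], [0.0, 0.0, 0.0]]]
nir = [[0.5, 0.7],
       [0.2, 0.4]]

fused = concat_channels(rgb, nir)   # each pixel now has 4 values: r, g, b, nir
gated = channel_gate(fused)         # same shape, channels rescaled
```

Concatenation alone leaves it to the downstream network to discover how the modalities relate, whereas an attention-style module weights features explicitly before passing them on; the abstract attributes the reported gains to the latter kind of design.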
Keywords: citrus defect detection; RGB-NIR multimodal fusion; RT-DETR; MCDAF; object detection
JEL-codes: Q1 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18
Date: 2025
Downloads:
https://www.mdpi.com/2077-0472/15/6/630/pdf (application/pdf)
https://www.mdpi.com/2077-0472/15/6/630/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jagris:v:15:y:2025:i:6:p:630-:d:1613713
Agriculture is currently edited by Ms. Leda Xuan
Bibliographic data for series maintained by MDPI Indexing Manager.