Cross-Modal Feature Fusion for Field Weed Mapping Using RGB and Near-Infrared Imagery
Xijian Fan,
Chunlei Ge,
Xubing Yang and
Weice Wang
Additional contact information
Xijian Fan: College of Information Science and Technology & Artificial Intelligence, Nanjing Forestry University, Nanjing 210037, China
Chunlei Ge: College of Information Science and Technology & Artificial Intelligence, Nanjing Forestry University, Nanjing 210037, China
Xubing Yang: College of Information Science and Technology & Artificial Intelligence, Nanjing Forestry University, Nanjing 210037, China
Weice Wang: Fujian Key Laboratory of Spatial Information Perception and Intelligent Processing, Yango University, Fuzhou 350015, China
Agriculture, 2024, vol. 14, issue 12, 1-16
Abstract:
The accurate mapping of weeds in agricultural fields is essential for effective weed control and enhanced crop productivity. Moving beyond the limitations of RGB imagery alone, this study presents a cross-modal feature fusion network (CMFNet) designed for precise weed mapping by integrating RGB and near-infrared (NIR) imagery. CMFNet first applies color space enhancement and adaptive histogram equalization to improve brightness and contrast in both the RGB and NIR images. Building on a Transformer-based segmentation framework, a cross-modal multi-scale feature enhancement module is then introduced, featuring spatial and channel feature interaction to automatically capture complementary information across the two modalities. The enhanced features are further fused and refined through an attention mechanism, which reduces background interference and improves segmentation accuracy. Extensive experiments on two public datasets, the Sugar Beets 2016 and Sunflower datasets, demonstrate that CMFNet significantly outperforms CNN-based segmentation models in the task of weed and crop segmentation, achieving Intersection over Union (IoU) scores of 90.86% and 90.77% and Mean Accuracy (mAcc) scores of 93.8% and 94.35% on the two datasets, respectively. Ablation studies further validate that the proposed cross-modal fusion method provides substantial improvements over basic feature fusion methods, effectively localizing weed and crop regions across diverse field conditions. These findings underscore its potential as a robust solution for precise and adaptive weed mapping in complex agricultural landscapes.
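To make the pipeline described in the abstract more concrete, the sketch below illustrates the two ideas it names: adaptive histogram equalization of the RGB and NIR inputs, and a cross-modal block in which the two feature streams exchange channel-attention cues and share a spatial-attention map before fusion. This is a minimal illustration under assumed choices (PyTorch and OpenCV, 1x1/7x7 convolutions, channel sizes), not the authors' published CMFNet implementation.

```python
# Minimal sketch (assumed PyTorch/OpenCV, not the authors' code) of preprocessing
# and cross-modal fusion with channel and spatial feature interaction.
import cv2
import numpy as np
import torch
import torch.nn as nn


def enhance(rgb: np.ndarray, nir: np.ndarray):
    """Hypothetical preprocessing: CLAHE on the luminance channel of the RGB
    image and on the single-channel NIR image (both uint8 arrays)."""
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    lab = cv2.cvtColor(rgb, cv2.COLOR_RGB2LAB)
    lab[..., 0] = clahe.apply(lab[..., 0])          # equalize lightness only
    return cv2.cvtColor(lab, cv2.COLOR_LAB2RGB), clahe.apply(nir)


class CrossModalFusion(nn.Module):
    """Toy cross-modal block: each modality re-weights the other via channel
    attention, then one spatial-attention map refines the concatenated features."""

    def __init__(self, channels: int):
        super().__init__()
        self.channel_att_rgb = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.channel_att_nir = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.spatial_att = nn.Sequential(
            nn.Conv2d(2 * channels, 1, kernel_size=7, padding=3), nn.Sigmoid())
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, f_rgb: torch.Tensor, f_nir: torch.Tensor) -> torch.Tensor:
        # Channel interaction: each stream is gated by the other's global statistics.
        gate_from_nir = self.channel_att_nir(f_nir)
        gate_from_rgb = self.channel_att_rgb(f_rgb)
        f_rgb, f_nir = f_rgb * gate_from_nir, f_nir * gate_from_rgb
        # Spatial interaction: a shared map suppresses background in both streams.
        cat = torch.cat([f_rgb, f_nir], dim=1)
        return self.fuse(cat * self.spatial_att(cat))


if __name__ == "__main__":
    # Dummy feature maps standing in for one scale of a Transformer encoder.
    f_rgb = torch.randn(1, 64, 32, 32)
    f_nir = torch.randn(1, 64, 32, 32)
    print(CrossModalFusion(64)(f_rgb, f_nir).shape)  # torch.Size([1, 64, 32, 32])
```

In the paper this kind of interaction is applied at multiple encoder scales; the single-scale block above only conveys the general channel/spatial gating pattern.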
Keywords: weed mapping; multi-modal fusion; self-attention; semantic segmentation
JEL-codes: Q1 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18
Date: 2024
Downloads:
https://www.mdpi.com/2077-0472/14/12/2331/pdf (application/pdf)
https://www.mdpi.com/2077-0472/14/12/2331/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jagris:v:14:y:2024:i:12:p:2331-:d:1547679
Agriculture is currently edited by Ms. Leda Xuan
More articles in Agriculture from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager.