A Patch-Level Data Synthesis Pipeline Enhances Species-Level Crop and Weed Segmentation in Natural Agricultural Scenes

Tang Li, James Burridge, Pieter M. Blok and Wei Guo
Additional contact information
Tang Li: Laboratory of Field Phenomics, Institute for Sustainable Agro-Ecosystem Services, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 188-0002, Japan
James Burridge: Laboratory of Field Phenomics, Institute for Sustainable Agro-Ecosystem Services, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 188-0002, Japan
Pieter M. Blok: Laboratory of Field Phenomics, Institute for Sustainable Agro-Ecosystem Services, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 188-0002, Japan
Wei Guo: Laboratory of Field Phenomics, Institute for Sustainable Agro-Ecosystem Services, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 188-0002, Japan

Agriculture, 2025, vol. 15, issue 2, 1-20

Abstract: Species-level crop and weed semantic segmentation in agricultural field images enables plant identification and enhanced precision weed management. However, the scarcity of labeled data poses significant challenges for model development. Here, we report a patch-level synthetic data generation pipeline that improves semantic segmentation performance in natural agricultural scenes by creating realistic training samples, achieved by pasting patches of segmented plants onto soil backgrounds. This pipeline effectively preserves foreground context and ensures diverse and accurate samples, thereby enhancing model generalization. The baseline model achieved higher semantic segmentation performance when trained solely on data synthesized by our proposed method than when trained solely on real data, with the mean intersection over union (mIoU) increasing by approximately 1.1% (from 0.626 to 0.633). Building on this, we created hybrid datasets by combining synthetic and real data and investigated the impact of synthetic data volume. By increasing the number of synthetic images in these hybrid datasets from 1× to 20×, we observed a substantial performance improvement, with mIoU increasing by 15% at 15×. However, the gains diminish beyond this point, with the optimal balance between accuracy and efficiency achieved at 10×. These findings highlight synthetic data as a scalable and effective augmentation strategy for addressing the challenges of limited labeled data in agriculture.
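
The patch-pasting idea and the mIoU metric named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the function names `paste_patch` and `mean_iou`, the array sizes, the class ids, and the pixel values are all hypothetical, and the patch is assumed to fit entirely inside the background image.

```python
import numpy as np

def paste_patch(image, label, patch, patch_mask, class_id, top, left):
    """Paste the masked pixels of one plant patch onto a copy of a soil image.

    Assumes (hypothetically) that the patch lies fully inside the image.
    Returns the synthetic image and its updated per-pixel label map.
    """
    out_img, out_lbl = image.copy(), label.copy()
    h, w = patch_mask.shape
    # Basic slices are views, so writing into them updates out_img / out_lbl.
    region_img = out_img[top:top + h, left:left + w]
    region_lbl = out_lbl[top:top + h, left:left + w]
    region_img[patch_mask] = patch[patch_mask]   # copy plant pixels only
    region_lbl[patch_mask] = class_id            # label those pixels
    return out_img, out_lbl

def mean_iou(pred, target, num_classes):
    """Mean intersection over union across classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both maps, skip it
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

# Toy data: a random "soil" background and a uniform square "plant" patch.
rng = np.random.default_rng(0)
soil = rng.integers(60, 120, size=(64, 64, 3), dtype=np.uint8)
label = np.zeros((64, 64), dtype=np.uint8)        # 0 = soil (assumed id)
patch = np.full((16, 16, 3), 200, dtype=np.uint8)
patch_mask = np.zeros((16, 16), dtype=bool)
patch_mask[4:12, 4:12] = True                     # 8 x 8 plant pixels

synth_img, synth_lbl = paste_patch(
    soil, label, patch, patch_mask, class_id=1, top=10, left=20
)
all_soil_pred = np.zeros_like(synth_lbl)          # a prediction that misses the plant
score = mean_iou(all_soil_pred, synth_lbl, num_classes=2)
```

Because the label map is written in the same step as the pixels, each synthetic image comes with its annotation at no extra labeling cost. Note also that the 1.1% figure in the abstract matches a relative change: (0.633 − 0.626) / 0.626 ≈ 0.011.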

Keywords: data synthesis; data augmentation; generative model; semantic segmentation; precision weed management
JEL-codes: Q1 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18
Date: 2025

Downloads: (external link)
https://www.mdpi.com/2077-0472/15/2/138/pdf (application/pdf)
https://www.mdpi.com/2077-0472/15/2/138/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jagris:v:15:y:2025:i:2:p:138-:d:1563752

Agriculture is currently edited by Ms. Leda Xuan

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jagris:v:15:y:2025:i:2:p:138-:d:1563752