D3-YOLOv10: Improved YOLOv10-Based Lightweight Tomato Detection Algorithm Under Facility Scenario
Ao Li,
Chunrui Wang,
Tongtong Ji,
Qiyang Wang and
Tianxue Zhang
Additional contact information
Ao Li: School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
Chunrui Wang: School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
Tongtong Ji: School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China
Qiyang Wang: School of Agricultural Engineering, Jiangsu University, Zhenjiang 212013, China
Tianxue Zhang: School of Mechanical Engineering and Automation, Beihang University, Beijing 100191, China
Agriculture, 2024, vol. 14, issue 12, 1-18
Abstract:
Accurate and efficient tomato detection is a key technique for intelligent automatic picking in precision agriculture. However, under the facility scenario, existing detection algorithms still face challenging problems, such as weak feature extraction under occlusion and varying fruit sizes, low accuracy in edge localization, and heavy model parameters. To address these problems, this paper proposes D3-YOLOv10, a lightweight YOLOv10-based detection framework. First, a compact dynamic faster network (DyFasterNet) is developed, in which multiple adaptive convolution kernels are aggregated to extract effective local features for fruit size adaptation. Second, a deformable large kernel attention mechanism (D-LKA) is designed for the terminal phase of the neck network; it adaptively adjusts the receptive field to focus on irregular tomato deformations and occlusions. Third, to further improve detection boundary accuracy and convergence, a dynamic FM-WIoU regression loss with a scaling factor is proposed. Finally, a knowledge distillation scheme using semantic frequency prompts is developed to optimize the model for lightweight deployment in practical applications. The framework was evaluated on a self-built tomato dataset, and a two-stage category balancing method based on diffusion models was designed to address class imbalance among samples. The experimental results show that the D3-YOLOv10 model achieved an mAP@0.5 of 91.8%, with reductions of 54.0% in parameters and 64.9% in FLOPs compared to the benchmark model. Its detection speed of 80.1 FPS better meets the demand for real-time tomato detection. This study contributes to smart agriculture research on fruit target detection.
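The accuracy figure in the abstract is a mean average precision at an IoU threshold of 0.5, and the proposed regression loss belongs to the IoU family. As a minimal sketch of the underlying quantity only, the following computes plain intersection over union for two axis-aligned boxes and applies the 0.5 threshold. The function names and the (x1, y1, x2, y2) box format are assumptions made for illustration; this is not the paper's FM-WIoU loss, whose definition is not given in this record.

```python
def iou(box_a, box_b):
    """Plain intersection over union of two boxes given as (x1, y1, x2, y2).

    Illustrative helper only; the box format is an assumption.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True if a predicted box overlaps a ground-truth box at or above the threshold."""
    return iou(pred_box, gt_box) >= threshold
```

For example, `is_match((0, 0, 10, 10), (0, 0, 10, 8))` compares an overlap of 80 against a union of 100, so the prediction counts as a match at the 0.5 threshold.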
Keywords: tomato detection; YOLOv10; occlusion recognition; attention mechanism; knowledge distillation
JEL-codes: Q1 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18
Date: 2024
Downloads: (external link)
https://www.mdpi.com/2077-0472/14/12/2268/pdf (application/pdf)
https://www.mdpi.com/2077-0472/14/12/2268/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jagris:v:14:y:2024:i:12:p:2268-:d:1541217
Agriculture is currently edited by Ms. Leda Xuan
Bibliographic data for series maintained by MDPI Indexing Manager.