EdgeFormer-YOLO: A Lightweight Multi-Attention Framework for Real-Time Red-Fruit Detection in Complex Orchard Environments
Zhiyuan Xu,
Tianjun Luo,
Yinyi Lai,
Yuheng Liu and
Wenbin Kang
Additional contact information
Zhiyuan Xu: Department of Mechanical Engineering, Hohai University, Nanjing 211100, China
Tianjun Luo: Department of Mechanical Engineering, Hohai University, Nanjing 211100, China
Yinyi Lai: Department of Mechanical Engineering, Hohai University, Nanjing 211100, China
Yuheng Liu: Department of Data and Systems Engineering, University of Hong Kong, Hong Kong 999077, China
Wenbin Kang: Department of Mechanical Engineering, City University of Hong Kong, Hong Kong 999077, China
Mathematics, 2025, vol. 13, issue 23, 1-21
Abstract:
Accurate and efficient detection of red fruits in complex orchard environments is crucial for the autonomous operation of agricultural harvesting robots. However, existing methods still face challenges such as high false-negative rates, poor localization accuracy, and difficult edge deployment in real-world scenarios involving occlusion, strong light reflection, and large scale variations. To address these issues, this paper proposes a lightweight multi-attention detection framework, EdgeFormer-YOLO. While retaining the efficiency of the YOLO series' single-stage detection architecture, it introduces a multi-head self-attention (MHSA) mechanism to enhance global modeling of occluded fruits and employs a hierarchical feature fusion strategy to improve multi-scale detection robustness. To further meet the quantized-deployment requirements of edge devices, the model adopts the arsinh activation function, improving numerical stability and convergence speed while maintaining a non-zero gradient. On the red-fruit dataset, EdgeFormer-YOLO achieves 95.7% mAP@0.5, a 2.2-percentage-point improvement over the YOLOv8n baseline, while maintaining 90.0% precision and 92.5% recall. Furthermore, on an edge GPU, the model achieves an inference speed of 148.78 FPS with a size of 6.35 MB, 3.21 M parameters, and a computational overhead of 4.18 GFLOPs, outperforming several existing mainstream lightweight YOLO variants in both speed and mAP@0.5. Experimental results demonstrate that EdgeFormer-YOLO offers comprehensive advantages in real-time performance, robustness, and deployment feasibility in complex orchard environments, providing a viable technical path for agricultural robot vision systems.
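The abstract's MHSA module applies multi-head self-attention so that every spatial position can attend to every other, which is what gives the global context claimed to help with occluded fruits. A minimal NumPy sketch of standard multi-head self-attention is below; it is not the paper's exact module, and the weight shapes, token count (a 7×7 feature map flattened to 49 tokens), and helper names are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mhsa(x, wq, wk, wv, wo, num_heads):
    """Standard multi-head self-attention over a flattened feature map.
    x: (n, d) tokens; wq, wk, wv, wo: (d, d) projection weights."""
    n, d = x.shape
    hd = d // num_heads  # per-head channel dimension
    # Project, then split channels into heads: (n, d) -> (heads, n, hd)
    q = (x @ wq).reshape(n, num_heads, hd).transpose(1, 0, 2)
    k = (x @ wk).reshape(n, num_heads, hd).transpose(1, 0, 2)
    v = (x @ wv).reshape(n, num_heads, hd).transpose(1, 0, 2)
    # Scaled dot-product attention: every token attends to all n positions
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(hd))  # (heads, n, n)
    out = (attn @ v).transpose(1, 0, 2).reshape(n, d)       # concat heads
    return out @ wo

# Illustrative sizes only (not from the paper): 7x7 map, 32 channels, 4 heads
rng = np.random.default_rng(0)
n, d, h = 49, 32, 4
x = rng.standard_normal((n, d))
ws = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4)]
y = mhsa(x, *ws, num_heads=h)
print(y.shape)  # (49, 32)
```

Because the attention matrix is (n, n), each output token is a weighted sum over all positions, which is the global-modeling property the abstract contrasts with the purely local receptive fields of convolutions.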
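The abstract credits the arsinh activation, asinh(x) = ln(x + sqrt(x² + 1)), with numerical stability and a non-zero gradient. A minimal pure-Python sketch of why both properties hold (the function and its derivative are standard identities; the helper names are ours, not the paper's):

```python
import math

def arsinh(x: float) -> float:
    # arsinh(x) = ln(x + sqrt(x^2 + 1)); math.asinh is the stdlib version
    return math.asinh(x)

def arsinh_grad(x: float) -> float:
    # d/dx asinh(x) = 1 / sqrt(x^2 + 1): strictly positive for all x,
    # so gradients never vanish to exactly zero (unlike ReLU for x < 0)
    return 1.0 / math.sqrt(x * x + 1.0)

# Logarithmic growth compresses large pre-activations into a small numeric
# range, which is friendly to low-bit quantization on edge devices.
print(arsinh(1000.0))    # ~7.6, versus 1000.0 for the identity
print(arsinh_grad(0.0))  # 1.0, maximal gradient at the origin
print(arsinh_grad(100.0) > 0.0)  # True: non-zero even far from the origin
```

Near zero, asinh(x) ≈ x, so small activations pass through almost unchanged while large ones are damped logarithmically, which is consistent with the stability and convergence behavior the abstract describes.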
Keywords: EdgeFormer-YOLO; red fruit detection; agricultural picking robot; multi-head self-attention; real-time
JEL-codes: C
Date: 2025
Downloads:
https://www.mdpi.com/2227-7390/13/23/3790/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/23/3790/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:23:p:3790-:d:1803236
Mathematics is currently edited by Ms. Emma He