Dynamic Task Planning for Multi-Arm Apple-Harvesting Robots Using LSTM-PPO Reinforcement Learning Algorithm
Zhengwei Guo,
Heng Fu,
Jiahao Wu,
Wenkai Han,
Wenlei Huang,
Wengang Zheng and
Tao Li
Additional contact information
Zhengwei Guo: School of Mechanical Engineering, Guangxi University, Nanning 530004, China
Heng Fu: Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
Jiahao Wu: Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
Wenkai Han: Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
Wenlei Huang: Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
Wengang Zheng: Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
Tao Li: Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
Agriculture, 2025, vol. 15, issue 6, 1-20
Abstract:
This paper presents a dynamic task planning approach for multi-arm apple-picking robots based on a deep reinforcement learning (DRL) framework incorporating Long Short-Term Memory (LSTM) networks and Proximal Policy Optimization (PPO). In the context of rising labor costs and labor shortages in agriculture, automated apple harvesting is becoming increasingly important. The proposed algorithm addresses key challenges such as efficient task coordination, optimal picking sequences, and real-time decision-making in complex, dynamic orchard environments. The system’s performance is validated through simulations in both static and dynamic environments, with the algorithm demonstrating significant improvements in task completion time and robot efficiency compared to existing strategies. The results show that the LSTM-PPO approach outperforms other methods, offering enhanced adaptability, fault tolerance, and task execution efficiency, particularly under changing and unpredictable conditions. This research lays the foundation for the development of more efficient, adaptable robotic systems in agricultural applications.
Keywords: deep reinforcement learning; multi-arm harvesting robot; PPO; dynamic task planning
JEL-codes: Q1 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18
Date: 2025
Downloads:
https://www.mdpi.com/2077-0472/15/6/588/pdf (application/pdf)
https://www.mdpi.com/2077-0472/15/6/588/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jagris:v:15:y:2025:i:6:p:588-:d:1609353
Agriculture is currently edited by Ms. Leda Xuan
Bibliographic data for series maintained by MDPI Indexing Manager.