Combining Prior Knowledge and Reinforcement Learning for Parallel Telescopic-Legged Bipedal Robot Walking
Jie Xue,
Jiaqi Huangfu,
Yunfeng Hou and
Haiming Mou
Additional contact information
Jie Xue: School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
Jiaqi Huangfu: Institute of Machine Intelligence, University of Shanghai for Science and Technology, Shanghai 200093, China
Yunfeng Hou: Institute of Machine Intelligence, University of Shanghai for Science and Technology, Shanghai 200093, China
Haiming Mou: Institute of Machine Intelligence, University of Shanghai for Science and Technology, Shanghai 200093, China
Mathematics, 2025, vol. 13, issue 6, 1-17
Abstract:
The parallel dual-slider telescopic leg bipedal robot (L04) is characterized by its simple structure and low leg rotational inertia, which contribute to its walking efficiency. However, end-to-end methods often overlook the robot’s physical structure, leading to difficulties in maintaining the parallel alignment of the dual sliders, which in turn compromises walking stability. One potential solution to this issue involves utilizing imitation learning to replicate human motion data. However, the dual telescopic leg structure of the L04 robot makes it difficult to perform motion retargeting of human motion data. To enable L04 walking, we design a method that integrates prior feedforward with reinforcement learning (PFRL), specifically tailored for the parallel dual-slider structure. We utilize prior knowledge as a feedforward action to compensate for system nonlinearities; meanwhile, the feedback action generated by the policy network adaptively regulates dynamic balance and, combined with the feedforward action, jointly controls the robot’s walking. PFRL enforces constraints within the motion space to mitigate the chaotic behavior of the parallel dual sliders. Experimental results show that our method successfully achieves sim2real transfer on a real bipedal robot without the need for domain randomization techniques or intricate reward functions. L04 achieves omnidirectional walking with minimal energy consumption and exhibits robustness against external disturbances.
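The control structure described in the abstract, a prior feedforward action combined with a learned feedback action, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the sinusoidal slider-length prior, the additive combination, the clipping bound used to stand in for the motion-space constraint, the metre units, and all function and parameter names. In the article the feedback term comes from a policy network; here it is a hard-coded list.

```python
import math


def feedforward_action(phase, amplitude=0.05, offset=0.30):
    """Hypothetical prior: periodic length target (m) for the two sliders of one leg.

    Both sliders receive the same target, so the feedforward term alone
    keeps them parallel. The sinusoid is an illustrative assumption.
    """
    target = offset + amplitude * math.sin(2.0 * math.pi * phase)
    return [target, target]


def combine(feedforward, feedback, limit=0.02):
    """Add a bounded feedback correction to each feedforward target.

    Clipping the feedback to [-limit, limit] is an assumed stand-in for
    constraining the motion space; the article's actual constraint may differ.
    """
    return [
        ff + max(-limit, min(limit, fb))
        for ff, fb in zip(feedforward, feedback)
    ]


# Feedback values would come from the policy network; these are placeholders.
ff = feedforward_action(phase=0.25)
action = combine(ff, feedback=[0.10, -0.01])
```

In this sketch the first feedback value exceeds the bound and is clipped, so the final slider targets can differ from the prior by at most `limit`, which keeps the two sliders close to parallel whatever the feedback term outputs.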
Keywords: parallel dual-slider structure; bipedal robot; prior knowledge; reinforcement learning; sim2real
JEL-codes: C
Date: 2025
Downloads:
https://www.mdpi.com/2227-7390/13/6/979/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/6/979/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:6:p:979-:d:1613593
Mathematics is currently edited by Ms. Emma He