Adaptive PPO-RND Optimization Within Prescribed Performance Control for High-Precision Motion Platforms
Yimin Wang,
Jingchong Xu,
Kaina Gao,
Junjie Wang,
Shi Bu,
Bin Liu and
Jianping Xing
Additional contact information
Yimin Wang: School of Integrated Circuits, Shandong University, Jinan 250101, China
Jingchong Xu: 45th Research Institute of China Electronics Technology Group Corporation, Beijing 100176, China
Kaina Gao: 45th Research Institute of China Electronics Technology Group Corporation, Beijing 100176, China
Junjie Wang: 45th Research Institute of China Electronics Technology Group Corporation, Beijing 100176, China
Shi Bu: 45th Research Institute of China Electronics Technology Group Corporation, Beijing 100176, China
Bin Liu: 45th Research Institute of China Electronics Technology Group Corporation, Beijing 100176, China
Jianping Xing: School of Integrated Circuits, Shandong University, Jinan 250101, China
Mathematics, 2025, vol. 13, issue 21, 1-18
Abstract:
The continuous reduction in critical dimensions and escalating throughput demands are driving motion platforms to operate under increasingly complex conditions, including multi-axis coupling, structural nonlinearities, and time-varying operational scenarios. These complexities make the trade-offs among precision, speed, and robustness increasingly difficult to manage. Traditional Proportional–Integral–Derivative (PID) controllers, which rely on empirical tuning, suffer from prolonged trial-and-error cycles and limited transferability, and consequently struggle to maintain optimal performance under such conditions. This paper proposes an adaptive β-Proximal Policy Optimization with Random Network Distillation (β-PPO-RND) method for controller parameter optimization within the Prescribed Performance Control (PPC) framework. The adaptive coefficient β is updated from the temporal change in the reward difference, which is clipped and smoothly mapped to a preset range through a hyperbolic tangent function. This mechanism dynamically balances intrinsic and extrinsic rewards, encouraging broader exploration in the early training stage and emphasizing performance optimization in the later stage. Experimental validation on a Permanent Magnet Linear Synchronous Motor (PMLSM) platform confirms the effectiveness of the proposed approach: it eliminates manual tuning, enables real-time adjustment of controller parameters within the PPC framework, and achieves high-precision trajectory tracking with a significant reduction in steady-state error. The proposed method attains MAE = 0.135 and RMSE = 0.154, reductions of approximately 70% relative to a conventional PID controller.
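As a rough illustration of the β-update mechanism described in the abstract, the Python sketch below clips the temporal reward difference and maps it through a hyperbolic tangent into a preset range [beta_min, beta_max], then mixes the RND intrinsic reward into the PPO reward. All names, the range and clipping values, and the sign convention (a falling extrinsic reward raises β, favouring exploration) are illustrative assumptions, not details taken from the paper.

    import numpy as np

    def adaptive_beta(delta_r, beta_min=0.05, beta_max=1.0, clip_val=1.0):
        # Clip the temporal reward difference, then map it smoothly
        # into [beta_min, beta_max] with tanh (assumed sign convention:
        # a falling reward pushes beta up, encouraging exploration).
        d = float(np.clip(delta_r, -clip_val, clip_val))
        s = 0.5 * (1.0 - np.tanh(d))   # s in (0, 1)
        return beta_min + (beta_max - beta_min) * s

    def total_reward(r_ext, r_int, beta):
        # Reward passed to PPO: extrinsic reward plus beta-weighted RND bonus.
        return r_ext + beta * r_int

    # Hypothetical usage: track the mean extrinsic reward per training
    # update and adapt beta from its change between updates.
    prev_mean, beta = 0.0, 1.0
    for update in range(3):              # stand-in for the PPO training loop
        mean_ext = 0.1 * update          # placeholder reward statistic
        beta = adaptive_beta(mean_ext - prev_mean)
        prev_mean = mean_ext
        print(f"update {update}: beta = {beta:.3f}")

Because tanh saturates, even large reward swings cannot push β outside the preset range, consistent with the abstract's description of a clipped, smooth mapping.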
Keywords: adaptive β-PPO-RND; prescribed performance control; high-precision trajectory tracking; Permanent Magnet Linear Synchronous Motor; reinforcement learning; steady-state error
JEL-codes: C
Date: 2025
Downloads: (external link)
https://www.mdpi.com/2227-7390/13/21/3439/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/21/3439/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:21:p:3439-:d:1781655