The Adaptive Optimal Output Feedback Tracking Control of Unknown Discrete-Time Linear Systems Using a Multistep Q-Learning Approach

Dong, Xunde; Lin, Yuxin; Suo, Xudong; Wang, Xihao; Sun, Weijie

Additional contact information
Xunde Dong: School of Automation Science and Engineering, South China University of Technology, Guangzhou 510641, China
Yuxin Lin: School of Automation Science and Engineering, South China University of Technology, Guangzhou 510641, China
Xudong Suo: Intelligent Mobile Robot Research Institute (Zhongshan), Zhongshan 528478, China
Xihao Wang: School of Automation Science and Engineering, South China University of Technology, Guangzhou 510641, China
Weijie Sun: School of Automation Science and Engineering, Key Laboratory of Autonomous Systems and Networked Control, Ministry of Education, Guangdong Engineering Technology Research Center of Unmanned Aerial Vehicle System, South China University of Technology, Guangzhou 510641, China

Mathematics, 2024, vol. 12, issue 4, 1-20

Abstract: This paper investigates the output feedback (OPFB) tracking control problem for discrete-time linear (DTL) systems with unknown dynamics. Using an augmented system approach, we first transform the tracking control problem into a regulation problem with a discounted performance function, and derive its solution from a Bellman equation based on the Q-function. To overcome the challenge of unmeasurable system state variables, we employ a multistep Q-learning algorithm that combines the advantages of the policy iteration (PI) and value iteration (VI) techniques with state reconstruction methods for output feedback control. As a result, the initial stabilizing control policy required by the PI method is no longer needed, and the convergence speed of the learning algorithm is improved. Finally, we demonstrate the effectiveness of the proposed scheme using a simulation example.
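
For orientation, the augmented-system formulation mentioned in the abstract can be sketched as follows. This is the standard textbook form of discounted linear quadratic tracking, not necessarily the paper's exact notation: the symbols A, B, C, F, Q_y, R, \gamma, T, B_1 and H are assumptions introduced here for illustration, as is the assumption that the reference is generated by a linear system r_{k+1} = F r_k.

```latex
% Assumed plant and reference generator (illustrative notation)
x_{k+1} = A x_k + B u_k, \qquad y_k = C x_k, \qquad r_{k+1} = F r_k

% Augmented state turns tracking into regulation
X_k = \begin{bmatrix} x_k \\ r_k \end{bmatrix}, \qquad
X_{k+1} = \underbrace{\begin{bmatrix} A & 0 \\ 0 & F \end{bmatrix}}_{T} X_k
        + \underbrace{\begin{bmatrix} B \\ 0 \end{bmatrix}}_{B_1} u_k

% Discounted performance function, with discount factor 0 < \gamma \le 1
V(X_k) = \sum_{i=k}^{\infty} \gamma^{\,i-k}
         \left[ (y_i - r_i)^{\top} Q_y (y_i - r_i) + u_i^{\top} R\, u_i \right]

% Q-function Bellman equation under a fixed policy u_{k+1} = \pi(X_{k+1})
Q(X_k, u_k) = (y_k - r_k)^{\top} Q_y (y_k - r_k) + u_k^{\top} R\, u_k
            + \gamma\, Q(X_{k+1}, u_{k+1})

% Quadratic parameterisation and the resulting greedy policy
Q(X_k, u_k) = \begin{bmatrix} X_k \\ u_k \end{bmatrix}^{\top}
              \begin{bmatrix} H_{XX} & H_{Xu} \\ H_{uX} & H_{uu} \end{bmatrix}
              \begin{bmatrix} X_k \\ u_k \end{bmatrix},
\qquad u_k^{*} = -H_{uu}^{-1} H_{uX} X_k
```

In this form, the kernel matrix H can be estimated from data without knowledge of A or B. The output feedback setting described in the abstract additionally replaces the unmeasurable state X_k with a reconstruction from measured inputs, outputs and reference values.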

Keywords: tracking; Q-learning; optimal control; output feedback; UPS
JEL-codes: C
Date: 2024

Downloads:
https://www.mdpi.com/2227-7390/12/4/509/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/4/509/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:4:p:509-:d:1334687

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.