EconPapers    
Uncertainty-Aware Continual Reinforcement Learning via PPO with Graph Representation Learning

Dongjae Kim
Additional contact information
Dongjae Kim: Department of AI-Based Convergence, Dankook University, 152 Jukjeon-ro, Yongin-si 16890, Republic of Korea

Mathematics, 2025, vol. 13, issue 16, 1-12

Abstract: Continual reinforcement learning (CRL) agents face significant challenges when encountering distributional shifts. This paper formalizes these shifts into two key scenarios, namely virtual drift (domain switches), where object semantics change (e.g., walls becoming lava), and concept drift (task switches), where the environment’s structure is reconfigured (e.g., moving from object navigation to a door-key puzzle). This paper demonstrates that while conventional convolutional neural networks (CNNs) struggle to preserve relational knowledge during these transitions, graph convolutional networks (GCNs) can inherently mitigate catastrophic forgetting by encoding object interactions through explicit topological reasoning. A unified framework is proposed that integrates GCN-based state representation learning with a proximal policy optimization (PPO) agent. The GCN’s message-passing mechanism preserves invariant relational structures, which diminishes performance degradation during abrupt domain switches. Experiments conducted in procedurally generated MiniGrid environments show that the method significantly reduces catastrophic forgetting in domain switch scenarios. While showing comparable mean performance in task switch scenarios, our method demonstrates substantially lower performance variance (Levene’s test, p < 1.0 × 10⁻¹⁰), indicating superior learning stability compared to CNN-based methods. By bridging graph representation learning with robust policy optimization in CRL, this research advances the stability of decision-making in dynamic environments and establishes GCNs as a principled alternative to CNNs for applications requiring stable, continual learning.
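The abstract describes a GCN message-passing step that encodes object relations before the embedding reaches the PPO policy. The paper's actual architecture is not detailed on this page; the following is a minimal numpy sketch of a single symmetric-normalized GCN layer (H' = ReLU(D⁻¹ᐟ² (A + I) D⁻¹ᐟ² H W)) over a hypothetical four-object MiniGrid-style scene. All names, shapes, and the toy adjacency matrix are illustrative assumptions, not the published implementation.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN message-passing step: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W).

    A: (n, n) adjacency over scene objects; H: (n, d) object features;
    W: (d, d') learned weights. Self-loops let each node keep its own state.
    """
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # symmetric degree normalization
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt
    return np.maximum(A_norm @ H @ W, 0.0)    # ReLU activation

# Toy scene: 4 objects (agent, wall, goal, key) with a hypothetical
# adjacency encoding which objects are spatially related.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 0],
              [1, 0, 0, 0]], dtype=float)
rng = np.random.default_rng(0)
H = rng.normal(size=(4, 8))   # per-object input features
W = rng.normal(size=(8, 8))   # layer weights (learned in practice)
Z = gcn_layer(A, H, W)        # relational embedding fed to the PPO policy head
print(Z.shape)                # (4, 8)
```

Because the aggregation depends only on the relational structure in A, a virtual drift that relabels an object (wall → lava) changes node features but not the topology the layer averages over, which is the intuition behind the reduced forgetting reported in the abstract.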

Keywords: continual reinforcement learning; domain adaptation; graph convolutional networks; catastrophic forgetting; decision-making under uncertainty
JEL-codes: C
Date: 2025

Downloads: (external link)
https://www.mdpi.com/2227-7390/13/16/2542/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/16/2542/ (text/html)


Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:16:p:2542-:d:1720145


Mathematics is currently edited by Ms. Emma He

More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-08-09
Handle: RePEc:gam:jmathe:v:13:y:2025:i:16:p:2542-:d:1720145