Using Transformers and Reinforcement Learning for the Team Orienteering Problem Under Dynamic Conditions

Guerrero, Antoni; Escoto, Marc; Ammouriova, Majsa; Men, Yangchongyi; Juan, Angel

Additional contact information
Antoni Guerrero: Production Management and Engineering Research Centre, Universitat Politècnica de València, Plz. Ferrandiz-Salvador, 03801 Alcoy, Spain
Marc Escoto: Production Management and Engineering Research Centre, Universitat Politècnica de València, Plz. Ferrandiz-Salvador, 03801 Alcoy, Spain
Majsa Ammouriova: School of Applied Technical Sciences, German Jordanian University, Amman 11180, Jordan
Yangchongyi Men: Production Management and Engineering Research Centre, Universitat Politècnica de València, Plz. Ferrandiz-Salvador, 03801 Alcoy, Spain

Mathematics, 2025, vol. 13, issue 14, 1-19

Abstract: This paper presents a reinforcement learning (RL) approach for solving the team orienteering problem under both deterministic and dynamic travel time conditions. The proposed method builds on the transformer architecture and is trained to construct routes that adapt to real-time variations, such as traffic and environmental changes. A key contribution of this work is the model’s ability to generalize across problem instances with varying numbers of nodes and vehicles, eliminating the need for retraining when problem size changes. To assess performance, a comprehensive set of experiments involving 27,000 synthetic instances is conducted, comparing the RL model with a variable neighborhood search metaheuristic. The results indicate that the RL model achieves competitive solution quality while requiring significantly less computational time. Moreover, the RL approach consistently produces feasible solutions across all dynamic instances, demonstrating strong robustness in meeting time constraints. These findings suggest that learning-based methods can offer efficient, scalable, and adaptable solutions for routing problems in dynamic and uncertain environments.

Keywords: team orienteering problem; reinforcement learning; dynamic conditions; model generalization
JEL-codes: C
Date: 2025

Downloads:
https://www.mdpi.com/2227-7390/13/14/2313/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/14/2313/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:14:p:2313-:d:1705769

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.