Optimizing a Dynamic Vehicle Routing Problem with Deep Reinforcement Learning: Analyzing State-Space Components
Anna Konovalenko and Lars Magnus Hvattum
Additional contact information
Anna Konovalenko: Faculty of Logistics, Molde University College, 6410 Molde, Norway
Lars Magnus Hvattum: Faculty of Logistics, Molde University College, 6410 Molde, Norway
Logistics, 2024, vol. 8, issue 4, 1-18
Abstract:
Background: The dynamic vehicle routing problem (DVRP) is a complex optimization problem that is crucial for applications such as last-mile delivery. Our goal is to develop an approach that makes real-time decisions to maximize overall performance while adapting to the dynamic nature of incoming orders. We formulate the DVRP as a vehicle routing problem in which new customer requests arrive dynamically and each must be immediately accepted or rejected.
Methods: This study leverages reinforcement learning (RL), a machine learning paradigm that learns from feedback-driven decisions, to tackle the DVRP. We present a detailed RL formulation and systematically investigate the impact of various state-space components on algorithm performance. Our approach involves incrementally modifying the state space: analyzing the impact of individual components, applying data transformation methods, and incorporating derived features.
Results: Our findings demonstrate that a carefully designed state space significantly improves RL performance on the DVRP. Notably, incorporating derived features and selectively applying feature transformations enhanced the model's decision-making capabilities. Combining all enhancements yielded a statistically significant improvement over the basic state formulation.
Conclusions: This research provides insights into RL modeling for DVRPs, highlighting the importance of state-space design. The proposed approach offers a flexible framework that is applicable to various DVRP variants, with potential for validation using real-world data.
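The abstract describes an accept/reject RL formulation whose state space combines raw problem components, derived features, and feature transformations. As a rough illustration only, the plain Python/NumPy sketch below shows how such a state vector might be assembled for a single incoming request; all function names, feature choices, and scaling constants are assumptions for illustration, not the paper's actual design.

    import numpy as np

    # Hypothetical sketch of a DVRP state vector for an accept/reject decision:
    # raw components, optional derived features, and simple scaling.
    # Feature choices and constants are illustrative assumptions only.
    def build_state(vehicle_xy, remaining_capacity, current_time, horizon,
                    request_xy, request_demand, depot_xy=(0.0, 0.0),
                    capacity_max=100.0, area_size=10.0,
                    use_derived=True, normalize=True):
        vehicle_xy = np.asarray(vehicle_xy, dtype=float)
        request_xy = np.asarray(request_xy, dtype=float)
        depot_xy = np.asarray(depot_xy, dtype=float)

        # Raw components: vehicle position, load, clock, and the incoming request.
        features = [vehicle_xy, [remaining_capacity], [current_time],
                    request_xy, [request_demand]]

        if use_derived:
            # Derived features: detour-style distances and remaining-time fraction.
            dist_vehicle_request = np.linalg.norm(vehicle_xy - request_xy)
            dist_request_depot = np.linalg.norm(request_xy - depot_xy)
            time_left_frac = max(horizon - current_time, 0.0) / horizon
            features += [[dist_vehicle_request], [dist_request_depot], [time_left_frac]]

        state = np.concatenate([np.atleast_1d(f) for f in features]).astype(np.float32)

        if normalize:
            # Min-max style scaling so coordinates, load, and time share a comparable range.
            scale = np.ones_like(state)
            scale[0:2] = area_size       # vehicle coordinates
            scale[2] = capacity_max      # remaining capacity
            scale[3] = horizon           # current time
            scale[4:6] = area_size       # request coordinates
            scale[6] = capacity_max      # request demand
            if use_derived:
                scale[7:9] = area_size   # derived distances; time fraction already in [0, 1]
            state = state / scale

        return state

    # Example: one request observed mid-route.
    s = build_state(vehicle_xy=(3.0, 4.0), remaining_capacity=40.0,
                    current_time=120.0, horizon=480.0,
                    request_xy=(6.0, 1.0), request_demand=10.0)
    print(s.shape, s)

In this spirit, the incremental state-space modification described in the abstract would correspond to toggling options such as use_derived and normalize above and comparing the resulting policy performance.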
Keywords: dynamic vehicle routing problem; Markov decision process; deep reinforcement learning; last-mile delivery
JEL-codes: L8 L80 L81 L86 L87 L9 L90 L91 L92 L93 L98 L99 M1 M10 M11 M16 M19 R4 R40 R41 R49
Date: 2024
Downloads:
https://www.mdpi.com/2305-6290/8/4/96/pdf (application/pdf)
https://www.mdpi.com/2305-6290/8/4/96/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jlogis:v:8:y:2024:i:4:p:96-:d:1491122