Global synchromodal shipment matching problem with dynamic and stochastic travel times: a reinforcement learning approach

Guo, W.; Atasoy, B.; Negenborn, R. R.

Global synchromodal shipment matching problem with dynamic and stochastic travel times: a reinforcement learning approach

W. Guo (), B. Atasoy and R. R. Negenborn
Additional contact information
W. Guo: University of Quebec at Montreal
B. Atasoy: Delft University of Technology
R. R. Negenborn: Delft University of Technology

Annals of Operations Research, 2025, vol. 350, issue 1, No 4, 63-94

Abstract: Abstract Global synchromodal transportation involves the movement of container shipments between inland terminals located in different continents using ships, barges, trains, trucks, or any combination among them through integrated planning at a network level. One of the challenges faced by global operators is the matching of accepted shipments with services in an integrated global synchromodal transport network with dynamic and stochastic travel times. The travel times of services are unknown and revealed dynamically during the execution of transport plans, but the stochastic information of travel times are assumed available. Matching decisions can be updated before shipments arrive at their destination terminals. The objective of the problem is to maximize the total profits that are expressed in terms of a combination of revenues, travel costs, transfer costs, storage costs, delay costs, and carbon tax over a given planning horizon. We propose a sequential decision process model to describe the problem. In order to address the curse of dimensionality, we develop a reinforcement learning approach to learn the value of matching a shipment with a service through simulations. Specifically, we adopt the Q-learning algorithm to update value function estimations and use the $$\epsilon $$ ϵ -greedy strategy to balance exploitation and exploration. Online decisions are created based on the estimated value functions. The performance of the reinforcement learning approach is evaluated in comparison to a myopic approach that does not consider uncertainties and a stochastic approach that sets chance constraints on feasible transshipment under a rolling horizon framework.

Keywords: Global synchromodal shipment matching; Dynamic and stochastic travel times; Sequential decision process; Reinforcement learning; Q-learning (search for similar items in EconPapers)
Date: 2025
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
http://link.springer.com/10.1007/s10479-021-04489-z Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:spr:annopr:v:350:y:2025:i:1:d:10.1007_s10479-021-04489-z

Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10479

DOI: 10.1007/s10479-021-04489-z

Access Statistics for this article

Annals of Operations Research is currently edited by Endre Boros

More articles in Annals of Operations Research from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().