PTB: A deep reinforcement learning method for flexible logistics service combination problem with spatial-temporal constraint

Tian, Ran; Chang, Longlong; Sun, Zhihui; Zhao, Guanglu; Lu, Xin

Transportation Research Part E: Logistics and Transportation Review, 2025, vol. 195, issue C

Abstract: A Fourth Party Logistics service platform aims to provide customers with one-stop logistics services by combining services from different logistics service providers. A key challenge is to quickly build the logistics service combination that maximizes overall profit while satisfying spatial–temporal constraints. Traditional heuristic algorithms suffer from low computational efficiency and adapt poorly to dynamic environments and demand uncertainty when applied to large-scale logistics service combination problems with spatial–temporal constraints and service uncertainty. We therefore propose PTB, a Proximal Policy Optimization (PPO) algorithm that integrates a Transformer network and a Bayesian network. First, a variable-length sequence module based on a greedy strategy is used in the environment to generate the state information of each order, which addresses the inability of traditional reinforcement learning environments to handle an arbitrary number of service providers. Then, a policy network integrating the Transformer and the Bayesian network is used to improve the accuracy and reliability of scheduling decisions, while the PPO algorithm updates the network and continually refines the scheduling decisions of the agents. Finally, experiments on four sets of logistics order scenarios show that PTB effectively solves the logistics service combination problem with spatial–temporal constraints and outperforms the baseline models on most logistics service combination tasks. In large-scale order scenarios, it is far more computationally efficient than the traditional heuristic algorithm and shows good generalization ability.
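
Illustrative note: the abstract names PPO as the update rule and mentions handling a variable number of service providers, but it gives no implementation details. The following is a minimal sketch of the standard PPO clipped surrogate objective, plus a softmax over a candidate list of any length. It is not the authors' code. The function names, the `clip_eps` default of 0.2, and the use of plain Python lists are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Softmax over a list of any length (hypothetical: one score per candidate provider)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate, to be maximized.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    clip_eps=0.2 is an assumed default, not a value taken from the paper.
    """
    terms = []
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)  # probability ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)
```

Because the softmax accepts a list of any length, the same scoring function can be applied to orders with different numbers of candidate providers, which is the kind of variable-length input the abstract refers to.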

Keywords: Spatial–temporal constraints; Logistics service combination; Deep reinforcement learning; Transformer; Bayesian network
Date: 2025

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S1366554525000195
Full text for ScienceDirect subscribers only

Persistent link: https://EconPapers.repec.org/RePEc:eee:transe:v:195:y:2025:i:c:s1366554525000195

Ordering information: This journal article can be ordered from
http://www.elsevier.com/wps/find/journaldescription.cws_home/600244/bibliographic

DOI: 10.1016/j.tre.2025.103978

Transportation Research Part E: Logistics and Transportation Review is currently edited by W. Talley

Bibliographic data for series maintained by Catherine Liu.