An improved transformer model with multi-head attention and attention to attention for low-carbon multi-depot vehicle routing problem
Yang Zou,
Hecheng Wu,
Yunqiang Yin,
Lalitha Dhamotharan,
Daqiang Chen and
Aviral Tiwari
Additional contact information
Yang Zou: Nanjing University of Aeronautics and Astronautics
Hecheng Wu: Nanjing University of Aeronautics and Astronautics
Yunqiang Yin: University of Electronic Science and Technology of China
Lalitha Dhamotharan: University of Exeter
Daqiang Chen: Zhejiang Gongshang University
Annals of Operations Research, 2024, vol. 339, issue 1, No 20, 517-536
Abstract:
Low-carbon logistics is an emerging, sustainable industry in the era of the low-carbon economy. End-to-end deep reinforcement learning (DRL) with an encoder-decoder framework has proven effective for solving logistics problems. In most cases, however, the encoders and decoders rely on recurrent neural networks (RNNs) and attention mechanisms, which may lead to the long-distance dependence problem and to neglect of the correlation between query vectors. To overcome this problem, we propose an improved transformer model (TAOA) that combines a multi-head attention mechanism (MHA) with an attention to attention mechanism (AOA), and apply it to the low-carbon multi-depot vehicle routing problem (MDVRP). In this model, the MHA and AOA are used in the encoder and decoder to compute the selection probabilities of route nodes. The MHA processes different parts of the input sequence and can be computed in parallel, while the AOA addresses the MHA's failure to capture the correlation between query results and query vectors. An actor-critic framework based on the policy gradient is constructed to train the model parameters. The 2-opt operator is then used to further improve the resulting routes. Finally, extensive numerical studies are carried out to verify the effectiveness and computational efficiency of the proposed TAOA. The results show that TAOA performs better in solving the MDVRP than the traditional transformer model (Kools), a genetic algorithm (GA), and Google OR-Tools (Ortools).
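The route-improvement step mentioned in the abstract can be illustrated with a minimal 2-opt sketch. This is not the authors' implementation: it assumes a single closed route that starts and ends at one depot, and it uses plain Euclidean distance as the cost, whereas the paper's low-carbon objective is not specified in this record. The names `route_length`, `two_opt`, and `coords` are hypothetical.

```python
import math


def route_length(route, coords):
    """Total length of a closed tour that returns to its first node."""
    return sum(
        math.dist(coords[route[i]], coords[route[(i + 1) % len(route)]])
        for i in range(len(route))
    )


def two_opt(route, coords):
    """Reverse segments while that shortens the tour.

    route[0] is treated as the depot and is never moved.
    Assumption: cost is Euclidean distance, not a carbon-emission model.
    """
    best = list(route)
    best_len = route_length(best, coords)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                # Reverse the segment best[i..j] and keep it if the tour gets shorter.
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                cand_len = route_length(candidate, coords)
                if cand_len < best_len - 1e-12:
                    best, best_len = candidate, cand_len
                    improved = True
    return best


# Toy instance: four nodes on a unit square, node 0 is the depot.
coords = {0: (0.0, 0.0), 1: (1.0, 1.0), 2: (1.0, 0.0), 3: (0.0, 1.0)}
initial = [0, 1, 2, 3]  # crosses itself along both diagonals
improved_route = two_opt(initial, coords)
```

On this toy instance the initial tour crosses itself, and a single segment reversal removes the crossing. In a multi-depot setting, a sketch like this would presumably be applied to each vehicle's route separately after the learned model has produced it.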
Keywords: End-to-end deep reinforcement learning; Transformer model; Multi-head attention mechanism; Low-carbon multi-depot vehicle routing problem; GA algorithm
Date: 2024
Downloads: (external link)
http://link.springer.com/10.1007/s10479-022-04788-z Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:annopr:v:339:y:2024:i:1:d:10.1007_s10479-022-04788-z
Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10479
DOI: 10.1007/s10479-022-04788-z
Annals of Operations Research is currently edited by Endre Boros
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.