A Q-learning-based algorithm for the block relocation problem

Liu, Liqun; Feng, Yuanjun; Zeng, Qingcheng; Chen, Zhijun; Li, Yaqiu

Liqun Liu, Yuanjun Feng, Qingcheng Zeng, Zhijun Chen and Yaqiu Li
Additional contact information
Liqun Liu: University of Leeds
Yuanjun Feng: University of Liverpool
Qingcheng Zeng: Dalian Maritime University
Zhijun Chen: Wuhan University of Technology
Yaqiu Li: Hiroshima University

Journal of Heuristics, 2025, vol. 31, issue 1, No 14, 41 pages

Abstract: The Block Relocation Problem (BRP), also known as the Container Relocation Problem, is a challenging combinatorial optimization problem in block stacking systems and has many applications in real-world settings such as logistics and manufacturing. The BRP asks for the best way to retrieve blocks from a storage area, with the objective of minimizing the number of relocations. The BRP has long been studied and has been solved primarily with conventional optimization techniques, including mathematical programming models and both exact and heuristic algorithms. For the first time, this paper tackles the problem using a reinforcement learning method. We focus on one of the major variants of the BRP, the restricted BRP with duplicate priorities (RBRP-dup). We first model the RBRP-dup as a Markov decision process and then propose a Q-learning-based algorithm to solve the problem. The Q-learning-based algorithm contains two phases. In the learning phase, two innovative mechanisms, an optimal rule-integrated behaviour policy and a heuristic-based dynamic initialization method, are incorporated into the Q-learning model to reduce the size of the state-action space and accelerate convergence. In the optimization phase, the insights obtained in the learning phase are combined with a heuristic algorithm to improve decision-making. The performance of the proposed method is evaluated against the state-of-the-art exact algorithm and a commonly used heuristic algorithm on benchmark instances from the literature. The computational experiments demonstrate the superiority of the proposed method in solution quality on large and complex instances.
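To make the abstract's formulation concrete, the following is a minimal sketch of plain tabular Q-learning on a toy restricted BRP with duplicate priorities. It is not the authors' algorithm: it omits the rule-integrated behaviour policy, the heuristic-based dynamic initialization, and the optimization phase, and uses ordinary epsilon-greedy exploration instead. The state encoding (a tuple of stacks, bottom to top), the reward of -1 per relocation, the height limit `MAX_HEIGHT`, the example bay, and all function names are assumptions made for illustration.

```python
import random
from collections import defaultdict

MAX_HEIGHT = 3  # assumed stack height limit (hypothetical value)

def retrieve_free(stacks):
    """Retrieve, at no cost, every top block whose priority equals the current minimum."""
    stacks = [list(s) for s in stacks]
    while any(stacks):
        target = min(b for s in stacks for b in s)
        tops = [s for s in stacks if s and s[-1] == target]
        if not tops:
            break
        tops[0].pop()
    return tuple(tuple(s) for s in stacks)

def actions(state):
    """Restricted moves: relocate only the top of a stack that holds a minimum-priority block."""
    target = min(b for s in state for b in s)
    return [(i, j)
            for i, s in enumerate(state) if target in s
            for j, d in enumerate(state) if j != i and len(d) < MAX_HEIGHT]

def step(state, action):
    """Apply one relocation (reward -1), then perform all retrievals that became possible."""
    src, dst = action
    stacks = [list(s) for s in state]
    stacks[dst].append(stacks[src].pop())
    return retrieve_free(stacks), -1.0

def train(start, episodes=5000, alpha=0.5, gamma=1.0, eps=0.2, seed=0, max_steps=100):
    """Tabular Q-learning with epsilon-greedy exploration; Q-values start at 0."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        state = retrieve_free(start)
        for _ in range(max_steps):
            if not any(state):  # bay is empty: episode finished
                break
            acts = actions(state)
            if rng.random() < eps:
                a = rng.choice(acts)
            else:
                a = max(acts, key=lambda x: q[(state, x)])
            nxt, r = step(state, a)
            future = max(q[(nxt, b)] for b in actions(nxt)) if any(nxt) else 0.0
            q[(state, a)] += alpha * (r + gamma * future - q[(state, a)])
            state = nxt
    return q

def greedy_relocations(start, q, max_steps=100):
    """Count relocations when following the learned greedy policy."""
    state, moves = retrieve_free(start), 0
    while any(state) and moves < max_steps:
        a = max(actions(state), key=lambda x: q[(state, x)])
        state, _ = step(state, a)
        moves += 1
    return moves

# Toy bay: three stacks listed bottom to top; smaller numbers are retrieved first.
bay = ((1, 3, 2), (2, 1, 3), ())
q_table = train(bay)
print(greedy_relocations(bay, q_table))
```

In this bay three blocks sit above a block that must leave earlier, so no solution can use fewer than three relocations; the count printed by the sketch can be compared against that bound. The table-based approach only scales to very small bays, which is consistent with the abstract's emphasis on mechanisms that shrink the state-action space.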

Keywords: Block relocation problem; Container relocation problem; Q-learning; Reinforcement learning; Optimization; Benchmarking
Date: 2025

Downloads: (external link)
http://link.springer.com/10.1007/s10732-024-09545-y Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:joheur:v:31:y:2025:i:1:d:10.1007_s10732-024-09545-y

Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10732

DOI: 10.1007/s10732-024-09545-y

Journal of Heuristics is currently edited by Manuel Laguna

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.