An Improved Q-Learning Algorithm for Optimizing Sustainable Remanufacturing Systems
Shujin Qin,
Xiaofei Zhang,
Jiacun Wang,
Xiwang Guo,
Liang Qi,
Jinrui Cao and
Yizhi Liu
Additional contact information
Shujin Qin: College of Economics and Management, Shangqiu Normal University, Shangqiu 476000, China
Xiaofei Zhang: College of Information and Control Engineering, Liaoning Petrochemical University, Fushun 113001, China
Jiacun Wang: Department of Computer Science and Software Engineering, Monmouth University, West Long Branch, NJ 07764, USA
Xiwang Guo: College of Information and Control Engineering, Liaoning Petrochemical University, Fushun 113001, China
Liang Qi: Department of Computer Science and Technology, Shandong University of Science and Technology, Qingdao 266590, China
Jinrui Cao: Computer Science Department, New Jersey City University, Jersey City, NJ 07102, USA
Yizhi Liu: College of Information and Control Engineering, Liaoning Petrochemical University, Fushun 113001, China
Sustainability, 2024, vol. 16, issue 10, 1-18
Abstract:
In modern society, pollution has increased noticeably because of how products are handled after use. This necessitates the adoption of recycling and remanufacturing processes that advance sustainable resource management. This paper addresses the disassembly line balancing problem. Existing disassembly methods rely largely on manual labor, raising concerns about safety and sustainability. This paper proposes a human–machine collaborative disassembly approach to enhance safety and optimize resource utilization, in line with sustainable development goals. A mixed-integer programming model is established that considers different disassembly techniques for hazardous and delicate parts, with the objective of minimizing the total disassembly time. The CPLEX solver is employed to verify the accuracy of the model. An improved Q-learning algorithm from reinforcement learning is developed to tackle the bilateral disassembly line balancing problem in human–machine collaboration. This approach outperforms CPLEX in both solution efficiency and solution quality, especially on large-scale problems. A comparative analysis with the original Q-learning algorithm and the SARSA algorithm validates the superiority of the proposed algorithm in terms of convergence speed and solution quality.
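The core technique named in the abstract, tabular Q-learning, can be illustrated with a deliberately simplified sketch: assigning each disassembly task to either a human or a robot so as to minimize total processing time. The task times, learning rate, exploration rate, and episode count below are invented for illustration; this is a minimal textbook Q-learning loop, not the paper's improved algorithm, and it omits the bilateral line and precedence constraints the paper models.

```python
import random

# Hypothetical per-task processing times (seconds): (human_time, robot_time).
TASK_TIMES = [(5.0, 8.0), (9.0, 4.0), (6.0, 6.5), (3.0, 7.0), (10.0, 2.0)]
N = len(TASK_TIMES)
ACTIONS = (0, 1)  # 0 = assign to human, 1 = assign to robot

def train(episodes=2000, alpha=0.2, gamma=1.0, eps=0.1, seed=0):
    """Tabular Q-learning: state s is the index of the next task to assign."""
    rng = random.Random(seed)
    # Q[s][a]: estimated (negative) total remaining time if task s goes to agent a.
    Q = [[0.0, 0.0] for _ in range(N)]
    for _ in range(episodes):
        for s in range(N):
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q[s][x])
            r = -TASK_TIMES[s][a]  # reward = negative processing time
            # Off-policy bootstrap: max over the next state's actions (0 at terminal).
            nxt = 0.0 if s == N - 1 else max(Q[s + 1])
            Q[s][a] += alpha * (r + gamma * nxt - Q[s][a])
    return Q

def greedy_total_time(Q):
    """Total time of the learned greedy assignment policy."""
    return sum(TASK_TIMES[s][max(ACTIONS, key=lambda a: Q[s][a])] for s in range(N))
```

In this toy setting the optimal policy simply gives each task to the faster agent (total 20.0 s), and the greedy policy read off the converged Q-table recovers exactly that; the paper's actual problem is harder because precedence relations and two-sided workstation assignment couple the decisions.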
Keywords: reinforcement learning; improved Q-learning algorithm; two-sided disassembly line balancing; human–robot collaboration
JEL-codes: O13; Q; Q0; Q2; Q3; Q5; Q56
Date: 2024
Downloads:
https://www.mdpi.com/2071-1050/16/10/4180/pdf (application/pdf)
https://www.mdpi.com/2071-1050/16/10/4180/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:16:y:2024:i:10:p:4180-:d:1395855
Sustainability is currently edited by Ms. Alexandra Wu