Advancing Sustainable Manufacturing: Reinforcement Learning with Adaptive Reward Machine Using an Ontology-Based Approach

Fatemeh Golpayegani, Saeedeh Ghanadbashi and Akram Zarchini
Additional contact information
Fatemeh Golpayegani: School of Computer Science, University College Dublin, D04 V1W8 Dublin, Ireland
Saeedeh Ghanadbashi: School of Computer Science, University College Dublin, D04 V1W8 Dublin, Ireland
Akram Zarchini: Department of Computer Engineering, Sharif University of Technology, Tehran 11155-9517, Iran

Sustainability, 2024, vol. 16, issue 14, 1-24

Abstract: Sustainable manufacturing practices are crucial in job shop scheduling (JSS) to enhance the resilience of production systems against resource shortages and regulatory changes, contributing to long-term operational stability and environmental stewardship. JSS involves rapidly changing conditions and unforeseen disruptions that can lead to inefficient resource use and increased waste; addressing these uncertainties therefore promotes more sustainable operations. Reinforcement learning-based job shop scheduler agents learn through trial and error, receiving feedback on their scheduling decisions from the environment in the form of a reward function (e.g., maximizing machine working time); their primary challenge is handling dynamic reward functions and navigating uncertain environments. Recently, Reward Machines (RMs) have been introduced to specify and expose the structure of the reward function through a finite-state machine. With RMs, it is possible to define multiple reward functions for different states and to switch between them dynamically. RMs can also be extended to incorporate domain-specific prior knowledge, such as task-specific objectives. However, designing RMs becomes cumbersome as task complexity increases and agents must react to unforeseen events in dynamic and partially observable environments. Our proposed Ontology-based Adaptive Reward Machine (ONTOADAPT-REWARD) model addresses these challenges by dynamically creating and modifying RMs based on domain ontologies. This adaptability allows the model to outperform a state-of-the-art baseline algorithm in resource utilization, processed orders, average waiting time, and failed orders, highlighting its potential for sustainable manufacturing by optimizing resource usage and reducing idle times.
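
For readers unfamiliar with Reward Machines, the following minimal Python sketch illustrates the general idea described in the abstract: an RM is a finite-state machine that maps (RM state, event) pairs to a successor state and a reward function, so the reward being optimized can switch as events occur. The job-shop event names, RM states, and reward values below are hypothetical illustrations, not the ONTOADAPT-REWARD implementation or its ontology-driven adaptation.

# Minimal sketch of a Reward Machine (RM) as a finite-state machine.
# All state names, events, and reward functions are illustrative only.

class RewardMachine:
    def __init__(self, initial_state, transitions):
        # transitions: {(rm_state, event): (next_rm_state, reward_fn)}
        self.state = initial_state
        self.transitions = transitions

    def step(self, event, env_obs):
        """Advance the RM on an observed event and return the reward."""
        key = (self.state, event)
        if key not in self.transitions:
            return 0.0  # unlabeled event: stay in place, neutral reward
        next_state, reward_fn = self.transitions[key]
        self.state = next_state
        return reward_fn(env_obs)

# Hypothetical job-shop example: one RM state rewards machine utilization;
# after a resource shortage the RM switches to a state that rewards
# reducing order waiting time, and switches back when resources return.
transitions = {
    ("maximize_utilization", "order_assigned"):
        ("maximize_utilization", lambda obs: obs["machine_busy_time"]),
    ("maximize_utilization", "resource_shortage"):
        ("reduce_waiting", lambda obs: 0.0),
    ("reduce_waiting", "order_completed"):
        ("reduce_waiting", lambda obs: -obs["avg_waiting_time"]),
    ("reduce_waiting", "resources_restored"):
        ("maximize_utilization", lambda obs: 0.0),
}

rm = RewardMachine("maximize_utilization", transitions)
print(rm.step("order_assigned", {"machine_busy_time": 3.5}))   # 3.5
print(rm.step("resource_shortage", {}))                        # 0.0, objective switches
print(rm.step("order_completed", {"avg_waiting_time": 2.0}))   # -2.0

In the paper's approach, the transition table would not be fixed by hand as above but created and modified at run time from a domain ontology; the sketch only shows the underlying finite-state-machine mechanics.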

Keywords: adaptive reward function; dynamic environments; multi-objective; ontology; partially observable environments; reinforcement learning; reward machines; sustainable manufacturing; job shop scheduling
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2024

Downloads: (external link)
https://www.mdpi.com/2071-1050/16/14/5873/pdf (application/pdf)
https://www.mdpi.com/2071-1050/16/14/5873/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:16:y:2024:i:14:p:5873-:d:1432157

Sustainability is currently edited by Ms. Alexandra Wu

More articles in Sustainability from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jsusta:v:16:y:2024:i:14:p:5873-:d:1432157