A Navigation Algorithm Based on the Reinforcement Learning Reward System and Optimised with Genetic Algorithm

Cabezas-Olivenza, Mireya; Zulueta, Ekaitz; Azurmendi-Marquinez, Iker; Fernandez-Gamiz, Unai; Rico-Melgosa, Danel

Mireya Cabezas-Olivenza, Ekaitz Zulueta, Iker Azurmendi-Marquinez, Unai Fernandez-Gamiz and Danel Rico-Melgosa
Additional contact information
Mireya Cabezas-Olivenza: Faculty of Engineering, Mondragon Unibertsitatea, 20500 Arrasate-Mondragon, Spain
Ekaitz Zulueta: System Engineering and Automation Control Department, University of the Basque Country (UPV/EHU), Nieves Cano, 12, 01006 Vitoria-Gasteiz, Spain
Iker Azurmendi-Marquinez: CS Centro Stirling S. Coop., Avda. Álava 3, 20550 Aretxabaleta, Spain
Unai Fernandez-Gamiz: Department Energy Engineering, University of the Basque Country (UPV/EHU), Nieves Cano, 12, 01006 Vitoria-Gasteiz, Spain
Danel Rico-Melgosa: System Engineering and Automation Control Department, University of the Basque Country (UPV/EHU), Nieves Cano, 12, 01006 Vitoria-Gasteiz, Spain

Mathematics, 2024, vol. 12, issue 24, 1-26

Abstract: Regarding autonomous vehicle navigation, reinforcement learning is a technique that has demonstrated significant results. Nevertheless, it is a technique with a high number of parameters that need to be optimised without prior information, and correctly performing this is a complicated task. In this research study, a system based on the principles of reinforcement learning, specifically on the concept of rewards, is presented. A mathematical expression was proposed to control the vehicle’s direction based on its position, the obstacles in the environment and the destination. In this equation proposal, there was only one unknown parameter that regulated the degree of the action to be taken, and this was optimised through the genetic algorithm. In this way, a less computationally expensive navigation algorithm was presented, as it avoided the use of neural networks. The controller’s time to obtain the navigation instructions was around 6.201·10⁻⁴ s. This algorithm is an efficient and accurate system which manages not to collide with obstacles and to reach the destination from any position. Moreover, in most cases, it has been found that the proposed navigations are also optimal.

Keywords: navigation; reinforcement learning; genetic algorithm; optimisation; autonomous vehicle; q-learning; AGV
JEL-codes: C
Date: 2024

Downloads:
https://www.mdpi.com/2227-7390/12/24/4030/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/24/4030/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:24:p:4030-:d:1549928

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.