Routing in Reinforcement Learning Markov Chains
Maximilian Moll and
Dominic Weller
Additional contact information
Maximilian Moll: Universität der Bundeswehr München
Dominic Weller: Universität der Bundeswehr München
A chapter in Operations Research Proceedings 2021, 2022, pp 409-414 from Springer
Abstract:
With computers beating human players in challenging games like Chess, Go, and StarCraft, Reinforcement Learning has gained much attention recently. This growing, data-driven approach to control theory has produced various promising algorithms that combine simulation for data generation, optimization, and often bootstrapping. However, underneath each of them lies the assumption that the problem can be cast as a Markov Decision Process, which extends the usual Markov Chain by assigning controls and resulting rewards to each potential transition. This assumption implies that the underlying Markov Chain and the reward, the data equivalent of an inverse cost function, form a weighted network. Consequently, the optimization problem in Reinforcement Learning can be translated into a routing problem in such possibly immense and largely unknown networks. This paper analyzes this novel interpretation and provides some first approaches to its solution.
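The routing view described in the abstract can be illustrated with a toy example (the states, actions, and rewards below are hypothetical, not taken from the chapter): treating each transition's reward as an edge weight, an optimal policy corresponds to a maximum-reward route through the network, which value iteration recovers.

```python
# Toy MDP as a weighted network (hypothetical example, for illustration only):
# state -> action -> (next_state, reward). Rewards play the role of inverse
# edge costs, so policy optimization becomes a routing problem.
graph = {
    "s": {"a": ("x", 1.0), "b": ("y", 0.0)},
    "x": {"a": ("g", 1.0)},
    "y": {"a": ("g", 5.0)},
    "g": {},  # terminal state: no outgoing transitions
}

def value_iteration(graph, gamma=1.0, sweeps=100):
    """Compute state values by repeatedly backing up the best transition."""
    V = {s: 0.0 for s in graph}
    for _ in range(sweeps):
        for s, actions in graph.items():
            if actions:
                V[s] = max(r + gamma * V[ns] for ns, r in actions.values())
    return V

def greedy_path(graph, V, start, gamma=1.0):
    """Follow the greedy policy w.r.t. V, i.e. route along best edges."""
    path, s = [start], start
    while graph[s]:
        _, (s, _) = max(graph[s].items(),
                        key=lambda kv: kv[1][1] + gamma * V[kv[1][0]])
        path.append(s)
    return path

V = value_iteration(graph)
print(V["s"])                    # value of the best route from "s"
print(greedy_path(graph, V, "s"))
```

Here the detour through "y" dominates despite its zero immediate reward, which is exactly the kind of trade-off a routing algorithm on the reward-weighted network must resolve.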
Keywords: Reinforcement Learning; Routing
Date: 2022
Persistent link: https://EconPapers.repec.org/RePEc:spr:lnopch:978-3-031-08623-6_60
Ordering information: This item can be ordered from
http://www.springer.com/9783031086236
DOI: 10.1007/978-3-031-08623-6_60
More chapters in Lecture Notes in Operations Research from Springer
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.