Control of a Markov Chain with Unknown Dynamics and Cost Structure

Lakshmivarahan, S.

S. Lakshmivarahan: University of Oklahoma, School of Electrical Engineering and Computer Science

Chapter 8 in Learning Algorithms Theory and Applications, 1981, pp 228-256 from Springer

Abstract: This chapter applies the “absolutely expedient” learning algorithms (developed in Chapter 3) to the problem of controlling a finite-state Markov chain whose transition probabilities, as a function of a finite number of control actions, are unknown. At each instant of time, a reward is incurred that depends on the state of the Markov chain and the control action chosen. This reward is assumed to be a two-valued (binary) random variable whose distribution, as a function of the state and the control action, is unknown, but the sequence of states actually visited by the Markov chain is available. In other words, we consider a Markov chain whose dynamics and reward structure are unknown but whose state is exactly observable.
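
Illustrative sketch (not taken from the chapter): the setting in the abstract can be simulated by attaching one learning automaton to each state and updating its action probabilities from the observed binary reward. The code below uses the linear reward-inaction rule, a standard member of the absolutely expedient class, and updates on the immediate reward only, which is a simplification and may differ from the scheme the chapter analyzes. The names `lri_update`, `simulate`, `transition`, `reward_prob` and the step size `theta` are hypothetical and chosen for illustration.

```python
import random


def lri_update(p, action, reward, theta=0.05):
    """Linear reward-inaction update of one action-probability vector.

    On reward 1 the chosen action's probability moves toward 1 and the
    others shrink proportionally; on reward 0 nothing changes.
    """
    if reward == 1:
        return [
            pj + theta * (1.0 - pj) if j == action else pj * (1.0 - theta)
            for j, pj in enumerate(p)
        ]
    return list(p)


def simulate(transition, reward_prob, steps=5000, theta=0.05, seed=0):
    """Run one automaton per state on a controlled Markov chain.

    transition[a][i][j]: probability of moving from state i to j under action a.
    reward_prob[i][a]:   probability of reward 1 in state i under action a.
    Both are used only to generate samples; the learner never reads them.
    """
    rng = random.Random(seed)
    n_states = len(reward_prob)
    n_actions = len(reward_prob[0])
    # Start every state's automaton from the uniform distribution.
    probs = [[1.0 / n_actions] * n_actions for _ in range(n_states)]
    state = 0
    for _ in range(steps):
        # The state is observed exactly, so its own automaton picks the action.
        action = rng.choices(range(n_actions), weights=probs[state])[0]
        reward = 1 if rng.random() < reward_prob[state][action] else 0
        probs[state] = lri_update(probs[state], action, reward, theta)
        # The chain then moves according to the (unknown to the learner) dynamics.
        state = rng.choices(range(n_states), weights=transition[action][state])[0]
    return probs
```

Calling `simulate` with a small two-state, two-action example returns one probability vector per state; how quickly, and toward which action, each vector concentrates depends on `theta` and on the reward probabilities supplied.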

Keywords: Markov Chain; Control Action; Markov Chain Model; Reward Structure; Finite Markov Chain
Date: 1981

Persistent link: https://EconPapers.repec.org/RePEc:spr:sprchp:978-1-4612-5975-6_8

Ordering information: This item can be ordered from
http://www.springer.com/9781461259756

DOI: 10.1007/978-1-4612-5975-6_8
