Wireless Channel Selection with Restless Bandits

Kuhn, Julia; Nazarathy, Yoni

Julia Kuhn and Yoni Nazarathy
Additional contact information
Julia Kuhn: The University of Queensland
Yoni Nazarathy: The University of Queensland

Chapter 18 in Markov Decision Processes in Practice, 2017, pp. 463-485, Springer

Abstract: Wireless devices are often able to communicate on several alternative channels; for example, cellular phones may use several frequency bands and are equipped with base-station communication capability together with WiFi and Bluetooth communication. Automatic decision support systems in such devices need to decide which channels to use at any given time so as to maximize the long-run average throughput. A good decision policy needs to take into account that, due to cost, energy, technical, or performance constraints, the state of a channel is only sensed when it is selected for transmission. Therefore, the greedy strategy of always exploiting those channels assumed to yield the currently highest transmission rate is not necessarily optimal with respect to long-run average throughput. Rather, it may be favourable to give some priority to the exploration of channels of uncertain quality. In this chapter we model such on-line control problems as a special type of Restless Multi-Armed Bandit (RMAB) problem in a partially observable Markov decision process framework. We refer to such models as Reward-Observing Restless Multi-Armed Bandit (RORMAB) problems. These types of optimal control problems were previously considered in the literature in the context of (i) Gilbert-Elliott (GE) channels, where each channel is modelled as a two-state Markov chain, and (ii) Gaussian autoregressive (AR) channels of order 1. A virtue of this chapter is that we unify the presentation of both types of models under the umbrella of our newly defined RORMAB. Further, since RORMAB is a special type of RMAB, we also present an account of RMAB problems together with a pedagogical development of the Whittle index, which provides an approximately optimal control method. Numerical examples are provided.
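
The greedy strategy mentioned in the abstract can be illustrated for the Gilbert-Elliott case with a minimal sketch. It is not taken from the chapter: it assumes identical two-state channels, a single channel selected per time slot, and hypothetical names throughout (`p01` for the bad-to-good transition probability, `p11` for the good-to-good probability, and the functions `propagate_belief`, `greedy_choice` and `step_beliefs`). Each belief is the probability that a channel is currently in the good state.

```python
from typing import List


def propagate_belief(belief: float, p01: float, p11: float) -> float:
    """Predict P(channel is good) one slot ahead when the channel is NOT sensed.

    p01 and p11 are assumed transition probabilities (bad->good, good->good).
    """
    return belief * p11 + (1.0 - belief) * p01


def greedy_choice(beliefs: List[float]) -> int:
    """Myopic rule: select the channel with the highest current belief."""
    return max(range(len(beliefs)), key=lambda i: beliefs[i])


def step_beliefs(
    beliefs: List[float],
    chosen: int,
    observed_good: bool,
    p01: float,
    p11: float,
) -> List[float]:
    """Advance all beliefs by one time slot.

    Only the chosen channel is sensed, so its next belief is determined by the
    observed state; every other channel is propagated without an observation.
    """
    updated = []
    for i, belief in enumerate(beliefs):
        if i == chosen:
            updated.append(p11 if observed_good else p01)
        else:
            updated.append(propagate_belief(belief, p01, p11))
    return updated
```

Because unselected channels drift without being observed, a rule that always calls `greedy_choice` never deliberately samples a channel of uncertain quality; this is the limitation that motivates index-based policies such as the Whittle index discussed in the abstract.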

Date: 2017

Persistent link: https://EconPapers.repec.org/RePEc:spr:isochp:978-3-319-47766-4_18

Ordering information: This item can be ordered from
http://www.springer.com/9783319477664

DOI: 10.1007/978-3-319-47766-4_18

Series: International Series in Operations Research & Management Science, Springer
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.