Risk-Aware Multi-Armed Bandit Problem with Application to Portfolio Selection

Huo, Xiaoguang; Fu, Feng

Risk-Aware Multi-Armed Bandit Problem with Application to Portfolio Selection

Xiaoguang Huo and Feng Fu

Abstract: Sequential portfolio selection has attracted increasing interests in the machine learning and quantitative finance communities in recent years. As a mathematical framework for reinforcement learning policies, the stochastic multi-armed bandit problem addresses the primary difficulty in sequential decision making under uncertainty, namely the exploration versus exploitation dilemma, and therefore provides a natural connection to portfolio selection. In this paper, we incorporate risk-awareness into the classic multi-armed bandit setting and introduce an algorithm to construct portfolio. Through filtering assets based on the topological structure of financial market and combining the optimal multi-armed bandit policy with the minimization of a coherent risk measure, we achieve a balance between risk and return.

Date: 2017-09
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (7)

Downloads: (external link)
http://arxiv.org/pdf/1709.04415 Latest version (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:1709.04415

Access Statistics for this paper

More papers in Papers from arXiv.org
Bibliographic data for series maintained by arXiv administrators ().