Off-Policy Optimization of Portfolio Allocation Policies under Constraints

Bandi, Nymisha; Tulabandhula, Theja

Off-Policy Optimization of Portfolio Allocation Policies under Constraints

Nymisha Bandi and Theja Tulabandhula

Abstract: The dynamic portfolio optimization problem in finance frequently requires learning policies that adhere to various constraints, driven by investor preferences and risk. We motivate this problem of finding an allocation policy within a sequential decision making framework and study the effects of: (a) using data collected under previously employed policies, which may be sub-optimal and constraint-violating, and (b) imposing desired constraints while computing near-optimal policies with this data. Our framework relies on solving a minimax objective, where one player evaluates policies via off-policy estimators, and the opponent uses an online learning strategy to control constraint violations. We extensively investigate various choices for off-policy estimation and their corresponding optimization sub-routines, and quantify their impact on computing constraint-aware allocation policies. Our study shows promising results for constructing such policies when back-tested on historical equities data, under various regimes of operation, dimensionality and constraints.

Date: 2020-12
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
http://arxiv.org/pdf/2012.11715 Latest version (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2012.11715

Access Statistics for this paper

More papers in Papers from arXiv.org
Bibliographic data for series maintained by arXiv administrators ().