Exploiting Distributional Temporal Difference Learning to Deal with Tail Risk

Bossaerts, Peter; Huang, Shijie; Yadav, Nitin

Peter Bossaerts, Shijie Huang and Nitin Yadav
Additional contact information
Peter Bossaerts: Brain Mind and Markets Lab, The University of Melbourne, 198 Berkeley Street, Parkville, VIC 3010, Australia
Shijie Huang: Brain Mind and Markets Lab, The University of Melbourne, 198 Berkeley Street, Parkville, VIC 3010, Australia
Nitin Yadav: Brain Mind and Markets Lab, The University of Melbourne, 198 Berkeley Street, Parkville, VIC 3010, Australia

Risks, 2020, vol. 8, issue 4, 1-20

Abstract: In traditional Reinforcement Learning (RL), agents learn to optimize actions in a dynamic context based on recursive estimation of expected values. We show that this form of machine learning fails when rewards (returns) are affected by tail risk, i.e., leptokurtosis. Here, we adapt a recent extension of RL, called distributional RL (disRL), and introduce estimation efficiency, while properly adjusting for the differential impact of outliers on the two terms of the RL prediction error in the updating equations. We show that the resulting “efficient distributional RL” (e-disRL) learns much faster and is robust once it settles on a policy. Our paper also provides a brief, nontechnical overview of machine learning, focusing on RL.
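
Illustration (not from the paper): the sensitivity of recursive expected-value estimation to a tail event can be seen in a minimal sketch of the textbook TD(0) update. The function name `td_update`, the step size, the discount factor, and the reward sequence are all hypothetical choices made for this example; this is not the authors' e-disRL algorithm, only the standard update that the abstract describes as failing under leptokurtosis.

```python
def td_update(v_s, reward, v_next, alpha=0.1, gamma=0.9):
    """One textbook TD(0) step (illustrative; parameters are assumptions).

    The prediction error is reward + gamma * v_next - v_s, and the value
    estimate moves a fraction alpha of the way toward the target.
    """
    delta = reward + gamma * v_next - v_s
    return v_s + alpha * delta, delta


# Hypothetical episode: every visit ends in a terminal state (v_next = 0).
# Twenty ordinary rewards of 1.0 are followed by one tail event of -50.0.
v = 0.0
for r in [1.0] * 20:
    v, _ = td_update(v, r, v_next=0.0)
v_before = v  # estimate after the ordinary rewards

v_after, delta_outlier = td_update(v_before, -50.0, v_next=0.0)

print("estimate before outlier:", round(v_before, 4))
print("estimate after outlier: ", round(v_after, 4))
```

In this sketch a single extreme reward undoes twenty steps of learning and pushes the estimate well below zero, which is the kind of fragility that motivates estimating the reward distribution with a more efficient estimator instead of tracking the running mean.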

Keywords: distributional reinforcement learning; Markov decision process; leptokurtic distribution; tail risk; efficient estimator
JEL-codes: C G0 G1 G2 G3 K2 M2 M4
Date: 2020

Downloads:
https://www.mdpi.com/2227-9091/8/4/113/pdf (application/pdf)
https://www.mdpi.com/2227-9091/8/4/113/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jrisks:v:8:y:2020:i:4:p:113-:d:434660

Risks is currently edited by Mr. Claude Zhang

Bibliographic data for series maintained by MDPI Indexing Manager.