Exploiting Distributional Temporal Difference Learning to Deal with Tail Risk
Peter Bossaerts, Shijie Huang and Nitin Yadav
Additional contact information
Peter Bossaerts: Brain Mind and Markets Lab, The University of Melbourne, 198 Berkeley Street, Parkville, VIC 3010, Australia
Shijie Huang: Brain Mind and Markets Lab, The University of Melbourne, 198 Berkeley Street, Parkville, VIC 3010, Australia
Nitin Yadav: Brain Mind and Markets Lab, The University of Melbourne, 198 Berkeley Street, Parkville, VIC 3010, Australia
Risks, 2020, vol. 8, issue 4, 1-20
Abstract:
In traditional Reinforcement Learning (RL), agents learn to optimize actions in a dynamic context based on recursive estimation of expected values. We show that this form of machine learning fails when rewards (returns) are affected by tail risk, i.e., leptokurtosis. Here, we adapt a recent extension of RL, called distributional RL (disRL), and introduce estimation efficiency, while properly adjusting for the differential impact of outliers on the two terms of the RL prediction error in the updating equations. We show that the resulting “efficient distributional RL” (e-disRL) learns much faster, and is robust once it settles on a policy. Our paper also provides a brief, nontechnical overview of machine learning, focusing on RL.
Keywords: distributional reinforcement learning; Markov decision process; leptokurtic distribution; tail risk; efficient estimator
JEL-codes: C G0 G1 G2 G3 K2 M2 M4
Date: 2020
Downloads:
https://www.mdpi.com/2227-9091/8/4/113/pdf (application/pdf)
https://www.mdpi.com/2227-9091/8/4/113/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jrisks:v:8:y:2020:i:4:p:113-:d:434660
Risks is currently edited by Mr. Claude Zhang