Robot See, Robot Do: Imitation Reward for Noisy Financial Environments

Goluža, Sven; Kovačević, Tomislav; Begušić, Stjepan; Kostanjčar, Zvonko

Abstract: The sequential nature of decision-making in financial asset trading aligns naturally with the reinforcement learning (RL) framework, making RL a common approach in this domain. However, the low signal-to-noise ratio in financial markets results in noisy estimates of environment components, including the reward function, which hinders effective policy learning by RL agents. Given the critical importance of reward function design in RL problems, this paper introduces a novel and more robust reward function by leveraging imitation learning, where a trend labeling algorithm acts as an expert. We integrate imitation (expert's) feedback with reinforcement (agent's) feedback in a model-free RL algorithm, effectively embedding the imitation learning problem within the RL paradigm to handle the stochasticity of reward signals. Empirical results demonstrate that this novel approach improves financial performance metrics compared to traditional benchmarks and RL agents trained solely using reinforcement feedback.
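The idea of mixing the expert's feedback with the agent's own feedback can be illustrated with a minimal sketch. This is not the paper's formulation: the abstract does not specify the labeling rule or how the two signals are combined. The function names `trend_labels` and `blended_reward`, the fixed look-ahead `horizon`, and the convex weight `alpha` are all assumptions made here for illustration only.

```python
def trend_labels(prices, horizon=5):
    """Hypothetical expert: +1 if the price is higher `horizon` steps ahead, else -1.

    It looks at future prices, so it can only serve as a teacher during training.
    The last `horizon` positions have no look-ahead and keep the label 0.
    """
    labels = [0] * len(prices)
    for t in range(len(prices) - horizon):
        labels[t] = 1 if prices[t + horizon] > prices[t] else -1
    return labels


def blended_reward(action, expert_action, step_return, alpha=0.5):
    """Convex mix of reinforcement and imitation feedback (assumed weighting).

    action, expert_action: position in {-1, 0, 1}
    step_return: realized one-step asset return (the noisy signal)
    alpha: weight on the imitation term, between 0 and 1
    """
    reinforcement = action * step_return                    # noisy profit-and-loss term
    imitation = 1.0 if action == expert_action else -1.0    # agreement with the expert
    return (1.0 - alpha) * reinforcement + alpha * imitation


prices = [1.0, 2.0, 3.0, 2.0, 1.0]
expert = trend_labels(prices, horizon=1)
reward = blended_reward(action=1, expert_action=expert[0], step_return=0.02)
```

Setting `alpha=0` recovers a purely reinforcement-based reward, and `alpha=1` reduces the problem to imitating the expert. In practice the two terms would need to be put on comparable scales, since one-step returns are typically much smaller than the agreement term used in this sketch.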

Date: 2024-11
New Economics Papers: this item is included in nep-cmp

Downloads: (external link)
http://arxiv.org/pdf/2411.08637 Latest version (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2411.08637

Series: Papers from arXiv.org