Joint Learning of Volume Scheduling and Order Placement Policies for Optimal Order Execution

Siyuan Li, Hui Niu, Jiani Lu and Peng Liu
Additional contact information
Siyuan Li: Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
Hui Niu: Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
Jiani Lu: Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
Peng Liu: Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China

Mathematics, 2024, vol. 12, issue 21, 1-17

Abstract: Order execution is a central problem in finance, and researchers have increasingly applied reinforcement learning (RL) techniques to it. Conventional RL methods face several difficulties in this setting, such as the large action space spanning both price and quantity, and the long decision horizon. Because order execution naturally decomposes into a low-frequency volume scheduling stage and a high-frequency order placement stage, most existing RL-based methods treat the two stages as distinct tasks and offer only a partial solution by addressing one of them in isolation. The current literature therefore fails to model the non-negligible mutual influence between the two tasks, which leads to impractical execution solutions. To address these limitations, we propose a novel automatic order execution approach based on hierarchical RL (OEHRL), which jointly learns the volume scheduling and order placement policies. OEHRL first extracts state embeddings at both the macro and micro levels with a sequential variational auto-encoder. Based on these embeddings, OEHRL generates a hindsight expert dataset, which is then used to train a hierarchical order execution policy. In this hierarchy, the high-level policy determines the target volume for each scheduling window, and the low-level policy learns to price the series of sub-orders allocated by the high level. The two levels collaborate seamlessly toward the optimal execution policy. Extensive experiments on 200 stocks across the US and China A-share markets validate the effectiveness of the proposed approach.
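
The abstract describes a two-level decomposition: a low-frequency high-level policy schedules volume across execution windows, and a high-frequency low-level policy prices the sub-orders allocated to each window. As a structural illustration only, the minimal Python sketch below mirrors that decomposition with hand-coded placeholder rules (an even, TWAP-like volume split and fixed price offsets around a mid-price). Every class, method, and parameter here is a hypothetical stand-in, not the authors' OEHRL implementation, which learns both policies from a hindsight expert dataset.

import numpy as np

class HighLevelScheduler:
    """Low-frequency policy: allocates a target volume per scheduling window."""
    def __init__(self, total_volume: float, n_windows: int):
        self.remaining = total_volume
        self.n_windows = n_windows

    def act(self, macro_state: np.ndarray, window: int) -> float:
        # Placeholder rule: spread remaining volume evenly over remaining
        # windows (a TWAP-like prior). A learned policy would condition on
        # the macro-level state embedding instead.
        windows_left = self.n_windows - window
        target = self.remaining / max(windows_left, 1)
        self.remaining -= target
        return target

class LowLevelPlacer:
    """High-frequency policy: prices the sub-orders for one window."""
    def act(self, micro_state: np.ndarray, target_volume: float,
            n_suborders: int = 5) -> list[tuple[float, float]]:
        # Placeholder rule: split the window's target volume into equal
        # sub-orders priced at fixed offsets around a mid-price feature of
        # the micro state. A learned policy would choose prices adaptively.
        mid_price = float(micro_state[0])
        qty = target_volume / n_suborders
        offsets = np.linspace(-0.02, 0.02, n_suborders)
        return [(mid_price * (1 + o), qty) for o in offsets]

# Toy rollout over 4 scheduling windows with stand-in state embeddings.
high = HighLevelScheduler(total_volume=10_000, n_windows=4)
low = LowLevelPlacer()
for w in range(4):
    macro_state = np.array([100.0 + w])        # stand-in macro embedding
    micro_state = np.array([100.0 + 0.1 * w])  # stand-in micro embedding
    vol = high.act(macro_state, w)
    orders = low.act(micro_state, vol)
    print(f"window {w}: volume={vol:.0f}, sub-orders={orders}")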

Keywords: order execution; hierarchical reinforcement learning; imitation learning; representation learning
JEL-codes: C
Date: 2024

Downloads: (external link)
https://www.mdpi.com/2227-7390/12/21/3440/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/21/3440/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:21:p:3440-:d:1513350

Mathematics is currently edited by Ms. Emma He

More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager.

Handle: RePEc:gam:jmathe:v:12:y:2024:i:21:p:3440-:d:1513350