Domain-Aware Reinforcement Learning for Prompt Optimization
Mengqi Gao,
Bowen Sun,
Tong Wang,
Ziyu Fan,
Tongpo Zhang and
Zijun Zheng
Additional contact information
Mengqi Gao: School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai 201209, China
Bowen Sun: School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai 201209, China
Tong Wang: School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai 201209, China
Ziyu Fan: Department of Engineering, Durham University, Durham DH1 3LE, UK
Tongpo Zhang: School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai 201209, China
Zijun Zheng: College of Sciences, China Jiliang University, Hangzhou 310018, China
Mathematics, 2025, vol. 13, issue 16, 1-20
Abstract:
Prompt engineering provides an efficient way to adapt large language models (LLMs) to downstream tasks without retraining model parameters. However, designing effective prompts can be challenging, especially when model gradients are unavailable and human expertise is required. Existing automated methods based on gradient optimization or heuristic search exhibit inherent limitations under black-box or limited-query conditions. We propose Domain-Aware Reinforcement Learning for Prompt Optimization (DA-RLPO), which treats prompt editing as a sequential decision process and leverages structured domain knowledge to constrain candidate edits. Our experimental results show that DA-RLPO achieves higher accuracy than baselines on text classification tasks and maintains robust performance with limited API calls, while also demonstrating effectiveness on text-to-image and reasoning tasks.
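The idea summarized in the abstract (prompt editing as a sequence of decisions, candidate edits restricted by domain knowledge, and an entropy term to keep exploration alive) can be illustrated with a toy sketch. This is not the authors' DA-RLPO implementation: the knowledge base contents, the function names, the state-independent softmax policy, the REINFORCE-style update, and all hyperparameters below are assumptions chosen only to make the concept concrete. The reward function stands in for a black-box model query.

```python
import math
import random

# Hypothetical domain knowledge base (an assumption for illustration):
# it maps a task domain to the only edits the policy may apply.
KNOWLEDGE_BASE = {
    "sentiment": [
        "Classify the sentiment.",
        "Answer positive or negative.",
        "Focus on opinion words.",
    ],
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def optimize_prompt(base_prompt, domain, reward_fn, steps=500, horizon=2,
                    lr=0.5, beta=0.01, seed=0):
    """Toy entropy-regularized policy-gradient search over constrained edits.

    Each episode applies `horizon` edits in sequence, then spends one
    black-box query (reward_fn) on the edited prompt.
    """
    rng = random.Random(seed)
    edits = KNOWLEDGE_BASE[domain]      # domain knowledge constrains the actions
    logits = [0.0] * len(edits)         # state-independent policy (simplification)
    baseline = 0.0                      # running mean reward, reduces variance

    for _ in range(steps):
        probs = softmax(logits)
        prompt, actions = base_prompt, []
        for _ in range(horizon):        # sequential editing decisions
            a = rng.choices(range(len(edits)), weights=probs)[0]
            actions.append(a)
            prompt = prompt + " " + edits[a]

        reward = reward_fn(prompt)      # single query per episode
        advantage = reward - baseline
        baseline += 0.1 * (reward - baseline)

        h = entropy(probs)
        for i in range(len(logits)):
            # Gradient of the summed log-probabilities of the sampled actions.
            grad_logp = sum((1.0 if a == i else 0.0) - probs[i] for a in actions)
            # Gradient of the policy entropy with respect to logit i.
            grad_h = -probs[i] * (math.log(probs[i]) + h)
            logits[i] += lr * (advantage * grad_logp + beta * grad_h)

    best = max(range(len(edits)), key=lambda i: logits[i])
    return base_prompt + " " + edits[best], softmax(logits)
```

In use, `reward_fn` would wrap whatever scoring is available for the target model, for example validation accuracy obtained through an API call, so the number of training steps directly bounds the query budget. The entropy weight `beta` trades exploration against convergence speed; the value here is arbitrary.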
Keywords: prompt optimization; knowledge base; reinforcement learning; entropy regularization
JEL-codes: C
Date: 2025
Downloads:
https://www.mdpi.com/2227-7390/13/16/2552/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/16/2552/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:16:p:2552-:d:1721023
Mathematics is currently edited by Ms. Emma He