Domain-Aware Reinforcement Learning for Prompt Optimization

Gao, Mengqi; Sun, Bowen; Wang, Tong; Fan, Ziyu; Zhang, Tongpo; Zheng, Zijun

Mengqi Gao, Bowen Sun, Tong Wang, Ziyu Fan, Tongpo Zhang and Zijun Zheng
Additional contact information
Mengqi Gao: School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai 201209, China
Bowen Sun: School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai 201209, China
Tong Wang: School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai 201209, China
Ziyu Fan: Department of Engineering, Durham University, Durham DH1 3LE, UK
Tongpo Zhang: School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai 201209, China
Zijun Zheng: College of Sciences, China Jiliang University, Hangzhou 310018, China

Mathematics, 2025, vol. 13, issue 16, 1-20

Abstract: Prompt engineering provides an efficient way to adapt large language models (LLMs) to downstream tasks without retraining model parameters. However, designing effective prompts can be challenging, especially when model gradients are unavailable and human expertise is required. Existing automated methods based on gradient optimization or heuristic search exhibit inherent limitations under black-box or limited-query conditions. We propose Domain-Aware Reinforcement Learning for Prompt Optimization (DA-RLPO), which treats prompt editing as a sequential decision process and leverages structured domain knowledge to constrain candidate edits. Our experimental results show that DA-RLPO achieves higher accuracy than baselines on text classification tasks and maintains robust performance with limited API calls, while also demonstrating effectiveness on text-to-image and reasoning tasks.

Keywords: prompt optimization; knowledge base; reinforcement learning; entropy regularization
JEL-codes: C
Date: 2025

Downloads: (external link)
https://www.mdpi.com/2227-7390/13/16/2552/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/16/2552/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:16:p:2552-:d:1721023

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.