Prompt Optimization in Large Language Models

Antonio Sabbatella, Andrea Ponti, Ilaria Giordani, Antonio Candelieri and Francesco Archetti
Additional contact information
Antonio Sabbatella: Department of Computer Science, Systems and Communications, University of Milan-Bicocca, 20126 Milan, Italy
Andrea Ponti: Department of Economics, Management, and Statistics, University of Milan-Bicocca, 20126 Milan, Italy
Ilaria Giordani: Oaks srl, 20125 Milan, Italy
Antonio Candelieri: Department of Economics, Management, and Statistics, University of Milan-Bicocca, 20126 Milan, Italy
Francesco Archetti: Department of Computer Science, Systems and Communications, University of Milan-Bicocca, 20126 Milan, Italy

Mathematics, 2024, vol. 12, issue 6, 1-14

Abstract: Prompt optimization is a crucial task for improving the performance of large language models on downstream tasks. In this paper, a prompt is a sequence of n-grams selected from a vocabulary, and the aim is to select the prompt that is optimal with respect to a given performance metric. Prompt optimization can thus be regarded as a combinatorial optimization problem, with the number of possible prompts (i.e., the size of the combinatorial search space) given by the size of the vocabulary (i.e., the number of possible n-grams) raised to the power of the prompt length. Exhaustive search is impractical, so an efficient search strategy is needed. We propose a Bayesian Optimization method performed over a continuous relaxation of the combinatorial search space. Bayesian Optimization is the dominant approach in black-box optimization thanks to its sample efficiency, modular structure, and versatility. We use BoTorch, a library for Bayesian Optimization research built on top of PyTorch. Specifically, we focus on Hard Prompt Tuning, which directly searches for an optimal prompt to be added to the text input without requiring access to the internals of the Large Language Model, treating it as a black box (as with GPT-4, which is available only as a Model as a Service). Albeit preliminary and based on “vanilla” Bayesian Optimization algorithms, our experiments with RoBERTa as the large language model, on six benchmark datasets, show good performance compared with other state-of-the-art black-box prompt optimization methods and enable an analysis of the trade-off between search space size, accuracy, and wall-clock time.
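To make the abstract's method concrete, the following is a minimal, purely illustrative sketch of a vanilla Bayesian Optimization loop in BoTorch over one possible continuous relaxation: each of the L prompt slots is a coordinate in [0, 1] that is rounded to an index into a small n-gram vocabulary. The vocabulary, prompt length, hyperparameters, and the `evaluate_prompt` scorer are hypothetical stand-ins for the paper's actual setup, not the authors' implementation.

```python
# Sketch: BO over a continuous relaxation of a discrete prompt space.
# Assumptions (not from the paper): a toy vocabulary, prompt length L = 3,
# and a placeholder black-box objective `evaluate_prompt`.
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition import ExpectedImprovement
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

VOCAB = ["great", "terrible", "movie", "review:", "sentiment", "overall,"]
L = 3  # number of n-gram slots in the prompt

def decode(x: torch.Tensor) -> list[str]:
    # Round each continuous coordinate in [0, 1] to a vocabulary index.
    idx = (x * (len(VOCAB) - 1)).round().long()
    return [VOCAB[i] for i in idx]

def evaluate_prompt(tokens: list[str]) -> float:
    # Placeholder for the true black-box objective: prepend the prompt to
    # the input text, query the frozen LLM, and score the downstream task.
    return float(torch.rand(()))

bounds = torch.stack([torch.zeros(L), torch.ones(L)]).double()
train_X = torch.rand(8, L, dtype=torch.double)  # initial random design
train_Y = torch.tensor([[evaluate_prompt(decode(x))] for x in train_X],
                       dtype=torch.double)

for _ in range(30):  # BO iterations
    # Fit a Gaussian-process surrogate to the prompts scored so far.
    gp = SingleTaskGP(train_X, train_Y)
    fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))
    # Maximize Expected Improvement over the continuous relaxation.
    acqf = ExpectedImprovement(gp, best_f=train_Y.max())
    cand, _ = optimize_acqf(acqf, bounds=bounds, q=1,
                            num_restarts=5, raw_samples=64)
    # Round back to a discrete prompt before querying the black box.
    y = torch.tensor([[evaluate_prompt(decode(cand[0]))]], dtype=torch.double)
    train_X = torch.cat([train_X, cand])
    train_Y = torch.cat([train_Y, y])

print("Best prompt found:", " ".join(decode(train_X[train_Y.argmax()])))
```

Each iteration fits the surrogate, picks the next continuous point by Expected Improvement, and only the rounding step touches the discrete space, which is what makes the combinatorial problem amenable to standard continuous Bayesian Optimization.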

Keywords: Bayesian Optimization; prompt optimization; black-box Large Language Models
JEL-codes: C
Date: 2024

Downloads: (external link)
https://www.mdpi.com/2227-7390/12/6/929/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/6/929/ (text/html)


Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:6:p:929-:d:1361476

Mathematics is currently edited by Ms. Emma He

More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager.

 
Handle: RePEc:gam:jmathe:v:12:y:2024:i:6:p:929-:d:1361476