CoWPE: Adaptive Context Window Adjustment in LLMs for Complex Input Queries

Venkata Mohit Tamanampudi

Journal of Artificial Intelligence General science (JAIGS) ISSN:3006-4023, 2024, vol. 5, issue 1, 438-450

Abstract: Recent work has shown that large language models (LLMs) are remarkably capable of processing context windows based on the nuance and complexity of their input queries. Recent studies have attempted to expand the context window of LLMs by modifying rotary position embedding (RoPE), a popular position encoding technique used by well-known LLMs such as LLaMA and GPT-NeoX. To help LLMs adapt efficiently to a larger context window based on the complexity and nuance of the input query, we identify the inherent need for the attention entropy of LLMs (i.e., the information entropy of attention scores) to remain stable, and we introduce a novel extension to RoPE that combines adjusting RoPE's base frequency with scaling the attention logits. Our proposal, CoWPE, modifies the context window of LLMs by constructing bi-level attention information: grouped attention and neighbor attention. Neighbor attention captures relationships between neighboring tokens within a given range, while grouped attention captures dependencies among tokens that are far apart. During inference, both levels of attention are computed with the original model's self-attention mechanism. CoWPE requires no fine-tuning and can expand the context window of existing LLMs with only minor code changes. We carry out extensive tests on several benchmarks, and the results demonstrate that CoWPE can successfully increase the context window length of current LLMs.
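
The abstract does not give the exact position mapping, so the following is only a minimal sketch of how neighbor and grouped attention could share one table of relative positions. It assumes a floor-division grouping, which is one common way to build bi-level attention and may differ from what CoWPE actually does. The names `bilevel_relative_positions`, `max_relative_position`, `neighbor_window` and `group_size` are hypothetical and do not come from the paper.

```python
def bilevel_relative_positions(seq_len, neighbor_window, group_size):
    """Build a causal table of relative positions for bi-level attention.

    table[q][k] is the relative position used when query token q attends to
    key token k (k <= q). Future keys (k > q) are marked None.
    """
    if seq_len < 1 or neighbor_window < 1 or group_size < 1:
        raise ValueError("seq_len, neighbor_window and group_size must be >= 1")

    # Assumed offset: it keeps every grouped position at or above the
    # neighbor window, so the two levels do not reuse the same values.
    shift = neighbor_window - neighbor_window // group_size

    table = []
    for q in range(seq_len):
        row = []
        for k in range(seq_len):
            if k > q:
                row.append(None)       # causal mask: no attention to the future
            elif q - k < neighbor_window:
                row.append(q - k)      # neighbor attention: exact distance
            else:
                # grouped attention: distant tokens share coarse positions
                row.append(q // group_size - k // group_size + shift)
        table.append(row)
    return table


def max_relative_position(table):
    """Largest relative position that the model would have to encode."""
    return max(p for row in table for p in row if p is not None)


table = bilevel_relative_positions(seq_len=16, neighbor_window=4, group_size=4)
print(max_relative_position(table))
```

With 16 tokens, a neighbor window of 4 and a group size of 4, the largest relative position in this sketch is 6 instead of 15. That is the sense in which such a mapping lets a longer input fit within the range of positions a model saw during training, without fine-tuning.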

Keywords: Large Language Models; LLMs; RoPE; Context Window; CoWPE; Llama; LLM training
Date: 2024

Downloads: (external link)
https://newjaigs.com/index.php/JAIGS/article/view/221 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:das:njaigs:v:5:y:2024:i:1:p:438-450:id:221

Journal of Artificial Intelligence General science (JAIGS) ISSN:3006-4023 is currently edited by Justyna Żywiołek

More articles in Journal of Artificial Intelligence General science (JAIGS) ISSN:3006-4023 from Open Knowledge
Bibliographic data for series maintained by Open Knowledge.

Page updated 2025-07-23
Handle: RePEc:das:njaigs:v:5:y:2024:i:1:p:438-450:id:221