Miss-Triggered Content Cache Replacement Under Partial Observability: Transformer-Decoder Q-Learning

Hakho Kim, Teh-Jen Sun and Eui-Nam Huh
Additional contact information
Hakho Kim: Department of Artificial Intelligence, Kyung Hee University, Yongin 17104, Republic of Korea
Teh-Jen Sun: Department of Artificial Intelligence, Kyung Hee University, Yongin 17104, Republic of Korea
Eui-Nam Huh: Department of Computer Engineering, Kyung Hee University, Yongin 17104, Republic of Korea

Mathematics, 2025, vol. 13, issue 19, 1-27

Abstract: Content delivery networks (CDNs) face steadily rising, uneven demand, straining heuristic cache replacement. Reinforcement learning (RL) is promising, but most work assumes a fully observable Markov Decision Process (MDP), unrealistic under delayed, partial, and noisy signals. We model cache replacement as a Partially Observable MDP (POMDP) and present the Miss-Triggered Cache Transformer (MTCT), a Transformer-decoder Q-learning agent that encodes recent histories with self-attention. MTCT invokes its policy only on cache misses to align compute with informative events and uses a delayed-hit reward to propagate information from hits. A compact, rank-based action set (12 actions by default) captures popularity–recency trade-offs with complexity independent of cache capacity. We evaluate MTCT on a real trace (MovieLens) and two synthetic workloads (Mandelbrot–Zipf, Pareto) against Adaptive Replacement Cache (ARC), Windowed TinyLFU (W-TinyLFU), classical heuristics, and Double Deep Q-Network (DDQN). MTCT achieves the best or statistically comparable cache-hit rates on most cache sizes; e.g., on MovieLens at M = 600, it reaches 0.4703 (DDQN 0.4436, ARC 0.4513). Miss-triggered inference also lowers mean wall-clock time per episode; Transformer inference is well suited to modern hardware acceleration. Ablations support C L = 50 and show that finer action grids improve stability and final accuracy.
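
Illustrative sketch (not from the paper): the abstract describes two ideas that a short simulation can make concrete, namely invoking the policy only on cache misses and choosing the eviction victim through a small rank-based action set. The Python below is a minimal sketch under stated assumptions. The 12 actions are built here as 2 hypothetical rankings (recency, frequency) times 6 hypothetical rank positions; the paper's actual action definitions, its Transformer-decoder Q-network, and its delayed-hit reward are not reproduced. The names `ACTIONS`, `pick_victim`, and `simulate` are invented for illustration, and `policy` stands in for the learned agent.

```python
from collections import Counter

# ASSUMPTION: a 12-action grid built from 2 rankings x 6 rank positions.
# The paper states 12 actions by default but this layout is hypothetical.
ORDERINGS = ("recency", "frequency")
QUANTILES = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)
ACTIONS = [(o, q) for o in ORDERINGS for q in QUANTILES]


def pick_victim(cache, last_seen, freq, action):
    """Map an action index to the cached item at the chosen rank."""
    key, q = ACTIONS[action]
    score = last_seen if key == "recency" else freq
    # Rank cached items from lowest to highest score (ties broken by item id).
    ranked = sorted(cache, key=lambda item: (score[item], item))
    return ranked[round(q * (len(ranked) - 1))]


def simulate(trace, capacity, policy):
    """Replay a request trace; call `policy` only on misses with a full cache."""
    cache, last_seen, freq = set(), {}, Counter()
    hits = policy_calls = 0
    for t, item in enumerate(trace):
        freq[item] += 1
        if item in cache:
            hits += 1                      # hit: no policy inference needed
        elif len(cache) < capacity:
            cache.add(item)                # cold miss: free slot, no eviction
        else:
            policy_calls += 1              # miss-triggered decision point
            action = policy(cache, last_seen, freq)
            cache.remove(pick_victim(cache, last_seen, freq, action))
            cache.add(item)
        last_seen[item] = t
    return hits / len(trace), policy_calls


# Stand-in policy: always action 0 (recency ranking, lowest rank), i.e. LRU.
hit_rate, calls = simulate([1, 2, 3, 1, 4, 1, 2], capacity=2,
                           policy=lambda *_: 0)
print(hit_rate, calls)
```

In this sketch the number of actions stays at 12 whatever the capacity, which mirrors the abstract's claim about the action set; the sort inside `pick_victim` still scales with cache size and is only a convenience for the example.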

Keywords: deep reinforcement learning; content cache replacement; transformer; POMDP; cache replacement
JEL-codes: C
Date: 2025

Downloads:
https://www.mdpi.com/2227-7390/13/19/3217/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/19/3217/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:19:p:3217-:d:1766066

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-10-08
Handle: RePEc:gam:jmathe:v:13:y:2025:i:19:p:3217-:d:1766066