Optimizing collaborative filtering recommender systems with the GRPO reinforcement learning algorithm
Hwi Jeon,
Chan-Ho Lee,
Jong-Geun Choi and
Hyuk-Jin Kwon
Edelweiss Applied Science and Technology, 2025, vol. 9, issue 8, 871-881
Abstract:
Collaborative filtering recommender systems focus primarily on short-term prediction accuracy and are limited in long-term user satisfaction and content diversity. In this paper, we recast learning from user-item interaction data as a reinforcement learning with verifiable rewards (RLVR) problem and, for the first time, apply the Group Relative Policy Optimization (GRPO) algorithm, originally proposed for large language models, to fine-tuning collaborative filtering models. GRPO updates the policy directly without a separate critic network, balancing exploration and exploitation while optimizing long-term user engagement. In experiments on Amazon review datasets covering the baby products, video games, and industrial & scientific categories, the GRPO-optimized model improved Recall@10 by up to 15.16% over baseline models. We also show that user embeddings from graph-based collaborative filtering architectures contribute positively to GRPO optimization, whereas positional embeddings from sequential collaborative filtering architectures impede it. These findings empirically support GRPO as a robust approach to optimizing recommender system models.
Keywords: Collaborative filtering; GRPO; Recommender system; Reinforcement learning; RLVR
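The abstract's central idea, a critic-free update driven by rewards that can be checked against logged interactions, can be illustrated with a minimal sketch. It is not the authors' implementation. The binary reward (1 if a sampled item matches a held-out interaction), the group size, the toy scores, and all function names are assumptions made for illustration. The clipped policy-ratio objective and KL penalty that GRPO normally includes are omitted. The sketch shows only how rewards within one sampled group are standardized into advantages, and how Recall@K is computed.

```python
import math
import random
from statistics import mean, pstdev


def softmax(scores):
    """Turn raw item scores into a sampling distribution (the 'policy')."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def verifiable_reward(sampled_item, held_out_item):
    """Assumed reward: 1.0 if the sampled item is the logged interaction."""
    return 1.0 if sampled_item == held_out_item else 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group; no critic network is used."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def recall_at_k(scores, relevant_items, k=10):
    """Fraction of relevant items that appear among the k top-scored items."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = len(set(ranked[:k]) & set(relevant_items))
    return hits / len(relevant_items)


if __name__ == "__main__":
    rng = random.Random(0)
    scores = [0.1, 1.2, -0.3, 0.8, 0.0]   # hypothetical scores for one user
    held_out = 3                           # hypothetical held-out item index
    probs = softmax(scores)

    # Sample a group of candidate recommendations from the current policy.
    group = rng.choices(range(len(scores)), weights=probs, k=8)
    rewards = [verifiable_reward(item, held_out) for item in group]
    advantages = group_relative_advantages(rewards)

    for item, r, a in zip(group, rewards, advantages):
        print(f"item={item} reward={r:.0f} advantage={a:+.3f}")
    print("Recall@2:", recall_at_k(scores, [held_out], k=2))
```

In a full training loop, each advantage would weight the log-probability gradient of its sampled item. Because the group mean serves as the baseline, no learned value function is needed.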
Date: 2025
Downloads:
https://learning-gate.com/index.php/2576-8484/article/view/9471/3108 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:ajp:edwast:v:9:y:2025:i:8:p:871-881:id:9471