Large Language Model-Guided SARSA Algorithm for Dynamic Task Scheduling in Cloud Computing
Bhargavi Krishnamurthy and
Sajjan G. Shiva
Additional contact information
Bhargavi Krishnamurthy: Department of CSE, Siddaganga Institute of Technology, Tumakuru 572103, Karnataka, India
Sajjan G. Shiva: Department of CS, University of Memphis, Memphis, TN 38152, USA
Mathematics, 2025, vol. 13, issue 6, 1-18
Abstract:
Enterprises are rapidly transitioning to cloud computing, which has become the platform of choice for developing and deploying software systems; by some estimates, around ninety percent of enterprise applications now rely on cloud solutions. The inherently dynamic and uncertain nature of cloud computing makes it difficult to measure the exact state of the system at any given point in time, and challenges arise with respect to task scheduling, load balancing, resource allocation, governance, compliance, migration, data loss, and resource scarcity. Among these, task scheduling is a central problem because poor resource utilization degrades overall system performance. State–Action–Reward–State–Action (SARSA) learning, an on-policy variant of Q-learning that updates the value function using the action actually taken under the current policy, has been applied to task scheduling. However, it lacks good heuristics for state–action pairs, which leads to biased solutions in a highly dynamic and uncertain environment such as the cloud. In this paper, SARSA learning is enriched with guidance from a Large Language Model (LLM), whose heuristics are used to shape the optimal Q function. Integrating the LLM with SARSA for task scheduling improves sampling efficiency and reduces bias in task allocation, and the LLM-generated heuristic value mitigates performance bias while keeping the model from being misled by hallucinated guidance. The paper provides a mathematical model of the proposed LLM_SARSA covering the rate of convergence, reward shaping, heuristic values, under-/overestimation of non-optimal actions, sampling efficiency, and unbiased performance. LLM_SARSA is implemented in the open-source CloudSim Express simulator using a Google cloud dataset composed of eight different cluster types, and its performance is compared against recent reinforcement learning, optimization, and metaheuristic techniques. LLM_SARSA outperforms existing works with respect to makespan time, degree of imbalance, cost, and resource utilization, and the experimental results validate the conclusions of the mathematical model regarding the convergence rate and the estimation of the heuristic value used to optimize the SARSA value function.
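(The listing carries no code; the sketch below is purely illustrative and is not the authors' implementation. It shows one common way an LLM-supplied heuristic can guide an on-policy SARSA update, assuming a hypothetical llm_heuristic(state, action) scoring function and scheduler-defined states and actions.)

```python
import random
from collections import defaultdict

def llm_heuristic(state, action):
    # Hypothetical stand-in for the LLM-generated heuristic H(s, a).
    # In the paper's setting this value would come from prompting an LLM
    # with the current cluster/task state; here it is a neutral placeholder.
    return 0.0

def select_action(Q, state, actions, epsilon=0.1, eta=0.5):
    # Epsilon-greedy choice over the heuristic-augmented value Q(s, a) + eta * H(s, a).
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)] + eta * llm_heuristic(state, a))

def sarsa_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9, eta=0.5):
    # One on-policy SARSA step: the next action is chosen by the same
    # heuristic-augmented policy, and its value forms the bootstrap target.
    a_next = select_action(Q, s_next, actions, epsilon=0.1, eta=eta)
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return a_next

# Usage sketch: Q is a table over (state, action) pairs, defaulting to 0.
Q = defaultdict(float)
```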
Keywords: task scheduling; large language model; state action reward state action; cloud computing
JEL-codes: C
Date: 2025
Downloads:
https://www.mdpi.com/2227-7390/13/6/926/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/6/926/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:6:p:926-:d:1609985