Fluid Policies, Reoptimization, and Performance Guarantees in Dynamic Resource Allocation

Brown, David B.; Zhang, Jingwei

David B. Brown and Jingwei Zhang
Additional contact information
David B. Brown: The Fuqua School of Business, Duke University, Durham, North Carolina 27708
Jingwei Zhang: School of Data Science, The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), Shenzhen, China 518172

Operations Research, 2025, vol. 73, issue 2, 1029-1045

Abstract: Many sequential decision problems involve deciding how to allocate shared resources across a set of independent systems at each point in time. A classic example is the restless bandit problem, in which a budget constraint limits the selection of arms. Fluid relaxations provide a natural approximation technique for this broad class of problems. A recent stream of research has established strong performance guarantees for feasible policies based on fluid relaxations. In this paper, we generalize and improve these recent performance results. First, we provide easy-to-implement feasible fluid policies that achieve performance within O(√N) of optimal, where N is the number of subproblems. This result holds for a general class of dynamic resource allocation problems with heterogeneous subproblems and multiple shared resource constraints. Second, we show using a novel proof technique that a feasible fluid policy that chooses actions using a reoptimized fluid value function achieves performance within O(√N) of optimal as well. To the best of our knowledge, this performance guarantee is the first one for reoptimization for the general dynamic resource allocation problems that we consider. The scaling of the constants with respect to time in these results implies similar results in the infinite horizon setting. Finally, we develop and analyze a class of feasible fluid-budget balancing policies that stay “close” to actions selected by an optimal fluid policy while simultaneously using as much of the shared resources as possible. We show that these policies achieve performance within O(1) of optimal under particular nondegeneracy assumptions. This result generalizes recent advances for restless bandit problems by considering (a) any finite number of actions for each subproblem and (b) heterogeneous subproblems with a fixed number of types. We demonstrate the use of these techniques on dynamic multiwarehouse inventory problems and find empirically that these fluid-based policies achieve excellent performance, as our theory suggests.

Keywords: Optimization; stochastic dynamic programming; Markov; Lagrangian relaxation
Date: 2025

Downloads: http://dx.doi.org/10.1287/opre.2022.0601 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:73:y:2025:i:2:p:1029-1045

More articles in Operations Research from INFORMS
Bibliographic data for series maintained by Chris Asher.