A review of approximate dynamic programming applications within military operations research

Rempel, M.; Cai, J.

A review of approximate dynamic programming applications within military operations research

M. Rempel and J. Cai

Operations Research Perspectives, 2021, vol. 8, issue C

Abstract: Sequences of decisions that occur under uncertainty arise in a variety of settings, including transportation, communication networks, finance, defence, etc. The classic approach to find an optimal decision policy for a sequential decision problem is dynamic programming; however its usefulness is limited due to the curse of dimensionality and the curse of modelling, and thus many real-world applications require an alternative approach. Within operations research, over the last 25 years the use of Approximate Dynamic Programming (ADP), known as reinforcement learning in many disciplines, to solve these types of problems has increased in popularity. These efforts have resulted in the successful deployment of ADP-generated decision policies for driver scheduling in the trucking industry, locomotive planning and management, and managing high-value spare parts in manufacturing. In this article we present the first review of applications of ADP within a defence context, specifically focusing on those which provide decision support to military or civilian leadership. This article’s main contributions are twofold. First, we review 18 decision support applications, spanning the spectrum of force development, generation, and employment, that use an ADP-based strategy and for each highlight how its ADP algorithm was designed, evaluated, and the results achieved. Second, based on the trends and gaps identified we discuss five topics relevant to applying ADP to decision support problems within defence: the classes of problems studied; best practices to evaluate ADP-generated policies; advantages of designing policies that are incremental versus complete overhauls when compared to currently practiced policies; the robustness of policies as scenarios change, such as a shift from high to low intensity conflict; and sequential decision problems not yet studied within defence that may benefit from ADP.

Keywords: Sequential decision problem; Markov decision process; Approximate dynamic programming; Reinforcement learning; Military (search for similar items in EconPapers)
Date: 2021
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (1)

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S2214716021000221
Full text for ScienceDirect subscribers only

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:eee:oprepe:v:8:y:2021:i:c:s2214716021000221

DOI: 10.1016/j.orp.2021.100204

Access Statistics for this article

More articles in Operations Research Perspectives from Elsevier
Bibliographic data for series maintained by Catherine Liu ().