Ordinal Dynamic Programming
Matthew J. Sobel (Yale University)
Management Science, 1975, vol. 21, issue 9, 967-975
Abstract:
Numerically valued reward processes are found in most dynamic programming models. Mitten, however, recently formulated finite horizon sequential decision processes in which a real-valued reward need not be earned at each stage. Instead of the cardinality assumption implicit in past models, Mitten assumes that a decision maker has a preference order over a general collection of outcomes (which need not be numerically valued). This paper investigates infinite horizon ordinal dynamic programming models. Both deterministic and stochastic models are considered. It is shown that an optimal policy exists if and only if some stationary policy is optimal. Moreover, "policy improvement" leads to better policies using either Howard-Blackwell or Eaton-Zadeh procedures. The results illuminate the roles played by various sets of assumptions in the literature on Markovian decision processes.
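To make the ordinal idea concrete, here is a minimal illustrative sketch in Python, not the paper's formulation: a tiny deterministic model in which outcomes carry only a preference ranking (never summed or discounted), and stationary policies are compared by a purely ordinal criterion. All state names, outcomes, and the lexicographic comparison rule are hypothetical choices for illustration; the paper treats a far more general preference order.

```python
from itertools import product

# Toy deterministic ordinal model (all data hypothetical). Each state-action
# pair yields an outcome label and a deterministic successor state.
states = [0, 1]
actions = [0, 1]
next_state = {0: {0: 0, 1: 1}, 1: {0: 1, 1: 0}}
outcome = {0: {0: "fair", 1: "good"}, 1: {0: "best", 1: "poor"}}

# The decision maker's preference order over outcomes, encoded as a ranking.
# Ranks are only ever compared, never added or discounted: the criterion
# stays ordinal, not cardinal.
rank = {"poor": 0, "fair": 1, "good": 2, "best": 3}

def stream(policy, s, horizon=8):
    """Outcome-rank stream induced by a stationary policy from state s."""
    out = []
    for _ in range(horizon):
        a = policy[s]
        out.append(rank[outcome[s][a]])
        s = next_state[s][a]
    return out

def best_stationary(s0):
    """Brute-force the best stationary policy from s0, comparing the induced
    outcome streams lexicographically (one illustrative ordinal criterion)."""
    policies = [dict(zip(states, choice))
                for choice in product(actions, repeat=len(states))]
    return max(policies, key=lambda p: stream(p, s0))

pi = best_stationary(0)
print(pi, stream(pi, 0))
```

In this toy instance the search returns the stationary policy that takes action 1 in state 0 (outcome "good", moving to state 1) and action 0 in state 1 (outcome "best", looping forever), echoing the paper's theme that when an optimal policy exists, some stationary policy attains it.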
Date: 1975
Citations: 7 (as tracked by EconPapers)
Downloads: http://dx.doi.org/10.1287/mnsc.21.9.967 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:inm:ormnsc:v:21:y:1975:i:9:p:967-975
More articles in Management Science from INFORMS.