Inverse Markov decision processes with unknown transition probabilities

Ghatrani, Zahra; Ghate, Archis

Inverse Markov decision processes with unknown transition probabilities

Zahra Ghatrani and Archis Ghate

IISE Transactions, 2023, vol. 55, issue 6, 588-601

Abstract: Inverse optimization involves recovering parameters of a mathematical model using observed values of decision variables. In Markov Decision Processes (MDPs), it has been applied to estimate rewards that render observed policies optimal. A counterpart is not available for transition probabilities. We study two variants of this problem. First, the decision-maker wonders whether there exist a policy and transition probabilities that attain given target values of expected total discounted rewards over an infinite horizon. We derive necessary and sufficient existence conditions, and formulate a feasibility linear program whose solution yields the requisite policy and transition probabilities. We extend these results when the decision-maker wants to render the target values optimal. In the second variant, the decision-maker wishes to find transition probabilities that make a given policy optimal. The resulting problem is nonconvex bilinear, and we propose tailored versions of two heuristics called Convex-Concave Procedure and Sequential Linear Programming (SLP). Their performance is compared via numerical experiments against an exact method. Computational experiments on randomly generated MDPs reveal that SLP outperforms the other two both in runtime and objective values. Further insights into SLP’s performance are derived via numerical experiments on inverse inventory control, equipment replacement, and multi-armed bandit problems.

Date: 2023
References: Add references at CitEc
Citations:

Downloads: (external link)
http://hdl.handle.net/10.1080/24725854.2022.2103755 (text/html)
Access to full text is restricted to subscribers.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:taf:uiiexx:v:55:y:2023:i:6:p:588-601

Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/uiie20

DOI: 10.1080/24725854.2022.2103755

Access Statistics for this article

IISE Transactions is currently edited by Jianjun Shi

More articles in IISE Transactions from Taylor & Francis Journals
Bibliographic data for series maintained by Chris Longhurst ().