Improving Variable Orderings of Approximate Decision Diagrams Using Reinforcement Learning

Quentin Cappart, David Bergman, Louis-Martin Rousseau, Isabeau Prémont-Schwarz and Augustin Parjadis
Additional contact information
Quentin Cappart: École Polytechnique de Montréal, Montreal H3T 1J4, Canada; ServiceNow Research (formerly Element AI), Montreal H2S 3G9, Canada
David Bergman: University of Connecticut, Stamford, Connecticut 06901
Louis-Martin Rousseau: École Polytechnique de Montréal, Montreal H3T 1J4, Canada
Isabeau Prémont-Schwarz: ServiceNow Research (formerly Element AI), Montreal H2S 3G9, Canada
Augustin Parjadis: École Polytechnique de Montréal, Montreal H3T 1J4, Canada

INFORMS Journal on Computing, 2022, vol. 34, issue 5, 2552-2570

Abstract: Prescriptive analytics provides organizations with scalable solutions for large-scale, automated decision making. At the core of prescriptive analytics methodology is optimization, a field devoted to the study of algorithms that solve complex decision-making problems. Optimization algorithms rely heavily on generic methods for identifying tight bounds, which provide both solutions to problems and optimality guarantees. In the last decade, decision diagrams (DDs) have demonstrated significant advantages in obtaining bounds compared with the standard linear relaxation commonly used by commercial solvers. However, the quality of the bounds computed by DDs depends heavily on the variable ordering chosen for the construction. Moreover, the problem of finding an ordering that optimizes a given metric is generally NP-hard. This paper studies how machine learning, specifically deep reinforcement learning (DRL), can be used to improve bounds provided by DDs, in particular through learning a good variable ordering. The introduced DRL models improve primal and dual bounds, even over standard linear programming relaxations, and are integrated into a full-fledged branch-and-bound algorithm. This paper, therefore, provides a novel mechanism for utilizing machine learning to tighten bounds, adding to recent research on using machine learning to obtain high-quality heuristic solutions and, for the first time, using machine learning to improve relaxation bounds through a generic bounding method. We apply the methods on a classic optimization problem, the maximum independent set, and demonstrate through computational testing that optimization bounds can be significantly improved through DRL. We provide the code to replicate the results obtained on the maximum independent set.

Summary of Contribution: This paper studies the use of reinforcement learning to compute a variable ordering of decision diagram-based approximations for discrete optimization problems. This is among the first works to propose the use of machine learning to improve upon generic bounding methods for discrete optimization problems, thereby establishing a critical bridge between optimization and learning.
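The central object of the paper, a relaxed decision diagram whose dual bound depends on the variable ordering, can be illustrated with a minimal sketch. This is not the authors' released code: the function name `relaxed_dd_bound`, the example graph, and the merge heuristic (fusing the two lowest-value states) are assumptions chosen for brevity. For the maximum independent set, a DD state can be taken as the set of still-eligible vertices, and merging states by union over-approximates the feasible set, so the longest-path value through the width-limited diagram remains a valid upper bound.

```python
def relaxed_dd_bound(adj, order, max_width):
    """Dual (upper) bound on the maximum independent set from a
    width-limited relaxed decision diagram.

    adj: dict mapping each vertex to the set of its neighbors.
    order: the variable (vertex) ordering used to build the DD.
    max_width: maximum number of states kept per layer.
    """
    # A DD node is identified by its state: the frozenset of vertices
    # still eligible for inclusion. Per state we keep the best
    # (longest) path value reaching it.
    layer = {frozenset(adj): 0}
    for v in order:
        nxt = {}
        for state, val in layer.items():
            # 0-arc: exclude v.
            s0 = state - {v}
            nxt[s0] = max(nxt.get(s0, -1), val)
            # 1-arc: include v, only if still eligible; v and its
            # neighbors then become ineligible.
            if v in state:
                s1 = state - {v} - adj[v]
                nxt[s1] = max(nxt.get(s1, -1), val + 1)
        # Relaxation: while the layer is too wide, merge the two
        # lowest-value states by taking the union of their eligible
        # sets -- an over-approximation, so the bound stays valid.
        while len(nxt) > max_width:
            (a, va), (b, vb) = sorted(nxt.items(), key=lambda kv: kv[1])[:2]
            del nxt[a], nxt[b]
            merged = a | b
            nxt[merged] = max(nxt.get(merged, -1), va, vb)
        layer = nxt
    return max(layer.values())

# A 5-vertex path 1-2-3-4-5; its maximum independent set is {1, 3, 5}.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
exact = relaxed_dd_bound(path, [1, 2, 3, 4, 5], max_width=32)  # -> 3 (exact)
loose = relaxed_dd_bound(path, [1, 2, 3, 4, 5], max_width=1)   # >= 3 (relaxed)
```

In the paper, the ordering passed in here is precisely what the DRL agent learns to choose: for a fixed width budget, different orderings yield different merge patterns and hence tighter or looser bounds.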

Keywords: optimization bounds; decision diagrams; deep reinforcement learning (search for similar items in EconPapers)
Date: 2022
Citations: 1 (in EconPapers)

Download: http://dx.doi.org/10.1287/ijoc.2022.1194 (application/pdf)



Persistent link: https://EconPapers.repec.org/RePEc:inm:orijoc:v:34:y:2022:i:5:p:2552-2570


