Hidden Hamiltonian Cycle Recovery via Linear Programming
Vivek Bagaria,
Jian Ding,
David Tse,
Yihong Wu and
Jiaming Xu
Additional contact information
Vivek Bagaria: Department of Electrical Engineering, Stanford University, Stanford, California 94305
Jian Ding: Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania 19104
David Tse: Department of Electrical Engineering, Stanford University, Stanford, California 94305
Yihong Wu: Department of Statistics and Data Science, Yale University, New Haven, Connecticut 06511
Jiaming Xu: Fuqua School of Business, Duke University, Durham, North Carolina 27708
Operations Research, 2020, vol. 68, issue 1, 53-70
Abstract:
We introduce the problem of hidden Hamiltonian cycle recovery, where there is an unknown Hamiltonian cycle in an $n$-vertex complete graph that needs to be inferred from noisy edge measurements. The measurements are independent and distributed according to $P_n$ for edges in the cycle and $Q_n$ otherwise. This formulation is motivated by a problem in genome assembly, where the goal is to order a set of contigs (genome subsequences) according to their positions on the genome using long-range linking measurements between the contigs. Computing the maximum likelihood estimate in this model reduces to a traveling salesman problem (TSP). Despite the NP-hardness of TSP, we show that a simple linear programming (LP) relaxation, namely the fractional 2-factor (F2F) LP, recovers the hidden Hamiltonian cycle with high probability as $n \to \infty$ provided that $\alpha_n - \log n \to \infty$, where $\alpha_n \triangleq -2 \log \int \sqrt{dP_n \, dQ_n}$ is the Rényi divergence of order $\frac{1}{2}$. This condition is information-theoretically optimal in the sense that, under mild distributional assumptions, $\alpha_n \ge (1 + o(1)) \log n$ is necessary for any algorithm to succeed regardless of the computational cost. Departing from the usual proof techniques based on dual witness construction, the analysis relies on the combinatorial characterization (in particular, the half-integrality) of the extreme points of the F2F polytope. Represented as bicolored multigraphs, these extreme points are further decomposed into simpler “blossom-type” structures for the large deviation analysis and counting arguments. Evaluation of the algorithm on real data shows improvements over existing approaches.
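The F2F relaxation named in the abstract can be illustrated with a minimal sketch: maximize the total edge weight subject to every vertex having fractional degree 2 and every edge variable lying in [0, 1]. The code below is not the authors' implementation. The function names (f2f_lp, simulate, alpha_gaussian), the Gaussian choice P_n = N(mu, 1) and Q_n = N(0, 1), and the use of SciPy's linprog are all assumptions made for illustration. Under that Gaussian assumption the log-likelihood ratio is increasing and affine in the measurement, and the degree constraints fix the sum of the edge variables at n, so using the raw measurements as weights gives the same maximizer.

```python
import itertools

import numpy as np
from scipy.optimize import linprog


def f2f_lp(weights):
    """Solve max sum_e w_e x_e s.t. sum_{e ni v} x_e = 2 for all v, 0 <= x_e <= 1."""
    n = weights.shape[0]
    edges = list(itertools.combinations(range(n), 2))
    # Vertex-edge incidence matrix: entry (v, k) is 1 when edge k touches v.
    a_eq = np.zeros((n, len(edges)))
    for k, (i, j) in enumerate(edges):
        a_eq[i, k] = 1.0
        a_eq[j, k] = 1.0
    b_eq = np.full(n, 2.0)
    # linprog minimizes, so negate the weights to maximize.
    cost = -np.array([weights[i, j] for i, j in edges])
    res = linprog(cost, A_eq=a_eq, b_eq=b_eq, bounds=(0.0, 1.0), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return edges, res.x


def simulate(n, mu, rng):
    """Hypothetical Gaussian instance: N(mu, 1) on a hidden cycle, N(0, 1) elsewhere."""
    order = rng.permutation(n)
    hidden = {
        frozenset((int(order[k]), int(order[(k + 1) % n]))) for k in range(n)
    }
    noise = np.triu(rng.normal(size=(n, n)), 1)
    weights = noise + noise.T
    for edge in hidden:
        i, j = tuple(edge)
        weights[i, j] += mu
        weights[j, i] += mu
    return weights, hidden


def alpha_gaussian(mu):
    """alpha = -2 log int sqrt(dP dQ) for P = N(mu, 1), Q = N(0, 1), i.e. mu^2 / 4."""
    return mu ** 2 / 4.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, mu = 8, 20.0
    weights, hidden = simulate(n, mu, rng)
    edges, x = f2f_lp(weights)
    recovered = {frozenset(e) for e, v in zip(edges, x) if v > 0.5}
    print("alpha - log n:", alpha_gaussian(mu) - np.log(n))
    print("exact recovery:", recovered == hidden)
```

The signal strength here is deliberately far above the threshold so that the LP optimum is the hidden cycle itself. Closer to the threshold the solver may return fractional values, and the rounding rule `v > 0.5` is only an illustrative convention, not something taken from the paper.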
Keywords: traveling salesman problem; fractional 2-factor linear programming; polyhedral combinatorics; large deviations theory
Date: 2020
Downloads:
https://doi.org/10.1287/opre.2019.1886 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:68:y:2020:i:1:p:53-70