Binary Matrix Factorization and Completion via Integer Programming
Oktay Günlük (),
Raphael Andreas Hauser () and
Réka Ágnes Kovács ()
Additional contact information
Oktay Günlük: Cornell University, Ithaca, New York 14850
Raphael Andreas Hauser: University of Oxford, Oxford OX1 3AZ, United Kingdom; The Alan Turing Institute, London NW1 2DB, United Kingdom
Réka Ágnes Kovács: University of Oxford, Oxford OX1 3AZ, United Kingdom; The Alan Turing Institute, London NW1 2DB, United Kingdom
Mathematics of Operations Research, 2024, vol. 49, issue 2, 1278-1302
Abstract:
Binary matrix factorization is an essential tool for identifying discrete patterns in binary data. In this paper, we consider the rank- k binary matrix factorization problem ( k -BMF) under Boolean arithmetic: we are given an n × m binary matrix X with possibly missing entries and need to find two binary matrices A and B of dimension n × k and k × m , respectively, which minimize the distance between X and the Boolean product of A and B in the squared Frobenius distance. We present a compact and two exponential size integer programs (IPs) for k -BMF and show that the compact IP has a weak linear programming (LP) relaxation, whereas the exponential size IPs have a stronger equivalent LP relaxation. We introduce a new objective function, which differs from the traditional squared Frobenius objective in attributing a weight to zero entries of the input matrix that is proportional to the number of times the zero is erroneously covered in a rank- k factorization. For one of the exponential size Ips, we describe a computational approach based on column generation. Experimental results on synthetic and real-world data sets suggest that our integer programming approach is competitive against available methods for k -BMF and provides accurate low-error factorizations.
Keywords: Primary: 90C10; binary matrix factorization; binary matrix completion; column generation; integer programming (search for similar items in EconPapers)
Date: 2024
References: Add references at CitEc
Citations:
Downloads: (external link)
http://dx.doi.org/10.1287/moor.2023.1386 (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:inm:ormoor:v:49:y:2024:i:2:p:1278-1302
Access Statistics for this article
More articles in Mathematics of Operations Research from INFORMS Contact information at EDIRC.
Bibliographic data for series maintained by Chris Asher ().