Robust vertex enumeration for convex hulls in high dimensions
Pranjal Awasthi (),
Bahman Kalantari () and
Yikai Zhang ()
Additional contact information
Pranjal Awasthi: Rutgers University
Bahman Kalantari: Rutgers University
Yikai Zhang: Rutgers University
Annals of Operations Research, 2020, vol. 295, issue 1, No 3, 37-73
Abstract:
The problem of computing the vertices of the convex hull of a given input set $S = \{v_i \in \mathbb{R}^m : i = 1, \dots, n\}$ is a classic and fundamental problem, studied in the context of computational geometry, linear and convex programming, machine learning and more. In this article we present the All Vertex Triangle Algorithm (AVTA), a robust and efficient algorithm for this problem. On the one hand, without any assumptions, AVTA computes an approximation to the subset $\overline{S}$ of all $K$ vertices of the convex hull of $S$ so that the convex hull of the approximate subset of vertices is as close to $conv(S)$ as desired. On the other hand, assuming a known lower bound $\gamma$ on the ratio $\varGamma_*/R$, where $\varGamma_*$ is the minimum of the distances from each vertex to the convex hull of the remaining vertices and $R$ is the diameter of $S$, AVTA can recover all of $\overline{S}$. Furthermore, assuming that instead of $S$ the input is an $\varepsilon$-perturbation of $S$, $S_\varepsilon$, where $\Vert v_i - v^{\varepsilon}_i \Vert \le \varepsilon R$, AVTA can compute an approximation to $conv(\overline{S}_\varepsilon)$ to any prescribed accuracy. Also, given a lower bound on the ratio $\varSigma_*/R$, where $\varSigma_*$ is the minimum of the distances from each vertex to the convex hull of the remaining points of $S_\varepsilon$, AVTA can recover all of $\overline{S}_\varepsilon$.
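To make the vertex-enumeration problem concrete, here is a hedged pure-Python sketch of the classical 2-D baseline (gift wrapping / Jarvis march) on a toy set: the unit square plus interior points, of which only the four corners are vertices of the hull. AVTA itself targets high dimensions; this example, with all names illustrative and not from the paper, only shows what "the vertices of $conv(S)$" means.

```python
def cross(o, a, b):
    """Signed area of triangle (o, a, b); negative means b lies clockwise of o->a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_vertices(points):
    """Return the vertices of conv(points) in counter-clockwise order (Jarvis march)."""
    pts = sorted(set(points))          # dedupe; start from lowest-leftmost point
    if len(pts) <= 2:
        return pts
    start = pts[0]
    hull, p = [], start
    while True:
        hull.append(p)
        q = pts[0] if pts[0] != p else pts[1]
        for r in pts:
            if r == p:
                continue
            turn = cross(p, q, r)
            # pick the most clockwise candidate; break collinear ties by distance
            if turn < 0 or (turn == 0 and
                            (r[0] - p[0])**2 + (r[1] - p[1])**2 >
                            (q[0] - p[0])**2 + (q[1] - p[1])**2):
                q = r
        p = q
        if p == start:
            break
    return hull

# Unit square plus interior points: only the 4 corners are hull vertices.
S = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5), (0.25, 0.75)]
print(sorted(hull_vertices(S)))   # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

This brute-force march costs $O(nK)$ in the plane; the abstract's contribution is bounds of a similar flavor, $O(nK(m + \gamma^{-2}))$, that survive in arbitrary dimension $m$ and under perturbation.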
We show $\varSigma_* \ge \rho_* \varGamma_*/R$, where $\rho_*$ is the minimum distance between distinct pairs of points in $S$, and prove the following main results: (1) Given any $t \in (0,1)$, AVTA computes a subset $\overline{S}^t$ of $\overline{S}$ of cardinality $K^{(t)}$ in $O(n K^{(t)}(m + t^{-2}))$ operations so that for any $p \in conv(S)$ its Euclidean distance to $conv(\overline{S}^t)$ is at most $tR$. (2) Given $\gamma \le \gamma_* = \varGamma_*/R$, AVTA computes $\overline{S}$ in $O(nK(m + \gamma^{-2}))$ operations. (3) If $K$ is known, the complexity of AVTA is $O(nK(m + \gamma_*^{-2}) \log(\gamma_*^{-1}))$. Assuming that instead of $S$ its $\varepsilon$-perturbation $S_\varepsilon$ is given, we prove: (i) Given any $t \in (0,1)$, AVTA computes a subset $\overline{S}_\varepsilon^t \subset \overline{S}_\varepsilon$ of cardinality $K^{(t)}_\varepsilon$ in $O(n K^{(t)}_\varepsilon (m + t^{-2}))$ operations so that for any $p \in conv(S)$ its distance to $conv(\overline{S}_\varepsilon^t)$ is at most $(t + \varepsilon)R$. (ii) Given $\sigma \in [4\varepsilon, \sigma_* = \varSigma_*/R]$, AVTA computes $\overline{S}_\varepsilon$ in $O(nK_\varepsilon(m + \sigma^{-2}))$ operations, where $K \le K_\varepsilon \le n$.
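The complexity bounds above are driven by three geometric quantities: the diameter $R$, the minimum pairwise distance $\rho_*$, and the robustness parameter $\varGamma_*$ (each vertex's distance to the hull of the remaining vertices). A hedged stdlib sketch computes all three for the unit square, where the hull of the other three vertices is a triangle and the vertex-to-hull distance reduces to a minimum point-to-segment distance; the helper names are illustrative, not from the paper.

```python
# Robustness parameters for S = vertices of the unit square:
# R = sqrt(2) (diameter), rho_* = 1 (min pairwise distance),
# Gamma_* = sqrt(2)/2 (distance from each corner to the opposite diagonal),
# so the lower bound fed to AVTA is gamma_* = Gamma_*/R = 0.5.
from itertools import combinations
from math import dist

def point_segment_dist(p, a, b):
    """Euclidean distance from point p to segment ab."""
    ax, ay = a; bx, by = b; px, py = p
    dx, dy = bx - ax, by - ay
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))          # clamp projection onto the segment
    return dist(p, (ax + t * dx, ay + t * dy))

V = [(0, 0), (1, 0), (1, 1), (0, 1)]

R = max(dist(u, v) for u, v in combinations(V, 2))       # diameter of S
rho = min(dist(u, v) for u, v in combinations(V, 2))     # rho_*

def dist_to_hull_of_rest(i):
    """Distance from vertex i to conv(remaining vertices).

    The vertex lies outside that triangle, so the distance is attained
    on the triangle's boundary: the minimum over its edges."""
    rest = [v for j, v in enumerate(V) if j != i]
    return min(point_segment_dist(V[i], a, b) for a, b in combinations(rest, 2))

Gamma = min(dist_to_hull_of_rest(i) for i in range(len(V)))

print(R, rho, Gamma, Gamma / R)
```

On this example $\gamma_* = 0.5$, so the $O(nK(m + \gamma^{-2}))$ bound is benign; flat or nearly-degenerate hulls drive $\gamma_*$ toward zero and make the $\gamma^{-2}$ term dominate.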
(iii) If $\gamma \le \gamma_* = \varGamma_*/R$ is known and satisfies $4\varepsilon \le \gamma \rho_*/R$, AVTA computes $\overline{S}_\varepsilon$ in $O(nK_\varepsilon(m + (\gamma \rho_*)^{-2}))$ operations. (iv) Given $\sigma \in [4\varepsilon, \sigma_*]$, if $K$ is known, AVTA computes $\overline{S}_\varepsilon$ in $O(nK(m + \sigma_*^{-2}) \log(\sigma_*^{-1}))$ operations. We also consider the application of AVTA to the recovery of vertices through the projection of $S$ or $S_\varepsilon$ under a Johnson–Lindenstrauss randomized linear projection $L : \mathbb{R}^{m} \rightarrow \mathbb{R}^{m'}$. Denoting $U = L(S)$ and $U_\varepsilon = L(S_\varepsilon)$, by relating the robustness parameters of $conv(U)$ and $conv(U_\varepsilon)$ to those of $conv(S)$ and $conv(S_\varepsilon)$, we derive analogous complexity bounds for probabilistic computation of the vertex set of $conv(U)$ or of $conv(U_\varepsilon)$, or an approximation to them. Finally, we apply AVTA to design new practical algorithms for two popular machine learning problems: topic modeling and non-negative matrix factorization. For topic models, our new algorithm leads to significantly better reconstruction of the topic-word matrix than the state-of-the-art approaches of Arora et al. (International conference on machine learning, pp 280–288, 2013) and Bansal et al. (Advances in neural information processing systems, pp 1997–2005, 2014). Additionally, we provide a robust analysis of AVTA and empirically demonstrate that it can handle larger amounts of noise than existing methods. For non-negative matrix factorization, we show that AVTA is competitive with existing methods specialized for this task in Arora et al. (Proceedings of the forty-fourth annual ACM symposium on theory of computing, ACM, pp 145–162, 2012a). We also contrast AVTA with the Greedy Clustering coreset algorithm of Blum et al. (Proceedings of the twenty-seventh annual ACM-SIAM symposium on discrete algorithms, Society for Industrial and Applied Mathematics, pp 548–557, 2016) for computing an approximation to the set of vertices, and argue that not only are there regimes where AVTA outperforms that algorithm, but it can also be used as a pre-processing step to it. Thus the two algorithms in fact complement each other.
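The projection step the abstract describes replaces $S \subset \mathbb{R}^m$ with $U = L(S) \subset \mathbb{R}^{m'}$ before running AVTA. A minimal stdlib sketch of one standard Johnson–Lindenstrauss choice, i.i.d. Gaussian entries scaled by $1/\sqrt{m'}$, checks that pairwise distances (and hence, with care, the robustness parameters the complexity bounds depend on) survive up to a modest multiplicative distortion; the dimensions and seed below are illustrative, not taken from the paper.

```python
# Johnson-Lindenstrauss sketch: project n points from R^m to R^m' with a
# scaled Gaussian matrix and measure the distortion of pairwise distances.
import random
from math import dist, sqrt

random.seed(0)
m, m_proj, n = 500, 160, 15

# Random input set S in R^m.
S = [[random.gauss(0, 1) for _ in range(m)] for _ in range(n)]

# L as an m' x m matrix of i.i.d. N(0, 1) entries; L(v) = Lv / sqrt(m').
L = [[random.gauss(0, 1) for _ in range(m)] for _ in range(m_proj)]

def project(v):
    return [sum(row[j] * v[j] for j in range(m)) / sqrt(m_proj) for row in L]

U = [project(v) for v in S]

# Ratio of projected to original distance for every pair: all close to 1.
ratios = [dist(U[i], U[j]) / dist(S[i], S[j])
          for i in range(n) for j in range(i + 1, n)]
print(min(ratios), max(ratios))
```

Because $m'$ only needs to grow like $\log n$ (up to the distortion tolerance), running AVTA on $conv(U)$ instead of $conv(S)$ trades a small probabilistic loss in the robustness parameters for a much cheaper per-operation cost when $m$ is large.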
Keywords: Convex hull membership; Approximation algorithms; Machine learning; Linear programming; Random projections
Date: 2020
Citations: View citations in EconPapers (1)
Downloads: (external link)
http://link.springer.com/10.1007/s10479-020-03557-0 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:annopr:v:295:y:2020:i:1:d:10.1007_s10479-020-03557-0
Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10479
DOI: 10.1007/s10479-020-03557-0
Annals of Operations Research is currently edited by Endre Boros