Minimized-cost cube query on heterogeneous information networks
Dan Yin (),
Hong Gao (),
Zhaonian Zou () and
Jianzhong Li ()
Additional contact information
Dan Yin: Harbin Institute of Technology
Hong Gao: Harbin Institute of Technology
Zhaonian Zou: Harbin Institute of Technology
Jianzhong Li: Harbin Institute of Technology
Journal of Combinatorial Optimization, 2017, vol. 33, issue 1, No 25, 339-364
Abstract:
Abstract Data cube is the foundation of on-line analytical processing (OLAP), which can provide users with data views from different perspectives and granularities. Heterogeneous information networks consist of multiple types of nodes and edges which represent different semantic relations. With the rapid development of social networks and knowledge graphs, heterogeneous information networks have become increasingly popular. In heterogeneous information networks, cube is the set of aggregate graphs and cube query is required for supporting OLAP. The existing research mainly studies aggregate graph query on homogeneous networks, but only considers the attributes of nodes. To overcome these challenges, this paper investigates cube query problem on heterogeneous information networks. (1) A novel cube model for heterogeneous information networks is proposed, which captures both the attribute and structure semantics. (2) Because the total number of aggregate graphs is huge, computing and storing them cost plenty of time and storage. The problem of partial cube materialization on heterogeneous information networks is investigated. Given a fixed size of memory space, select a subset of aggregate graphs in cube, to minimize the computing cost of the whole cube. This optimization problem is proved to be NP-complete and there is no $$n^{1-{\epsilon }}$$ n 1 - ϵ approximation algorithm unless P $$=$$ = NP. (3) A greedy algorithm is proposed for partial cube materialization based on two interesting dependencies between aggregate graphs, attribute dependence and path dependence. (4) Experiments on real world data sets show the cube definition is meaningful, and the partial cube materialization algorithm is efficient.
Keywords: Heterogeneous information network; OLAP; Aggregate graph; Cube; Materialization (search for similar items in EconPapers)
Date: 2017
References: View complete reference list from CitEc
Citations:
Downloads: (external link)
http://link.springer.com/10.1007/s10878-015-9967-6 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:spr:jcomop:v:33:y:2017:i:1:d:10.1007_s10878-015-9967-6
Ordering information: This journal article can be ordered from
https://www.springer.com/journal/10878
DOI: 10.1007/s10878-015-9967-6
Access Statistics for this article
Journal of Combinatorial Optimization is currently edited by Thai, My T.
More articles in Journal of Combinatorial Optimization from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().