FlexGraph: Flexible partitioning and storage for scalable graph mining
Chiwan Park,
Ha-Myung Park and
U Kang
PLOS ONE, 2020, vol. 15, issue 1, 1-25
Abstract:
How can we analyze large graphs such as the Web, and social networks with hundreds of billions of vertices and edges? Although many graph mining systems have been proposed to perform various graph mining algorithms on such large graphs, they have difficulties in processing Web-scale graphs due to massive communication and I/O costs caused by communication between workers, and reading subgraphs repeatedly. In this paper, we propose FlexGraph, a scalable distributed graph mining method reducing the costs by exploiting properties of real-world graphs. FlexGraph significantly decreases the communication cost, which is the main bottleneck of distributed systems, by exploiting different edge placement policies based on types of vertices. Furthermore, we propose a flexible storage format to reduce I/O costs when reading input graph repeatedly. Experiments show that FlexGraph succeeds in processing up to 64× larger graphs than existing distributed memory-based graph mining methods, and consistently outperforms previous disk-based graph mining methods.
Date: 2020
References: View complete reference list from CitEc
Citations:
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0227032 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 27032&type=printable (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0227032
DOI: 10.1371/journal.pone.0227032
Access Statistics for this article
More articles in PLOS ONE from Public Library of Science
Bibliographic data for series maintained by plosone ().