Communities in Streaming Graphs: Small Space Data Structure, Benchmark
Data Generation, and Linear Algorithm
Abstract
Identifying and preserving community structures in a streaming graph is
a challenging task, yet many applications require these communities to
be identified in very limited space and time. In
this paper, we design Community Sketch, a small space data structure
that efficiently preserves communities. On query, it provides
communities in constant time. Using the Community Sketch data
structure, we propose a linear streaming community detection algorithm.
Experimental results on large real-world networks show that our
algorithm outperforms other state-of-the-art algorithms in terms of
quality metrics (NMI, F1-score, and WCC). Further, we propose an
algorithm to generate benchmark networks, the Temporal Community
Benchmark Dataset (TCBD), which contain both true community labels and
temporal information of edges. These synthetic networks are used to
validate the proposed algorithm.