Abstract
Deep clustering incorporates embedding into clustering to find a
lower-dimensional space appropriate for clustering. Most existing
methods try to group similar data points by simultaneously minimizing
clustering and reconstruction losses with an autoencoder (AE). However,
they all ignore the useful information available in pairwise data
relationships. In this paper,
we propose a novel deep clustering framework with self-supervision using
pairwise data similarities (DCSS). The proposed method consists of two
successive phases. First, we propose a novel AE-based approach that aims
to aggregate similar data points near a common group center in the
latent space of an AE. The AE’s latent space is obtained by minimizing
weighted reconstruction and centering losses of data points, where the
weights are defined by the similarity between data points and group
centers. In the second phase, we use a fully connected network, MNet,
to map the AE’s latent space onto a K-dimensional space from which the
final cluster assignments are derived, where K is the number of
clusters. MNet is trained to strengthen the similarity of similar
samples and weaken that of dissimilar samples. Experimental results on
multiple benchmark
datasets demonstrate the effectiveness of DCSS for data clustering and
as a general framework for boosting state-of-the-art clustering
methods.
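
As an illustration of the two phases, the following is a minimal sketch
and not the exact DCSS formulation. It assumes that the first-phase
weights are a softmax over negative squared distances to the group
centers, and that the second-phase similarity of a pair is the inner
product of their K-dimensional MNet outputs. The function names
(soft_weights, phase1_loss, pairwise_loss) are hypothetical and chosen
only for this example.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def soft_weights(z, centers):
    """Assumed weighting: softmax over negative squared distances
    from a latent point z to each group center."""
    d = [sq_dist(z, c) for c in centers]
    m = min(d)  # shift by the minimum distance for numerical stability
    e = [math.exp(-(v - m)) for v in d]
    s = sum(e)
    return [v / s for v in e]


def phase1_loss(xs, x_hats, zs, centers):
    """Phase 1 sketch: weighted reconstruction + centering loss,
    averaged over samples. xs are inputs, x_hats their AE
    reconstructions, zs their latent representations."""
    total = 0.0
    for x, x_hat, z in zip(xs, x_hats, zs):
        w = soft_weights(z, centers)
        rec = sq_dist(x, x_hat)
        for w_k, c in zip(w, centers):
            total += w_k * (rec + sq_dist(z, c))
    return total / len(xs)


def pairwise_loss(q_i, q_j, similar):
    """Phase 2 sketch: raise the similarity of a similar pair and
    lower it for a dissimilar pair. q_i and q_j are assumed to be
    non-negative K-dimensional outputs that each sum to 1."""
    sim = sum(a * b for a, b in zip(q_i, q_j))
    return 1.0 - sim if similar else sim
```

In this sketch, a point close to one center receives almost all of its
weight from that center, so its centering term pulls it toward that
group; how pairs are labeled as similar or dissimilar is left outside
the example.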