One-shot Federated K-means Clustering based on Density Cores
  • Yizhang Wang,
  • Wei Pang,
  • Di Wang,
  • Witold Pedrycz
Yizhang Wang

Corresponding Author: [email protected]

Abstract

Federated clustering (FC) is an emerging and important topic that aims to cluster all the data held by many heterogeneous clients/devices while prohibiting clients from sharing raw data. However, existing works have two problems: (1) Federated learning performs well in independent and identically distributed (IID) scenarios, but in Non-IID scenarios (non-identical class distributions) it is hard to collaboratively train a clustering algorithm based on a global similarity measure while keeping all data local. (2) Some federated clustering algorithms perform well, but their communication costs are high, so it is difficult to balance communication costs and clustering effectiveness. In this paper, we propose a new federated K-means clustering framework to solve these two problems and to balance communication costs and clustering effectiveness. (1) On the clients, we use the cluster centers (representative points) generated by K-means to represent the corresponding clusters, because these representative points form the density backbone of the clusters and can effectively preserve the structure of the local data. (2) On the server, we propose two methods to reprocess the uploaded encrypted representative points to obtain better final cluster centers: one uses K-means, and the other takes the improved density peaks (density cores) as the final centers. The server then sends the final centers back to the clients. Finally, each client assigns its local data to the nearest centers. The experimental results show that the proposed method performs better than some centralized (non-federated) classical clustering algorithms (K-means, DBSCAN, and density peak clustering) and state-of-the-art (SOTA) centralized clustering algorithms in most cases. In particular, the proposed algorithms perform better than the SOTA federated clustering frameworks k-FED (ICML 2021) and MUFC (ICLR 2023).
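
The one-shot workflow summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it covers only the variant in which the server reruns K-means on the uploaded representative points, it omits the encryption step and the density-core variant, and all names (`kmeans`, `one_shot_federated_kmeans`, `k_local`, `k_global`, `blob`), the farthest-first initialization, and the synthetic Non-IID data are assumptions made purely for illustration.

```python
import math
import random


def nearest(point, centers):
    """Index of the center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def kmeans(points, k, iters=50):
    """Plain Lloyd's K-means with deterministic farthest-first seeding."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Recompute each center as the mean of its group (keep it if empty).
        new_centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def one_shot_federated_kmeans(client_data, k_local, k_global):
    """Single communication round: clients upload centers, server re-clusters."""
    # Client side: summarize local data by K-means representative points.
    # (Encryption of the uploaded points is omitted in this sketch.)
    uploaded = []
    for points in client_data:
        uploaded.extend(kmeans(points, min(k_local, len(points))))
    # Server side: cluster the pooled representative points once.
    global_centers = kmeans(uploaded, k_global)
    # Client side: assign each local point to its nearest global center.
    labels = [[nearest(p, global_centers) for p in points] for points in client_data]
    return global_centers, labels


def blob(rng, cx, cy, n=30, spread=0.3):
    """Hypothetical helper: n Gaussian points around (cx, cy)."""
    return [(rng.gauss(cx, spread), rng.gauss(cy, spread)) for _ in range(n)]


# Synthetic Non-IID split: neither client sees all three clusters.
rng = random.Random(0)
clients = [
    blob(rng, 0, 0) + blob(rng, 5, 5),    # client 0 holds clusters A and B
    blob(rng, 5, 5) + blob(rng, 10, 0),   # client 1 holds clusters B and C
]
centers, labels = one_shot_federated_kmeans(clients, k_local=4, k_global=3)
```

In this sketch each client uploads at most `k_local` points in a single round, so the communication cost does not grow with the local data size, and raw data never leaves the client.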
16 Jan 2024: Submitted to TechRxiv
26 Jan 2024: Published in TechRxiv