VDPC: Variational Density Peak Clustering Algorithm
preprint
posted on 2021-12-29, 07:10 authored by Yizhang Wang, Di Wang, You Zhou, Chai Quek, Xiaofeng Zhang
Clustering is an important unsupervised knowledge acquisition method that divides unlabeled data into different groups \cite{atilgan2021efficient,d2021automatic}. Different clustering algorithms make different assumptions about cluster formation; thus, most clustering algorithms handle at least one particular type of data distribution well but may not handle other types of distributions well. For example, K-means identifies convex clusters well \cite{bai2017fast}, and DBSCAN is able to find clusters with similar densities \cite{DBSCAN}.
Therefore, most clustering methods may not work well on data distribution patterns that differ from the assumptions they make, or on a mixture of different distribution patterns. DBSCAN, for example, is sensitive to loosely connected points between dense natural clusters, as illustrated in Figure~\ref{figconnect}. The density of the connected points shown in Figure~\ref{figconnect} differs from that of the natural clusters on both ends; however, DBSCAN with fixed global parameter values may wrongly assign these connected points and treat all the data points in Figure~\ref{figconnect} as one big cluster.
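The sensitivity to loosely connected points can be reproduced on a toy data set. The sketch below is illustrative only: it is not the data of Figure~\ref{figconnect} and not the authors' code. The `dbscan` function is a simplified from-scratch version of the standard algorithm, and the point layout, `eps` values, and `min_samples` are assumptions chosen so that the effect is easy to see.

```python
from math import dist  # Python 3.8+


def dbscan(points, eps, min_samples):
    """Simplified DBSCAN (illustrative). Returns one label per point; -1 is noise."""
    n = len(points)
    # eps-neighbourhood of every point; a point counts as its own neighbour.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or len(neighbors[i]) < min_samples:
            continue  # already assigned, or not a core point
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    # Only core points keep expanding the cluster.
                    if len(neighbors[q]) >= min_samples:
                        stack.append(q)
        cluster += 1
    return labels


def n_clusters(labels):
    """Number of clusters found, ignoring noise."""
    return len(set(labels) - {-1})


# Hypothetical toy data: two dense 11x11 grids (spacing 0.1) joined by a
# sparse bridge of 7 points (spacing 0.25).
grid = [k / 10 for k in range(11)]
left = [(x, y) for x in grid for y in grid]
right = [(x + 3.0, y) for x in grid for y in grid]
bridge = [(1.25 + 0.25 * k, 0.5) for k in range(7)]
points = left + right + bridge

# A global eps large enough to reach across the bridge merges everything.
loose = dbscan(points, eps=0.3, min_samples=3)
# A smaller global eps separates the grids but marks the bridge as noise.
tight = dbscan(points, eps=0.15, min_samples=3)

print(n_clusters(loose))                    # 1
print(n_clusters(tight), tight.count(-1))   # 2 7
```

With the larger `eps`, the bridge points become core points and chain the two grids into a single cluster, which is the failure described above. With the smaller `eps`, the grids are separated, but only because the bridge points are left unassigned as noise.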
History
Email Address of Submitting Author
wyzhang_new@sina.com
ORCID of Submitting Author
0000-0002-0687-7802
Submitting Author's Institution
Yangzhou University
Submitting Author's Country
China