TY - JOUR
T1 - Density peaks clustering based on Gaussian fuzzy neighborhood with noise parameter
AU - Waqas, Syed Muhammad
AU - Khan, Sumra
AU - Talpur, Kashif
AU - Khan, Rizwan Ahmed
PY - 2024/7/14
Y1 - 2024/7/14
N2 - Density peak clustering (DPC) is an effective clustering method known for its robustness, non-iterative nature, and hybrid approach. However, it is not without limitations: (a) the determination of the cutoff distance (dc) relies on human experience, which can significantly impact the clustering outcome; (b) DPC does not take into account the local structure of the data when computing local densities; (c) it employs a crisp kernel for density computation; (d) the performance of DPC is affected by chain reactions; and (e) DPC often struggles to handle noisy data. To address these limitations, this paper proposes a novel approach called DPC based on Gaussian fuzzy neighborhood with noise parameter (DPC-GFNN). The proposed method leverages a Gaussian fuzzy kernel to enhance the separation between clusters and mitigates the influence of outliers through an adjustable noise parameter (λ). DPC-GFNN utilizes a k-nearest neighbor graph based on density to label highly dense regions. This technique effectively avoids chain reactions by assigning accurate labels to points in border areas, enabling proper clustering of data with diverse shapes and densities. To evaluate the effectiveness of DPC-GFNN, a series of experiments is conducted on both real-world and synthetic datasets. The experimental results demonstrate that DPC-GFNN exhibits superior robustness and clustering accuracy compared to other modified variants of DPC, including DPC based on k-nearest neighbors (DPC-KNN), improved DPC (IDPC), DPC based on density backbone and fuzzy neighborhood (DPC-DBFN), and DPC based on fuzzy weighted k-nearest neighbors (FKNN).
AB - Density peak clustering (DPC) is an effective clustering method known for its robustness, non-iterative nature, and hybrid approach. However, it is not without limitations: (a) the determination of the cutoff distance (dc) relies on human experience, which can significantly impact the clustering outcome; (b) DPC does not take into account the local structure of the data when computing local densities; (c) it employs a crisp kernel for density computation; (d) the performance of DPC is affected by chain reactions; and (e) DPC often struggles to handle noisy data. To address these limitations, this paper proposes a novel approach called DPC based on Gaussian fuzzy neighborhood with noise parameter (DPC-GFNN). The proposed method leverages a Gaussian fuzzy kernel to enhance the separation between clusters and mitigates the influence of outliers through an adjustable noise parameter (λ). DPC-GFNN utilizes a k-nearest neighbor graph based on density to label highly dense regions. This technique effectively avoids chain reactions by assigning accurate labels to points in border areas, enabling proper clustering of data with diverse shapes and densities. To evaluate the effectiveness of DPC-GFNN, a series of experiments is conducted on both real-world and synthetic datasets. The experimental results demonstrate that DPC-GFNN exhibits superior robustness and clustering accuracy compared to other modified variants of DPC, including DPC based on k-nearest neighbors (DPC-KNN), improved DPC (IDPC), DPC based on density backbone and fuzzy neighborhood (DPC-DBFN), and DPC based on fuzzy weighted k-nearest neighbors (FKNN).
U2 - 10.1016/j.eswa.2024.124782
DO - 10.1016/j.eswa.2024.124782
M3 - Article
SN - 0957-4174
VL - 255
SP - 124782
JO - Expert Systems with Applications
JF - Expert Systems with Applications
ER -