Estimating the Number of Clusters via Normalized Cluster Instability
| Authors | |
|---|---|
| Publication date | 12-2020 |
| Journal | Computational Statistics |
| Volume | 35 |
| Issue number | 4 |
| Pages (from-to) | 1879–1894 |
| Number of pages | 16 |
| Organisations | |
| Abstract | We improve instability-based methods for the selection of the number of clusters k in cluster analysis by developing a corrected clustering distance that corrects for the unwanted influence of the distribution of cluster sizes on cluster instability. We show that our corrected instability measure outperforms current instability-based measures across the whole sequence of possible k, overcoming limitations of current instability-based methods for large k. We also compare, for the first time, model-based and model-free approaches to determining cluster instability and find their performance to be comparable. We make our method available in the R package cstab. |
| Document type | Article |
| Language | English |
| Published at | https://doi.org/10.1007/s00180-020-00981-5 |
| Downloads | Haslbeck-Wulff2020_Article_EstimatingTheNumberOfClustersV (Final published version) |