Estimating the Number of Clusters via Normalized Cluster Instability

Open Access
Authors
Publication date 12-2020
Journal Computational Statistics
Volume | Issue number 35 | 4
Pages (from-to) 1879–1894
Number of pages 16
Organisations
  • Faculty of Social and Behavioural Sciences (FMG) - Psychology Research Institute (PsyRes)
Abstract
We improve instability-based methods for selecting the number of clusters k in cluster analysis by developing a corrected clustering distance that removes the unwanted influence of the distribution of cluster sizes on cluster instability. We show that our corrected instability measure outperforms current instability-based measures across the whole sequence of possible k, overcoming limitations of current instability-based methods for large k. We also compare, for the first time, model-based and model-free approaches to determining cluster instability and find their performance to be comparable. We make our method available in the R package cstab.
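As a rough illustration of the idea in the abstract, the sketch below computes a pairwise clustering distance between two labelings of the same points and then rescales it by a baseline obtained from randomly shuffled labels with the same cluster sizes. This is only one plausible form of a size-corrected distance, written in Python for illustration; it is not the cstab API, and the function names (pair_distance, permutation_baseline, normalized_distance) and the permutation-based baseline are assumptions that may differ from the article's exact definition. In an instability-based method the two labelings would typically come from clustering two resampled data sets and comparing them on shared points, a step omitted here.

```python
import random
from itertools import combinations


def pair_distance(a, b):
    """Fraction of point pairs on which two labelings disagree
    about whether the pair belongs to the same cluster."""
    pairs = list(combinations(range(len(a)), 2))
    disagree = sum((a[i] == a[j]) != (b[i] == b[j]) for i, j in pairs)
    return disagree / len(pairs)


def permutation_baseline(a, b, n_perm=200, seed=0):
    """Mean distance after shuffling b (hypothetical baseline).

    Shuffling keeps both cluster-size distributions fixed, so the
    result reflects the distance expected from sizes alone."""
    rng = random.Random(seed)
    shuffled = list(b)
    total = 0.0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        total += pair_distance(a, shuffled)
    return total / n_perm


def normalized_distance(a, b, n_perm=200, seed=0):
    """Observed distance divided by the size-only baseline (assumed form)."""
    baseline = permutation_baseline(a, b, n_perm, seed)
    if baseline == 0:
        # Degenerate case, e.g. both labelings put all points in one cluster.
        return 0.0
    return pair_distance(a, b) / baseline


# Two labelings of six points that differ only in where point 2 is placed.
a = [0, 0, 0, 1, 1, 1]
b = [0, 0, 1, 1, 1, 1]
raw = pair_distance(a, b)
corrected = normalized_distance(a, b)
```

The raw distance depends only on co-membership, so relabeling clusters (swapping 0 and 1) leaves it unchanged. Dividing by the shuffled-label baseline puts distances obtained for different k on a more comparable scale, which is the kind of correction the abstract describes.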
Document type Article
Language English
Published at https://doi.org/10.1007/s00180-020-00981-5