On the Reliability of Clustering Stability in the Large Sample Regime
Ohad Shamir and Naftali Tishby
Poster M25

Clustering Stability
· Used for model selection in clustering.
· For the "correct" model, different random samples should lead to similar clusterings.
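
The stability idea above can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not the estimator analyzed in the poster: it uses plain 1D k-means with a quantile-based initialization, and it measures instability as the average fraction of fresh points on which two clusterings, fitted on two independent samples, disagree (after matching labels). The names `kmeans_1d`, `assign`, `disagreement`, `instability`, and `three_gaussians` are hypothetical helpers introduced here.

```python
import itertools
import random


def assign(x, centers):
    """Index of the center nearest to x."""
    return min(range(len(centers)), key=lambda j: abs(x - centers[j]))


def kmeans_1d(points, k, iters=50):
    """Plain Lloyd iterations in 1D; returns the k centers in sorted order."""
    pts = sorted(points)
    # Assumption: deterministic start at evenly spaced sample quantiles.
    centers = [pts[(2 * j + 1) * len(pts) // (2 * k)] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in pts:
            groups[assign(x, centers)].append(x)
        # An empty cluster keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return sorted(centers)


def disagreement(centers_a, centers_b, eval_points):
    """Fraction of eval_points labeled differently, minimized over label permutations."""
    k = len(centers_a)
    labels_a = [assign(x, centers_a) for x in eval_points]
    labels_b = [assign(x, centers_b) for x in eval_points]
    best = 1.0
    for perm in itertools.permutations(range(k)):
        diff = sum(perm[b] != a for a, b in zip(labels_a, labels_b))
        best = min(best, diff / len(eval_points))
    return best


def instability(sampler, k, n, trials=20, seed=0):
    """Average disagreement between clusterings of two independent samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample_a = [sampler(rng) for _ in range(n)]
        sample_b = [sampler(rng) for _ in range(n)]
        eval_points = [sampler(rng) for _ in range(n)]
        total += disagreement(kmeans_1d(sample_a, k), kmeans_1d(sample_b, k), eval_points)
    return total / trials


def three_gaussians(rng):
    """Illustrative source: equal mixture of three unit-variance Gaussians in 1D."""
    return rng.gauss(rng.choice([-5.0, 0.0, 5.0]), 1.0)
```

A typical use would be to call `instability(three_gaussians, k, n)` for several candidate values of `k` and prefer the value with the lowest score. The mixture parameters, the number of trials, and the disagreement measure are all illustrative choices.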

Problem
· Any model becomes stable for large enough samples!
· Do these methods become meaningless when the sample size is large enough?
· NIPS 2007: NO, for 3 Gaussians in 1D with idealized k-means...

NIPS 2008: NO, for general distributions in , and for large families of real-world clustering algorithms. See Poster for details!