How To Explain Clustering Results?
Say I have a high-dimensional dataset which I assume to be well separable by some kind of clustering algorithm. I run the algorithm and end up with my clusters. Is there any sort of way to explain or interpret what those resulting clusters actually mean?
Solution 1:
Have you tried using PCA or some other dimensionality reduction technique and checking whether the clusters still hold? Sometimes relationships still exist in lower dimensions (caveat: it doesn't always help one's understanding of the data). There's a good article about visualizing MNIST data: http://colah.github.io/posts/2014-10-Visualizing-MNIST/. I hope this helps a bit.
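A minimal sketch of that check, assuming a scikit-learn-style workflow, a NumPy feature matrix `X`, and precomputed cluster `labels` (all placeholder names): project the data onto two principal components and color the points by cluster to see whether the separation survives the reduction.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def plot_clusters_2d(X, labels):
    """Project X onto two principal components and scatter-plot by cluster."""
    X_scaled = StandardScaler().fit_transform(X)       # PCA is scale-sensitive
    X_2d = PCA(n_components=2).fit_transform(X_scaled)
    for k in np.unique(labels):
        mask = labels == k
        plt.scatter(X_2d[mask, 0], X_2d[mask, 1], s=10, label=f"cluster {k}")
    plt.xlabel("PC 1")
    plt.ylabel("PC 2")
    plt.legend()
    plt.show()
```

If the clusters overlap heavily in this 2D view, that doesn't mean they are meaningless; it may just mean two components capture too little of the variance, which is the caveat above.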
Solution 2:
Do not treat the clustering algorithm as a black box.
Yes, k-means uses centroids. But most algorithms for high-dimensional data don't (and don't use k-means!). Instead, they will often select some features, projections, subspaces, manifolds, etc. So look at what information the actual clustering algorithm provides!
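As a concrete illustration of "use what the algorithm provides", here is a minimal sketch assuming scikit-learn's k-means (the `feature_names` list and matrix `X` are placeholders): read the fitted centroids and rank features by how much they vary across clusters. For other algorithms, inspect whatever fitted attributes they expose instead, such as selected subspaces or exemplar points.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def describe_kmeans_clusters(X, feature_names, n_clusters=3):
    """Fit k-means and print the features that most distinguish the clusters."""
    X_scaled = StandardScaler().fit_transform(X)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X_scaled)
    centers = km.cluster_centers_                  # shape: (n_clusters, n_features)
    # Features whose centroid values spread the most across clusters are the
    # ones that separate the clusters in this standardized space.
    spread = centers.max(axis=0) - centers.min(axis=0)
    for idx in np.argsort(spread)[::-1][:10]:
        print(f"{feature_names[idx]}: centroid values {np.round(centers[:, idx], 2)}")
    return km
```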