Is the triangle inequality necessary for k-means?
I wonder if the triangle inequality is necessary for the distance measure used in k-means.

k-means is designed for Euclidean distance, which happens to satisfy the triangle inequality. Using other distance functions is risky, as the algorithm may stop converging. The reason, however, is not the triangle inequality, but that the mean might not minimize the chosen distance function. (The arithmetic mean minimizes the sum of squares, not arbitrary distances!) There are faster methods for k-means that exploit the triangle inequality to avoid recomputations. But if you stick to classic MacQueen or Lloyd k-means, then you do not need the triangle inequality.
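
To make the point about the mean concrete, here is a minimal sketch in plain Python. The one-dimensional data and the helper names `sum_of_squares` and `sum_of_abs` are made up for illustration and are not part of any k-means library. It compares the mean and the median as candidate cluster centers under two costs: squared Euclidean distance and absolute (Manhattan) distance.

```python
from statistics import mean, median

def sum_of_squares(points, center):
    # Cost that k-means actually minimizes: sum of squared deviations.
    return sum((p - center) ** 2 for p in points)

def sum_of_abs(points, center):
    # Cost under absolute (Manhattan) distance in one dimension.
    return sum(abs(p - center) for p in points)

# Hypothetical one-dimensional cluster with one far-away point.
points = [0.0, 1.0, 2.0, 10.0]
m = mean(points)      # 3.25
med = median(points)  # 1.5

print(sum_of_squares(points, m), sum_of_squares(points, med))  # 62.75 75.0
print(sum_of_abs(points, m), sum_of_abs(points, med))          # 13.5 11.0
```

The mean gives the lower sum of squares, but the median gives the lower sum of absolute distances. So if you plug absolute distance into k-means and keep updating centers with the mean, the update step no longer minimizes the cost it is supposed to minimize, and that is where the convergence argument breaks down. The triangle inequality plays no role in either calculation.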