How can I find the mean distance from the centroid to all the data points in each cluster? I am able to find the Euclidean distance of each point (in my dataset) from the ce
alphaleonis gave a nice answer. For the general case of n dimensions, here are the changes needed to his answer:
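import numpy as np

def k_mean_distance(data, centroid_matrix, i_centroid, cluster_labels):
    # Euclidean distance from each data point assigned to cluster i_centroid
    # to that cluster's centroid (row i_centroid of centroid_matrix)
    distances = [np.linalg.norm(x - centroid_matrix[i_centroid]) for x in data[cluster_labels == i_centroid]]
    # Return the mean value
    return np.mean(distances)

# Assumed shapes: emb_matrix is (n_samples, n_features), centroid_matrix is
# (n_clusters, n_features), kmeans_clusters is a NumPy array of per-sample labels
c_mean_distances = []
for i in range(len(centroid_matrix)):
    mean_distance = k_mean_distance(emb_matrix, centroid_matrix, i, kmeans_clusters)
    c_mean_distances.append(mean_distance)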
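
If the per-cluster loop gets slow on a large dataset, the same quantity can probably be computed without a Python-level loop. This is only a minimal sketch, assuming the labels are integer cluster indices 0..k-1 (which is what scikit-learn's KMeans returns, as far as I know). The function name `mean_distances_per_cluster` and the toy data are made up for illustration and are not from the answer above.

```python
import numpy as np

def mean_distances_per_cluster(data, centroids, labels):
    """Mean Euclidean distance from each centroid to the points assigned to it."""
    data = np.asarray(data, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    labels = np.asarray(labels)  # assumed: non-negative integer cluster indices
    # Distance of every point to its own cluster's centroid (works for any n dimensions)
    dists = np.linalg.norm(data - centroids[labels], axis=1)
    # Per-cluster sum of distances and per-cluster point count
    sums = np.bincount(labels, weights=dists, minlength=len(centroids))
    counts = np.bincount(labels, minlength=len(centroids))
    # Clusters with no points get NaN instead of a divide-by-zero warning
    return np.divide(sums, counts, out=np.full(len(centroids), np.nan), where=counts > 0)

# Toy data (hypothetical): two clusters in 3 dimensions
data = np.array([[0, 0, 0], [2, 0, 0], [10, 10, 10], [10, 10, 14]])
centroids = np.array([[1, 0, 0], [10, 10, 12]])
labels = np.array([0, 0, 1, 1])

result = mean_distances_per_cluster(data, centroids, labels)
print(result)
```

With the names used in the answer above, the call would be `mean_distances_per_cluster(emb_matrix, centroid_matrix, kmeans_clusters)`, and it should return the same values as `c_mean_distances`, just as a NumPy array.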