How do I find which cluster my data belongs to using Python?

[亡魂溺海] submitted on 2019-12-02 16:09:34

Question


I ran PCA and then the K-means clustering algorithm on my data and got 3 clusters. I am trying to figure out which cluster each input row belongs to, so that I can gather some qualitative attributes about the inputs. Each input is a customer ID, and the variables I used for clustering are spend patterns on certain products.

Below is the code I ran for K-means. I am looking for suggestions on how to map the result back to the source data to see which cluster each input belongs to:

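import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# x_10d is the PCA-reduced data (one row per customer)
kmeans = KMeans(n_clusters=3)
X_clustered = kmeans.fit_predict(x_10d)  # one cluster label (0, 1 or 2) per row

LABEL_COLOR_MAP = {0: 'r', 1: 'g', 2: 'b'}
label_color = [LABEL_COLOR_MAP[l] for l in X_clustered]

# Plot the first and third components, colored by cluster
plt.figure(figsize=(7, 7))
plt.scatter(x_10d[:, 0], x_10d[:, 2], c=label_color, alpha=0.5)
plt.show()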

Thanks


Answer 1:


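If you want to add the cluster labels back to your DataFrame, and assuming x_10d is a pandas DataFrame, you can do the following (if x_10d is a NumPy array, as the slicing in the question suggests, assign the labels to the original DataFrame that holds the customer IDs instead):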

x_10d["cluster"] = X_clustered

This adds a new column named "cluster" to the DataFrame, holding the cluster label for each row.
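Because PCA and K-means both preserve row order, the labels line up positionally with the rows of the source data. The following is a minimal sketch of that mapping, not the asker's actual pipeline: the `customer_id` and `spend_*` columns, the synthetic data, and the choice of 3 PCA components are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Hypothetical stand-in for the source data: one row per customer.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((100, 5)),
                  columns=[f"spend_{i}" for i in range(5)])
df.insert(0, "customer_id", np.arange(1000, 1100))

# Cluster on the spend columns only, after reducing them with PCA.
features = df.drop(columns="customer_id")
x_reduced = PCA(n_components=3).fit_transform(features)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(x_reduced)

# Row i of x_reduced corresponds to row i of df, so the labels
# can be attached to the original DataFrame directly.
df["cluster"] = labels

print(df[["customer_id", "cluster"]].head())
```

Each customer ID now sits next to its cluster label, so the DataFrame can be filtered or joined against other customer attributes as usual.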




Answer 2:


To group instances by their assigned cluster ID:

N_CLUSTERS = 3
clusters = [x_10d[X_clustered == i] for i in range(N_CLUSTERS)]
# replace x_10d with any array in the same row order that you want to retrieve data from

# to have a look
for i, c in enumerate(clusters):
    print('Cluster {} has {} members: {}...'.format(i, len(c), c[0]))

# which prints, for example:
# Cluster 0 has 37 members: [0.95690664 0.07578273 0.0094432 ]...
# Cluster 1 has 30 members: [0.03124354 0.97932615 0.47270528]...
# Cluster 2 has 33 members: [0.26331688 0.5039502  0.72568873]...
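If the source data lives in a pandas DataFrame, the same grouping can be expressed with `groupby`, which also makes per-cluster summaries easy to compute. This is a rough sketch under assumed names: `customer_id`, `spend_a` and `spend_b` are hypothetical columns, and the data is synthetic.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical customer data; column names are assumptions.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "customer_id": np.arange(60),
    "spend_a": rng.random(60),
    "spend_b": rng.random(60),
})

spend_cols = ["spend_a", "spend_b"]
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(df[spend_cols])
labeled = df.assign(cluster=labels)

# Mean spend and member count per cluster.
summary = labeled.groupby("cluster")[spend_cols].agg(["mean", "count"])
print(summary)

# Customer IDs belonging to each cluster.
ids_by_cluster = labeled.groupby("cluster")["customer_id"].apply(list)
for cluster_id, ids in ids_by_cluster.items():
    print("Cluster {} has {} members, e.g. {}".format(cluster_id, len(ids), ids[:5]))
```

The summary table gives a quick view of how the clusters differ in spend, while `ids_by_cluster` answers the original question of which customers fall into which cluster.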


Source: https://stackoverflow.com/questions/50146570/how-do-i-find-which-cluster-my-data-belongs-to-using-python
