Extracting clusters from seaborn clustermap

失恋的感觉 2020-12-23 14:45

I am using the seaborn clustermap to create clusters and visually it works great (this example produces very similar results).

However I am having troub

2 answers
  •  情话喂你
    2020-12-23 15:05

    You probably want a new column in your dataframe with the cluster membership. I've managed to do this from assembled snippets of code stolen from all over the web:

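    from collections import defaultdict

    import seaborn
    import scipy.cluster.hierarchy  # "import scipy" alone may not load this submodule

    g = seaborn.clustermap(df, method='average')
    # Use the row dendrogram so the leaves line up with df.index
    # (dendrogram_col would need labels=list(df.columns) instead).
    den = scipy.cluster.hierarchy.dendrogram(g.dendrogram_row.linkage,
                                             labels=list(df.index),
                                             color_threshold=0.60)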
    
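    def get_cluster_classes(den, label='ivl'):
        """Group leaf labels by the color of the dendrogram link they hang from."""
        cluster_idxs = defaultdict(list)
        for c, pi in zip(den['color_list'], den['icoord']):
            # pi = [x1, x1, x2, x2]; pi[1:3] is the x-position of each leg.
            for leg in pi[1:3]:
                # Leaves are drawn at x = 5, 15, 25, ..., so a leg that ends
                # on a leaf gives a whole number here.
                i = (leg - 5.0) / 10.0
                if abs(i - int(i)) < 1e-5:
                    cluster_idxs[c].append(int(i))
    
        # Translate leaf positions into leaf labels (den['ivl'] by default).
        cluster_classes = {}
        for c, l in cluster_idxs.items():
            cluster_classes[c] = [den[label][i] for i in l]
    
        return cluster_classes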
    
    clusters = get_cluster_classes(den)
    
    # Invert the mapping: leaf label -> cluster color.
    label_to_cluster = {}
    for color, members in clusters.items():
        for member in members:
            label_to_cluster[member] = color
    
    # Rows that were not matched to any cluster get None.
    df["cluster"] = [label_to_cluster.get(i) for i in df.index]
    

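    This gives you a column holding the dendrogram color of each row's cluster: 'g' or 'r' for the green- or red-labeled clusters (newer SciPy versions use matplotlib color names such as 'C1' and 'C2' instead), and None for any row that was not matched. I choose my color_threshold by plotting the dendrogram and eyeballing the y-axis values.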
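If you only need the cluster membership and not the colors, cutting the tree at the same height with scipy's fcluster may be simpler. The following is a minimal sketch, not the method used above: the toy array and the 0.60 threshold are made-up values, and the linkage is computed directly with scipy so the example runs without seaborn. With a clustermap object g, passing g.dendrogram_row.linkage in place of Z should give the labels for the rows of the plotted data.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Toy data (hypothetical values): two well-separated groups of rows.
data = np.array([
    [0.00, 0.10],
    [0.10, 0.00],
    [0.05, 0.05],
    [5.00, 5.10],
    [5.10, 5.00],
])

# Average linkage on Euclidean distances, matching the method='average'
# call in the answer above.
Z = linkage(data, method="average", metric="euclidean")

# Cut the tree at a distance threshold; rows merged at or below that
# height receive the same integer label.
threshold = 0.60
labels = fcluster(Z, t=threshold, criterion="distance")

# One label per row, in the original row order.
print(labels)
```

Because fcluster returns one integer per row in the original row order, the result can be assigned straight to a column, for example df["cluster"] = labels, assuming the linkage was built from the rows of df. The groups should roughly match the colored branches of a dendrogram drawn with the same color_threshold, although rows that merge only above the threshold each get their own label here.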
