How to print output results in HDBSCAN

笑着哭i posted on 2019-12-11 18:26:45

Question


I have ASCII data and I need to cluster it using HDBSCAN. I got the labels, but I don't know how to print the cluster results, i.e. the unique clusters and the data points segregated by cluster.

Snippet:

import hdbscan
import numpy as np

datafile = "ascii.txt"

data = np.loadtxt(datafile, dtype = np.uint8)

clusterer = hdbscan.HDBSCAN(min_cluster_size = 20)

clusterer.fit(data)

print (np.unique(clusterer.labels_, return_counts = True))

Answer 1:


You can use Pandas to read the file and then print out the cluster labels along with the dataset you have as the input. Try something like:

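import hdbscan
import pandas as pd

df = pd.read_csv("ascii.txt")
# fit_predict expects 2-D input, so select the column (placeholder name) as a DataFrame
labels = hdbscan.HDBSCAN().fit_predict(df[["ColumnName"]])
df_pd = pd.DataFrame({"Datapoints": df["ColumnName"], "Cluster Labels": labels})
print(df_pd)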



Answer 2:


import hdbscan

import numpy as np

import pandas as pd

datafile = "ascii.txt"

data = np.loadtxt(datafile, dtype = np.uint8)

Modified_data=pd.DataFrame(data)

clusterer = hdbscan.HDBSCAN(min_cluster_size = 20)

clusterer.fit(Modified_data)

Modified_data['Clusters']=clusterer.labels_

Modified_data is now a pandas DataFrame with an extra column named "Clusters" that holds the cluster label assigned to each row. You can filter, group, or print this DataFrame as required, for example with print(Modified_data).
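
If the goal is to print the points segregated by cluster, one possible approach is to group the rows by their label. The sketch below is only an illustration: group_by_label and print_clusters are hypothetical helper names, not part of hdbscan, and the example points and labels are made-up stand-ins for data and clusterer.labels_. It relies on the fact that HDBSCAN marks noise points with the label -1.

```python
from collections import defaultdict


def group_by_label(points, labels):
    """Return {label: [points...]}, with labels in ascending order."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        # int() also accepts NumPy integer labels such as clusterer.labels_
        groups[int(label)].append(point)
    return dict(sorted(groups.items()))


def print_clusters(points, labels):
    """Print each cluster, its size, and its member points."""
    for label, members in group_by_label(points, labels).items():
        # HDBSCAN assigns -1 to points it treats as noise
        name = "noise" if label == -1 else f"cluster {label}"
        print(f"{name}: {len(members)} points")
        for member in members:
            print("   ", member)


# Made-up stand-in values; in practice pass `data` and `clusterer.labels_`.
example_points = [[1, 2], [1, 3], [9, 9], [8, 9], [50, 50]]
example_labels = [0, 0, 1, 1, -1]
print_clusters(example_points, example_labels)
```

With the code from the question, the call would be print_clusters(data, clusterer.labels_). The same grouping could also be done on the DataFrame with Modified_data.groupby('Clusters').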



Source: https://stackoverflow.com/questions/55609827/how-to-print-output-results-in-hdbscan
