问题
I have ASCII data and i need to cluster the data using HDBSCAN. I got the lables but i don't know how to print the output cluster results i.e unique and segregated results from hdbscan.
snippet:
import hdbscan
import numpy as np
datafile = "ascii.txt"
data = np.loadtxt(datafile, dtype = np.uint8)
clusterer = hdbscan.HDBSCAN(min_cluster_size = 20)
clusterer.fit(data)
print (np.unique(clusterer.labels_, return_counts = True))
回答1:
You can use Pandas to read the file and then print out the cluster labels along with the dataset you have as the input. Try something like:
import pandas as pd
df = pd.read_csv("ascii.txt")
clusterer = hdbscan.HDBSCAN().fit_predict(df.ColumnName)
df_pd = pd.DataFrame({'Datapoints:' df.ColumnName, 'Cluster Labels:' clusterer)
回答2:
import hdbscan
import numpy as np
datafile = "ascii.txt"
data = np.loadtxt(datafile, dtype = np.uint8)
Modified_data=pd.DataFrame(data)
clusterer = hdbscan.HDBSCAN(min_cluster_size = 20)
clusterer.fit(Modified_data)
Modified_data['Clusters']=clusterer.labels_
Now Modified_data returns a pandas dataframe where you have a column named "Clusters" and cluster corresponding to each instance will be specified in the Clusters column. You can manipulate this dataframe as per your requirement
来源:https://stackoverflow.com/questions/55609827/how-to-print-output-results-in-hdbscan