I am new to both Python and scikit-learn, so please bear with me.
I took this source code for the k-means clustering algorithm from k means clustering.
I then modif
dataset.filenames is the key :)
This is how I did it.
The load_files declaration is:
    def load_files(container_path, description=None, categories=None,
                   load_content=True, shuffle=True, charset=None,
                   charse_error='strict', random_state=0)
So do:
    dataset_files = load_files("path_to_directory_containing_category_folders")
Then, once I had the clustering result, I put the filenames into clusters, a dictionary that maps each cluster label to the list of files assigned to it:
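    from collections import defaultdict

    clusters = defaultdict(list)
    # km.labels_[k] is the cluster assigned to dataset_files.filenames[k]
    for k, label in enumerate(km.labels_):
        clusters[label].append(dataset_files.filenames[k])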
And then I print it :)
    # one block of filenames per cluster, separated by a row of asterisks
    for clust in clusters:
        print("\n************************\n")
        for filename in clusters[clust]:
            print(filename)
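If you want to try the grouping step on its own, here is a minimal sketch that needs nothing but the standard library. The function name group_filenames_by_cluster and the example_labels / example_filenames lists are made up for illustration; they stand in for km.labels_ and dataset_files.filenames, which are assumed to be the same length and in the same order.

```python
from collections import defaultdict


def group_filenames_by_cluster(labels, filenames):
    """Map each cluster label to the filenames assigned to it."""
    if len(labels) != len(filenames):
        raise ValueError("labels and filenames must have the same length")
    clusters = defaultdict(list)
    # zip pairs the i-th label with the i-th filename, so no manual counter is needed
    for label, filename in zip(labels, filenames):
        clusters[label].append(filename)
    return dict(clusters)


# Hypothetical stand-ins for km.labels_ and dataset_files.filenames
example_labels = [0, 1, 0, 2, 1]
example_filenames = ["sports/a.txt", "politics/b.txt", "sports/c.txt",
                     "tech/d.txt", "politics/e.txt"]

example_clusters = group_filenames_by_cluster(example_labels, example_filenames)
for label in sorted(example_clusters):
    print("\n************************\n")
    print("cluster %d" % label)
    for filename in example_clusters[label]:
        print(filename)
```

With the real objects you would call group_filenames_by_cluster(km.labels_, dataset_files.filenames) instead. The length check is only a guard against passing labels from a different dataset than the one the filenames came from.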