k-means

Optimizing K-means algorithm

£可爱£侵袭症+ submitted on 2019-12-10 10:56:41
Question: I am trying to follow a paper called "An Optimized Version of K-Means Algorithm". I understand how the K-means algorithm works: grouping the tuples/points into clusters and updating the centroids. I am trying to implement the method described in the paper. Their proposed algorithm is this: My doubt is about the second step. I don't understand what is being done there. The paper says that we group our data into wider intervals based on the value of e, so that later we …
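As background for what the paper is optimizing, the standard Lloyd iteration (assign, then update) can be sketched as follows. This is a minimal NumPy baseline, not the paper's interval-based optimization; all names here are my own.

```python
import numpy as np

def lloyd_kmeans(points, k, n_iters=100, seed=0):
    """Plain Lloyd's algorithm: assign each point to its nearest
    centroid, then move each centroid to the mean of its points."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: nearest centroid by Euclidean distance.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: recompute each centroid (keep it if its cluster is empty).
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```

The paper's second step prunes the assignment step, which is the expensive part of this loop.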

K means clustering in MATLAB - output image

て烟熏妆下的殇ゞ submitted on 2019-12-10 10:47:25
Question: To perform K-means clustering with k = 3 (segments), I: 1) converted the RGB image into grayscale; 2) cast the original image into an n x 1 column matrix; 3) idx = kmeans(column_matrix); 4) output = idx, cast back into the same dimensions as the original image. My questions are: A) When I do imshow(output), I get a plain white image. However, when I do imshow(output, [0 5]), it shows the output image. I understand that 0 and 5 specify the display range, but why do I have to do this? B) Now the …
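The white image happens because MATLAB's kmeans returns labels 1..k, and imshow treats a double image as having display range [0, 1], so every label clips to white. Rescaling the label image to [0, 1] (what imshow(output, []) or mat2gray does) makes the segments visible. A small NumPy illustration of that rescaling, with a made-up label image:

```python
import numpy as np

# Hypothetical 2x3 label image standing in for the reshaped idx; real
# k-means labels start at 1, so every pixel is >= 1 and clips to white
# under a [0, 1] display range.
labels = np.array([[1.0, 2.0, 3.0],
                   [3.0, 2.0, 1.0]])

# Min-max rescale to [0, 1], the same normalization mat2gray performs.
rescaled = (labels - labels.min()) / (labels.max() - labels.min())
```

After rescaling, the smallest label maps to black and the largest to white, so each segment gets a distinct gray level.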

Getting an IOException when running a sample code in “Mahout in Action” on mahout-0.6

≯℡__Kan透↙ submitted on 2019-12-10 04:26:34
Question: I'm learning Mahout and reading "Mahout in Action". When I tried to run the sample code in chapter 7, SimpleKMeansClustering.java, an exception popped up: Exception in thread "main" java.io.IOException: wrong value class: 0.0: null is not class org.apache.mahout.clustering.WeightedPropertyVectorWritable at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1874) at SimpleKMeansClustering.main(SimpleKMeansClustering.java:95) This code ran successfully on mahout-0.5, but on mahout-0.6 I …

K means finding elbow when the elbow plot is a smooth curve

末鹿安然 submitted on 2019-12-10 04:13:23
Question: I am trying to plot the elbow of k-means using the code below: load CSDmat %mydata for k = 2:20 opts = statset('MaxIter', 500, 'Display', 'off'); [IDX1,C1,sumd1,D1] = kmeans(CSDmat,k,'Replicates',5,'options',opts,'distance','correlation'); % kmeans matlab [yy,ii] = min(D1'); %% assign points to nearest center distort = 0; distort_across = 0; clear clusts; for nn=1:k I = find(ii==nn); %% indices of points in cluster nn J = find(ii~=nn); %% indices of points not in cluster nn clusts{nn} = I; %% …
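When the distortion curve is smooth and has no sharp bend, a common heuristic is to pick the point farthest from the straight line joining the first and last points of the curve. A minimal NumPy sketch of that heuristic (my own helper, not from the question):

```python
import numpy as np

def elbow_point(ks, distortions):
    """Pick the elbow of a smooth distortion curve: the k whose point
    lies farthest from the chord connecting the curve's endpoints."""
    ks = np.asarray(ks, dtype=float)
    d = np.asarray(distortions, dtype=float)
    # Unit direction vector of the chord from first to last point.
    line = np.array([ks[-1] - ks[0], d[-1] - d[0]])
    line = line / np.linalg.norm(line)
    # Vectors from the first point to every point on the curve.
    vecs = np.stack([ks - ks[0], d - d[0]], axis=1)
    # Perpendicular distance of each point from the chord.
    proj = vecs @ line
    dists = np.linalg.norm(vecs - np.outer(proj, line), axis=1)
    return int(ks[np.argmax(dists)])
```

Feeding it the per-k total distortions computed by a loop like the one above gives a single suggested k even when no visually obvious elbow exists.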

OpenCV K-Means (kmeans2)

夙愿已清 submitted on 2019-12-10 03:29:01
Question: I'm using OpenCV's k-means implementation to cluster a large set of 8-dimensional vectors. They cluster fine, but I can't find any way to see the prototypes created by the clustering process. Is this even possible? OpenCV only seems to give access to the cluster indexes (or labels). If not, I guess it'll be time to write my own implementation! Answer 1: I can't say I've used OpenCV's implementation of k-means, but if you have access to the labels assigned to each instance, you can simply compute the centroids …
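The answer's suggestion works because the centroid of a cluster is just the mean of the samples assigned to it. If a binding only exposes labels, the prototypes can be recovered in a few lines of NumPy (this helper name is my own):

```python
import numpy as np

def centroids_from_labels(data, labels, k):
    """Recover the cluster prototypes as the per-label mean of the
    samples -- exactly the quantity k-means maintains internally."""
    return np.array([data[labels == j].mean(axis=0) for j in range(k)])
```

Applied to the 8-dimensional vectors and the label array returned by the clustering call, this yields a k x 8 matrix of prototypes.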

k means clustering algorithm

白昼怎懂夜的黑 submitted on 2019-12-09 13:51:13
Question: I want to perform a k-means clustering analysis on a set of 10 data points, each of which has an array of 4 numeric values associated with it. I'm using the Pearson correlation coefficient as the distance metric. I did the first two steps of the k-means clustering algorithm, which were: 1) Select a set of initial centres for the k clusters. [I selected two initial centres at random.] 2) Assign each object to the cluster with the closest centre. [I used the Pearson correlation coefficient as the …
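Step 2 with a correlation metric means computing d = 1 - r between each point and each centre, where r is the Pearson correlation coefficient, and taking the argmin. A minimal NumPy sketch of that assignment step (helper names are my own):

```python
import numpy as np

def assign_by_correlation(points, centres):
    """Assign each point to the centre with the smallest Pearson
    correlation distance d = 1 - r."""
    def corr_dist(a, b):
        r = np.corrcoef(a, b)[0, 1]
        return 1.0 - r
    return np.array([
        np.argmin([corr_dist(p, c) for c in centres]) for p in points
    ])
```

A point perfectly correlated with a centre has d = 0; a perfectly anti-correlated one has d = 2, so rising and falling profiles land in different clusters under this metric.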

Use Absolute Pearson Correlation as Distance in K-Means Algorithm (MATLAB)

江枫思渺然 submitted on 2019-12-09 13:44:30
Question: I need to do some clustering using a correlation distance, but instead of using the built-in 'distance', 'correlation' option, which is defined as d = 1 - r, I need the absolute Pearson distance. In my application, anti-correlated data should get the same cluster ID. Right now, when using the kmeans() function, I'm getting centroids that are highly anti-correlated, which I would like to avoid by combining them. I'm not that fluent in MATLAB yet and have some trouble reading the kmeans function. Would it be …
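The distance the question asks for is d = 1 - |r|, which maps both perfectly correlated and perfectly anti-correlated profiles to 0, so they fall into the same cluster. A tiny NumPy sketch of that metric (the function name is my own):

```python
import numpy as np

def abs_corr_dist(a, b):
    """Absolute-Pearson distance d = 1 - |r|: correlated and
    anti-correlated profiles both come out close to 0."""
    r = np.corrcoef(a, b)[0, 1]
    return 1.0 - abs(r)
```

Since MATLAB's built-in kmeans only offers d = 1 - r, using |r| generally means plugging a custom distance into the assignment step by hand, as several answers to this kind of question suggest.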

How to set k-Means clustering labels from highest to lowest with Python?

匆匆过客 submitted on 2019-12-09 09:49:52
Question: I have a dataset of 38 apartments and their electricity consumption in the morning, afternoon and evening. I am trying to cluster this dataset using the k-means implementation from scikit-learn, and I am getting some interesting results. First clustering results: This is all very well, and with 4 clusters I obviously get 4 labels associated with each apartment: 0, 1, 2 and 3. Using the random_state parameter of the KMeans method, I can fix the seed with which the centroids are randomly initialized, …
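Since the label numbers k-means assigns depend only on initialization order, a common fix is to remap them after fitting so that label 0 is always the highest-consumption cluster. A NumPy sketch of that remapping, given the labels and centroids a fitted KMeans exposes (the helper name is my own):

```python
import numpy as np

def relabel_by_centroid(labels, centroids):
    """Remap cluster labels so that 0 is the cluster with the highest
    mean centroid value and k-1 the lowest, independent of the random
    initialization order."""
    order = np.argsort(-centroids.mean(axis=1))  # cluster indices, highest first
    mapping = np.empty_like(order)
    mapping[order] = np.arange(len(order))       # old label -> new label
    return mapping[labels]
```

With scikit-learn this would be applied as relabel_by_centroid(km.labels_, km.cluster_centers_), making the labels comparable across runs and random_state values.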

Plotting the boundaries of cluster zone in Python with scikit package

江枫思渺然 submitted on 2019-12-09 05:51:03
Question: Here is my simple example of clustering data with 3 attributes (x, y, value); each sample represents its location (x, y) and its associated variable. My code is posted here: x = np.arange(100,200,1) y = np.arange(100,200,1) value = np.random.random(100*100) xx,yy = np.meshgrid(x,y) xx = xx.reshape(100*100) yy = yy.reshape(100*100) j = np.dstack((xx,yy,value))[0,:,:] fig = plt.figure(figsize=(12,4)) ax1 = plt.subplot(121) xi,yi = np.meshgrid(x,y) va = value.reshape(100,100) pc = plt …
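The usual way to get cluster-zone boundaries on a plot like this is to label every point of a regular grid with its nearest centroid and draw the resulting label field. A NumPy sketch over the same 100x100 grid, with made-up centroids standing in for fitted ones:

```python
import numpy as np

# Hypothetical 2-D centroids, standing in for centroids fitted on the
# (x, y) coordinates of the samples.
centroids = np.array([[125.0, 125.0], [175.0, 175.0]])

x = np.arange(100, 200, 1)
y = np.arange(100, 200, 1)
xx, yy = np.meshgrid(x, y)
grid = np.stack([xx.ravel(), yy.ravel()], axis=1)

# Label every grid point with its nearest centroid; reshaped back to the
# grid, this label field is the cluster-zone map, and contourf/pcolormesh
# over it draws the zone boundaries.
dists = np.linalg.norm(grid[:, None, :] - centroids[None, :, :], axis=2)
zones = dists.argmin(axis=1).reshape(xx.shape)
```

With matplotlib, plt.contourf(xx, yy, zones) (or pcolormesh) then renders the zones next to the value plot the question's code sets up.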

What's the difference between kmeans and kmeans2 in scipy?

 ̄綄美尐妖づ submitted on 2019-12-08 17:12:04
Question: I am new to machine learning and am wondering about the difference between kmeans and kmeans2 in scipy. According to the docs, both of them use the 'k-means' algorithm, but how do I choose between them? Answer 1: Based on the documentation, it seems kmeans2 is the standard k-means algorithm, which runs until it converges to a local optimum, and it allows you to change the seed initialization. The kmeans function will terminate early based on a lack of change, so it may not even reach a local optimum. Further, the goal …
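Assuming scipy.cluster.vq is available, the two APIs can be compared side by side: kmeans returns a codebook and the mean distortion (and stops once the distortion change falls below a threshold), while kmeans2 additionally returns a label for every observation and lets you choose the seeding strategy via minit. A minimal sketch:

```python
import numpy as np
from scipy.cluster.vq import kmeans, kmeans2, whiten

# Two well-separated blobs of 2-D points.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0.0, 0.5, (50, 2)),
                  rng.normal(5.0, 0.5, (50, 2))])

# kmeans expects whitened (unit-variance) features and returns the best
# codebook found plus its mean distortion.
w = whiten(data)
codebook, distortion = kmeans(w, 2)

# kmeans2 is the classic algorithm: choose the seeding with minit
# ('points' samples initial centroids from the data) and get per-sample
# labels back alongside the centroids.
centroids, labels = kmeans2(w, 2, minit='points')
```

In short: use kmeans when you only need the codebook (e.g. for vector quantization), and kmeans2 when you also need the per-observation labels or control over initialization.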