k-means

K-means clustering identifying knowledge in R

Submitted by 两盒软妹~` on 2019-12-23 04:37:15
Question: I am new to R and the clustering world. I am using a shopping dataset to extract features from it in order to identify something meaningful. So far I have managed to learn how to merge files, remove NAs, compute the sum of squared errors, work out the mean values, summarise by group, run the k-means clustering and plot the X, Y results. However, I am very confused about how to view these results or identify what would count as a useful cluster. Am I repeating something or missing something? I get
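
Not part of the original question, but as a point of comparison: below is a minimal sketch of the same workflow (drop NAs, scale, run k-means for several k, compare the within-cluster sum of squares) in Python with scikit-learn. The column names are made up, since the shopping data is not shown.

    # Hypothetical sketch of the workflow described above, in Python/scikit-learn.
    import numpy as np
    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

    # "spend" and "visits" are invented column names standing in for the shopping features.
    df = pd.DataFrame({"spend": np.random.rand(200) * 100,
                       "visits": np.random.randint(1, 20, 200)})
    df = df.dropna()                           # drop NAs, as in the question
    X = StandardScaler().fit_transform(df)     # put features on a comparable scale

    for k in range(2, 8):
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
        print(k, km.inertia_)                  # within-cluster sum of squared errors

A sharp drop in the printed sum of squares followed by a flat tail is one common way to judge which k gives clusters worth inspecting.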

K-means cluster plot [closed]

Submitted by 不打扰是莪最后的温柔 on 2019-12-23 02:04:39
Question: Closed. This question needs details or clarity and is not currently accepting answers. Closed 6 years ago. I have a data matrix of 510x6 and want to perform K-means cluster analysis on it. I am having trouble plotting all the different clusters in two dimensions. Is it not possible to plot 6 different clusters in 2 dimensions? Answer 1: Let's start by looking at some data which is 150x4 and try and split
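
A common way to show clusters from a six-column matrix on a two-dimensional plot is to project the points (for example with PCA) before plotting and colour them by cluster label. A sketch in Python with scikit-learn, using random data in place of the 510x6 matrix:

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans
    import matplotlib.pyplot as plt

    X = np.random.rand(510, 6)                       # placeholder for the 510x6 matrix
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

    coords = PCA(n_components=2).fit_transform(X)    # project 6-D points onto 2 axes
    plt.scatter(coords[:, 0], coords[:, 1], c=labels)
    plt.xlabel("PC1")
    plt.ylabel("PC2")
    plt.show()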

Using K-means pixel clustering in OpenCV with Java

Submitted by 吃可爱长大的小学妹 on 2019-12-22 19:06:10
Question: I am currently trying to develop an Android app. I have tried to convert an image of a leaf from RGB to HSV to produce an image in saturation-value space (without hue). Next, I tried to use K-means to produce an image that displays blue for the background and green for the leaf (the foreground object). However, I do not know how to display the image after using the K-means function in OpenCV.

    Imgproc.cvtColor(rgba, mHSV, Imgproc.COLOR_RGBA2RGB, 3);
    Imgproc.cvtColor(rgba, mHSV, Imgproc
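
The question uses the Java bindings; the sketch below illustrates the same idea in Python OpenCV with an assumed file name (leaf.jpg): reshape the saturation-value pixels into samples, run cv2.kmeans, then map each label back to a solid colour so the segmentation can be displayed. Which cluster ends up as "leaf" versus "background" is not guaranteed.

    import cv2
    import numpy as np

    img = cv2.imread("leaf.jpg")                              # hypothetical input path
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    samples = hsv.reshape(-1, 3)[:, 1:].astype(np.float32)    # saturation-value only

    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    _, labels, centers = cv2.kmeans(samples, 2, None, criteria, 10,
                                    cv2.KMEANS_RANDOM_CENTERS)

    # Paint each pixel with a solid colour per cluster so the segmentation is visible.
    # The cluster-to-colour mapping is arbitrary here (BGR: blue then green).
    palette = np.array([[255, 0, 0], [0, 255, 0]], np.uint8)
    segmented = palette[labels.flatten()].reshape(img.shape)
    cv2.imwrite("segmented.png", segmented)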

How to set initial centers for K-means in OpenCV C++

Submitted by 你。 on 2019-12-22 13:47:29
Question: I am trying to do a segmentation of an image using OpenCV and k-means; the code I have implemented so far is the following:

    #include "opencv2/objdetect/objdetect.hpp"
    #include "opencv2/highgui/highgui.hpp"
    #include "opencv2/imgproc/imgproc.hpp"
    #include <iostream>
    #include <stdio.h>

    using namespace std;
    using namespace cv;

    int main(int, char** argv)
    {
        Mat src, Imagen2, Imagris, labels, centers, imgfondo;
        src = imread("C:/Users/Sebastian/Documents/Visual Studio 2015/Projects/ClusteringImage
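
OpenCV's kmeans does not take initial centres directly, but it does accept initial labels via the KMEANS_USE_INITIAL_LABELS flag; one workaround is to label every sample by its nearest desired centre and pass those labels in. A sketch of that idea (in Python OpenCV rather than the question's C++, with random stand-in data):

    import cv2
    import numpy as np

    samples = np.random.rand(500, 3).astype(np.float32)       # stand-in for the image pixels
    init_centers = np.array([[0.1, 0.1, 0.1],
                             [0.9, 0.9, 0.9]], np.float32)    # centres you want to start from

    # Label each sample by its nearest desired centre, then ask kmeans to start
    # from those labels instead of a random initialisation.
    dists = np.linalg.norm(samples[:, None, :] - init_centers[None, :, :], axis=2)
    init_labels = dists.argmin(axis=1).astype(np.int32).reshape(-1, 1)

    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
    _, labels, centers = cv2.kmeans(samples, 2, init_labels, criteria, 1,
                                    cv2.KMEANS_USE_INITIAL_LABELS)
    print(centers)

The same flag exists in the C++ API, where the labels argument is an InputOutputArray.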

How to segment new data with existing K-means model?

Submitted by 大憨熊 on 2019-12-22 12:34:07
Question: I have built a segmentation model using k-means clustering. Could anybody describe the process for assigning new data to these segments? Currently I am applying the same transformations/standardisations/outlier treatment as I did to build the model and then calculating the Euclidean distance; the minimum distance gives the segment the record falls into. But I am seeing the majority fall into one particular segment and I am wondering if I have missed something along the way? Thanks Answer 1: Classifying a
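
A minimal sketch of that assignment step in Python/scikit-learn, assuming the same fitted scaler is reused for new records (random data stands in for the real segmentation features); predict() performs exactly the nearest-centroid assignment described above:

    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

    train = np.random.rand(1000, 4)          # stand-in for the data used to build the model
    scaler = StandardScaler().fit(train)     # fit transformations on the training data only
    km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(scaler.transform(train))

    new = np.random.rand(50, 4)              # unseen records
    # Apply the *same* fitted scaler, then assign each record to its nearest centroid.
    segments = km.predict(scaler.transform(new))
    print(np.bincount(segments))             # check whether one segment dominates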

Python K means clustering

Submitted by ☆樱花仙子☆ on 2019-12-22 11:23:06
Question: I am trying to implement the code on this website to estimate what value of K I should use for my K-means clustering: https://datasciencelab.wordpress.com/2014/01/21/selection-of-k-in-k-means-clustering-reloaded/ However, I am not having any success; in particular, I am trying to get the f(k) vs. number of clusters k graph, which I can use to pick the ideal value of k. My data format is as follows: each of the coordinates has 5 dimensions/variables, i.e. they are data points that
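
For reference, a hedged sketch of the f(k) statistic from Pham, Dimov and Nguyen (2005), which the linked post is based on; the data here is random and 5-dimensional only to mirror the question, and the smallest f(k) is taken as the candidate k:

    import numpy as np
    from sklearn.cluster import KMeans

    def f_k(X, k_max=10):
        """Sketch of the f(k) statistic; values well below 1 suggest a good k."""
        nd = X.shape[1]                      # number of dimensions (5 in the question)
        fs, s_prev, alpha = [], None, None
        for k in range(1, k_max + 1):
            km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
            s_k = km.inertia_                # total within-cluster sum of squares
            if k == 1:
                f = 1.0
            else:
                alpha = 1 - 3.0 / (4 * nd) if k == 2 else alpha + (1 - alpha) / 6.0
                f = 1.0 if s_prev == 0 else s_k / (alpha * s_prev)
            fs.append(f)
            s_prev = s_k
        return fs

    X = np.random.rand(300, 5)               # placeholder for the 5-dimensional data
    print(f_k(X))                             # pick the k with the smallest f(k)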

Unstable result from scipy.cluster.kmeans

Submitted by 99封情书 on 2019-12-22 10:07:45
Question: The following code gives different results on every run when clustering the data into 3 parts using the k-means method:

    from numpy import array
    from scipy.cluster.vq import kmeans,vq
    data = array([1,1,1,1,1,1,3,3,3,3,3,3,7,7,7,7,7,7])
    centroids = kmeans(data,3,100) #with 100 iterations
    print (centroids)

Three possible results obtained were:

    (array([1, 3, 7]), 0.0)
    (array([3, 7, 1]), 0.0)
    (array([7, 3, 1]), 0.0)

Actually, the order of the calculated k-means centroids is different between runs. But, does not it
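
A sketch of two common fixes, assuming SciPy's kmeans draws its initial guess from NumPy's global random generator (true for older SciPy versions): fix the seed for repeatability, and sort the returned codebook so the centroid order no longer depends on the run.

    import numpy as np
    from scipy.cluster.vq import kmeans, vq

    data = np.array([1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3,
                     7, 7, 7, 7, 7, 7], dtype=float).reshape(-1, 1)

    np.random.seed(0)                              # fix the random initialisation
    codebook, distortion = kmeans(data, 3, 100)
    codebook = codebook[np.argsort(codebook[:, 0])]  # centroid order is arbitrary, so sort
    labels, _ = vq(data, codebook)                 # labels are now stable across runs
    print(codebook.ravel(), distortion, labels)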

Cluster unseen points using Spectral Clustering

Submitted by …衆ロ難τιáo~ on 2019-12-22 09:55:29
Question: I am using the spectral clustering method to cluster my data. The implementation seems to work properly. However, I have one problem: I have a set of unseen points (not present in the training set) and would like to cluster them based on the centroids derived by k-means (step 5 in the paper). However, the k-means is computed on the k eigenvectors, and therefore the centroids are low-dimensional. Does anyone know of a method that can be used to map an unseen point to a low-dimension and compute
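
One approach sometimes used here is a Nyström-style out-of-sample extension: embed an unseen point through its affinities to the training points and the stored eigenvectors/eigenvalues, then assign it to the nearest k-means centroid in that embedding. The sketch below is deliberately simplified (it eigendecomposes the raw RBF affinity matrix rather than a normalised Laplacian) and uses random placeholder data:

    import numpy as np
    from sklearn.metrics.pairwise import rbf_kernel
    from sklearn.cluster import KMeans

    X = np.random.rand(200, 6)                        # training points
    gamma = 1.0

    # Spectral embedding of the training set: top-k eigenpairs of the affinity matrix.
    W = rbf_kernel(X, X, gamma=gamma)
    vals, vecs = np.linalg.eigh(W)
    idx = np.argsort(vals)[::-1][:3]                  # keep the 3 leading eigenpairs
    lam, U = vals[idx], vecs[:, idx]
    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(U)

    # Nystrom-style extension: embed an unseen point via its affinities to the
    # training set, then assign it to the nearest centroid found by k-means.
    x_new = np.random.rand(1, 6)
    u_new = rbf_kernel(x_new, X, gamma=gamma) @ U / lam
    print(km.predict(u_new))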

Spark MLlib / K-Means intuition

Submitted by 此生再无相见时 on 2019-12-22 08:42:31
Question: I'm very new to machine learning algorithms and Spark. I'm following the Twitter Streaming Language Classifier found here: http://databricks.gitbooks.io/databricks-spark-reference-applications/content/twitter_classifier/README.html Specifically this code: http://databricks.gitbooks.io/databricks-spark-reference-applications/content/twitter_classifier/scala/src/main/scala/com/databricks/apps/twitter_classifier/ExamineAndTrain.scala Except I'm trying to run it in batch mode on some tweets it pulls
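
For the batch-mode part, a minimal PySpark sketch of the same pipeline shape (hash tweets into term-frequency vectors, train MLlib k-means, predict a cluster for new text); the file name, feature size, and k are made-up placeholders, not the reference app's actual settings:

    from pyspark import SparkContext
    from pyspark.mllib.feature import HashingTF
    from pyspark.mllib.clustering import KMeans

    sc = SparkContext(appName="tweet-kmeans-sketch")

    # "tweets.txt" is a hypothetical local file with one tweet per line,
    # standing in for the batch of tweets pulled down by the application.
    tweets = sc.textFile("tweets.txt")
    htf = HashingTF(numFeatures=1000)
    features = tweets.map(lambda t: htf.transform(t.split(" "))).cache()

    model = KMeans.train(features, k=10, maxIterations=20)
    print(model.clusterCenters[:1])                           # inspect one learned centre
    print(model.predict(htf.transform("hola que tal".split(" "))))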

How to pick the T1 and T2 threshold values for Canopy Clustering?

Submitted by 谁都会走 on 2019-12-22 04:27:12
Question: I am trying to implement the Canopy clustering algorithm along with K-means. I've seen advice online that says to use Canopy clustering to get initial starting points to feed into K-means. The problem is that in Canopy clustering you need to specify two threshold values for the canopy, T1 and T2, where points within the inner threshold are strongly tied to that canopy and points within the wider threshold are less tied to it. How are these thresholds, or distances from the canopy
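
A minimal canopy-clustering sketch (assuming T1 > T2, with made-up threshold values) that shows how the two thresholds interact and how the resulting canopy centres could seed k-means; choosing T1 and T2 themselves is typically done by experimentation on the distance scale of the data:

    import numpy as np

    def canopies(X, t1, t2):
        """Minimal canopy clustering sketch (t1 > t2); the canopy centres it
        returns can be used as initial centres (and a candidate k) for k-means."""
        assert t1 > t2
        remaining = list(range(len(X)))
        centers = []
        while remaining:
            i = remaining[0]                     # pick an arbitrary remaining point
            centers.append(X[i])
            d = np.linalg.norm(X[remaining] - X[i], axis=1)
            # Points within t2 are strongly tied to this canopy and removed from
            # further consideration; points within t1 join it but stay available.
            remaining = [p for p, dist in zip(remaining, d) if dist > t2]
        return np.array(centers)

    X = np.random.rand(300, 2)
    print(len(canopies(X, t1=0.4, t2=0.2)))      # number of canopies -> candidate k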