k-means

K-means clustering identifying knowledge in R

Submitted by 两盒软妹~` on 2019-12-23 04:37:15
Question: I am new to R and the clustering world. I am using a shopping dataset to extract features from it in order to identify something meaningful. So far I have managed to learn how to merge files, remove NAs, compute the sum of squared errors, work out the mean values, summarise by group, run the k-means clustering and plot the X, Y results. However, I am very confused about how to view these results or identify what would count as a useful cluster. Am I repeating something or missing something? I get
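
Not part of the original question, but as a point of comparison: below is a minimal sketch of the same workflow (drop NAs, scale, run k-means for several k, compare the within-cluster sum of squares) in Python with scikit-learn. The column names are made up, since the shopping data is not shown.

    # Hypothetical sketch of the workflow described above, in Python/scikit-learn.
    import numpy as np
    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

    # "spend" and "visits" are invented column names standing in for the shopping features.
    df = pd.DataFrame({"spend": np.random.rand(200) * 100,
                       "visits": np.random.randint(1, 20, 200)})
    df = df.dropna()                           # drop NAs, as in the question
    X = StandardScaler().fit_transform(df)     # put features on a comparable scale

    for k in range(2, 8):
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
        print(k, km.inertia_)                  # within-cluster sum of squared errors

A sharp drop in the printed sum of squares followed by a flat tail is one common way to judge which k gives clusters worth inspecting.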

K-means cluster plot [closed]

Submitted by 不打扰是莪最后的温柔 on 2019-12-23 02:04:39
Question: Closed. This question needs details or clarity and is not currently accepting answers. Closed 6 years ago. I have a data matrix of 510x6 and want to perform K-means cluster analysis on it. I am having trouble plotting all the different clusters in two dimensions. Is it not possible to plot 6 different clusters in 2 dimensions? Answer 1: Let's start by looking at some data which is 150x4 and try and split
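
A common way to show clusters from a six-column matrix on a two-dimensional plot is to project the points (for example with PCA) before plotting and colour them by cluster label. A sketch in Python with scikit-learn, using random data in place of the 510x6 matrix:

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans
    import matplotlib.pyplot as plt

    X = np.random.rand(510, 6)                       # placeholder for the 510x6 matrix
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

    coords = PCA(n_components=2).fit_transform(X)    # project 6-D points onto 2 axes
    plt.scatter(coords[:, 0], coords[:, 1], c=labels)
    plt.xlabel("PC1")
    plt.ylabel("PC2")
    plt.show()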

Using K-means pixel clustering in OpenCV with Java

Submitted by 吃可爱长大的小学妹 on 2019-12-22 19:06:10
Question: I am currently trying to develop an Android app. I have tried to convert an image of a leaf from RGB to HSV to produce an image in saturation-value space (without hue). Next, I tried to use K-means to produce an image that displays blue for the background and green for the leaf (the foreground object). However, I do not know how to display the image after using the K-means function in OpenCV.

    Imgproc.cvtColor(rgba, mHSV, Imgproc.COLOR_RGBA2RGB, 3);
    Imgproc.cvtColor(rgba, mHSV, Imgproc
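
The question uses the Java bindings; the sketch below illustrates the same idea in Python OpenCV with an assumed file name (leaf.jpg): reshape the saturation-value pixels into samples, run cv2.kmeans, then map each label back to a solid colour so the segmentation can be displayed. Which cluster ends up as "leaf" versus "background" is not guaranteed.

    import cv2
    import numpy as np

    img = cv2.imread("leaf.jpg")                              # hypothetical input path
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    samples = hsv.reshape(-1, 3)[:, 1:].astype(np.float32)    # saturation-value only

    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    _, labels, centers = cv2.kmeans(samples, 2, None, criteria, 10,
                                    cv2.KMEANS_RANDOM_CENTERS)

    # Paint each pixel with a solid colour per cluster so the segmentation is visible.
    # The cluster-to-colour mapping is arbitrary here (BGR: blue then green).
    palette = np.array([[255, 0, 0], [0, 255, 0]], np.uint8)
    segmented = palette[labels.flatten()].reshape(img.shape)
    cv2.imwrite("segmented.png", segmented)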

How to set initial centers for K-means in OpenCV C++

Submitted by 你。 on 2019-12-22 13:47:29
Question: I am trying to do a segmentation of an image using OpenCV and k-means; the code I have implemented so far is the following:

    #include "opencv2/objdetect/objdetect.hpp"
    #include "opencv2/highgui/highgui.hpp"
    #include "opencv2/imgproc/imgproc.hpp"
    #include <iostream>
    #include <stdio.h>

    using namespace std;
    using namespace cv;

    int main(int, char** argv)
    {
        Mat src, Imagen2, Imagris, labels, centers, imgfondo;
        src = imread("C:/Users/Sebastian/Documents/Visual Studio 2015/Projects/ClusteringImage
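
OpenCV's kmeans does not take initial centres directly, but it does accept initial labels via the KMEANS_USE_INITIAL_LABELS flag; one workaround is to label every sample by its nearest desired centre and pass those labels in. A sketch of that idea (in Python OpenCV rather than the question's C++, with random stand-in data):

    import cv2
    import numpy as np

    samples = np.random.rand(500, 3).astype(np.float32)       # stand-in for the image pixels
    init_centers = np.array([[0.1, 0.1, 0.1],
                             [0.9, 0.9, 0.9]], np.float32)    # centres you want to start from

    # Label each sample by its nearest desired centre, then ask kmeans to start
    # from those labels instead of a random initialisation.
    dists = np.linalg.norm(samples[:, None, :] - init_centers[None, :, :], axis=2)
    init_labels = dists.argmin(axis=1).astype(np.int32).reshape(-1, 1)

    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
    _, labels, centers = cv2.kmeans(samples, 2, init_labels, criteria, 1,
                                    cv2.KMEANS_USE_INITIAL_LABELS)
    print(centers)

The same flag exists in the C++ API, where the labels argument is an InputOutputArray.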

How to segment new data with existing K-means model?

Submitted by 大憨熊 on 2019-12-22 12:34:07
Question: I have built a segmentation model using k-means clustering. Could anybody describe the process for assigning new data to these segments? Currently I am applying the same transformations/standardisations/outlier treatment as I did to build the model and then calculating the Euclidean distance; the minimum distance gives the segment the record falls into. But I am seeing the majority fall into one particular segment and I am wondering if I have missed something along the way? Thanks Answer 1: Classifying a
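
A minimal sketch of that assignment step in Python/scikit-learn, assuming the same fitted scaler is reused for new records (random data stands in for the real segmentation features); predict() performs exactly the nearest-centroid assignment described above:

    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

    train = np.random.rand(1000, 4)          # stand-in for the data used to build the model
    scaler = StandardScaler().fit(train)     # fit transformations on the training data only
    km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(scaler.transform(train))

    new = np.random.rand(50, 4)              # unseen records
    # Apply the *same* fitted scaler, then assign each record to its nearest centroid.
    segments = km.predict(scaler.transform(new))
    print(np.bincount(segments))             # check whether one segment dominates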

Python K means clustering

Submitted by ☆樱花仙子☆ on 2019-12-22 11:23:06
Question: I am trying to implement the code on this website to estimate what value of K I should use for my K-means clustering: https://datasciencelab.wordpress.com/2014/01/21/selection-of-k-in-k-means-clustering-reloaded/ However, I am not having any success; in particular, I am trying to get the f(k) vs. number of clusters k graph, which I can use to pick the ideal value of k. My data format is as follows: each of the coordinates has 5 dimensions/variables, i.e. they are data points that
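
For reference, a hedged sketch of the f(k) statistic from Pham, Dimov and Nguyen (2005), which the linked post is based on; the data here is random and 5-dimensional only to mirror the question, and the smallest f(k) is taken as the candidate k:

    import numpy as np
    from sklearn.cluster import KMeans

    def f_k(X, k_max=10):
        """Sketch of the f(k) statistic; values well below 1 suggest a good k."""
        nd = X.shape[1]                      # number of dimensions (5 in the question)
        fs, s_prev, alpha = [], None, None
        for k in range(1, k_max + 1):
            km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
            s_k = km.inertia_                # total within-cluster sum of squares
            if k == 1:
                f = 1.0
            else:
                alpha = 1 - 3.0 / (4 * nd) if k == 2 else alpha + (1 - alpha) / 6.0
                f = 1.0 if s_prev == 0 else s_k / (alpha * s_prev)
            fs.append(f)
            s_prev = s_k
        return fs

    X = np.random.rand(300, 5)               # placeholder for the 5-dimensional data
    print(f_k(X))                             # pick the k with the smallest f(k)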

Unstable result from scipy.cluster.kmeans

Submitted by 99封情书 on 2019-12-22 10:07:45
Question: The following code gives different results on every run when clustering the data into 3 parts using the k-means method:

    from numpy import array
    from scipy.cluster.vq import kmeans,vq
    data = array([1,1,1,1,1,1,3,3,3,3,3,3,7,7,7,7,7,7])
    centroids = kmeans(data,3,100) #with 100 iterations
    print (centroids)

Three possible results obtained were:

    (array([1, 3, 7]), 0.0)
    (array([3, 7, 1]), 0.0)
    (array([7, 3, 1]), 0.0)

Actually, the order of the calculated k-means centroids is different between runs. But, does not it
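
A sketch of two common fixes, assuming SciPy's kmeans draws its initial guess from NumPy's global random generator (true for older SciPy versions): fix the seed for repeatability, and sort the returned codebook so the centroid order no longer depends on the run.

    import numpy as np
    from scipy.cluster.vq import kmeans, vq

    data = np.array([1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3,
                     7, 7, 7, 7, 7, 7], dtype=float).reshape(-1, 1)

    np.random.seed(0)                              # fix the random initialisation
    codebook, distortion = kmeans(data, 3, 100)
    codebook = codebook[np.argsort(codebook[:, 0])]  # centroid order is arbitrary, so sort
    labels, _ = vq(data, codebook)                 # labels are now stable across runs
    print(codebook.ravel(), distortion, labels)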

Cluster unseen points using Spectral Clustering

Submitted by …衆ロ難τιáo~ on 2019-12-22 09:55:29
Question: I am using the spectral clustering method to cluster my data. The implementation seems to work properly. However, I have one problem: I have a set of unseen points (not present in the training set) and would like to cluster them based on the centroids derived by k-means (step 5 in the paper). However, the k-means is computed on the k eigenvectors, and therefore the centroids are low-dimensional. Does anyone know of a method that can be used to map an unseen point to a low-dimension and compute
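
One approach sometimes used here is a Nyström-style out-of-sample extension: embed an unseen point through its affinities to the training points and the stored eigenvectors/eigenvalues, then assign it to the nearest k-means centroid in that embedding. The sketch below is deliberately simplified (it eigendecomposes the raw RBF affinity matrix rather than a normalised Laplacian) and uses random placeholder data:

    import numpy as np
    from sklearn.metrics.pairwise import rbf_kernel
    from sklearn.cluster import KMeans

    X = np.random.rand(200, 6)                        # training points
    gamma = 1.0

    # Spectral embedding of the training set: top-k eigenpairs of the affinity matrix.
    W = rbf_kernel(X, X, gamma=gamma)
    vals, vecs = np.linalg.eigh(W)
    idx = np.argsort(vals)[::-1][:3]                  # keep the 3 leading eigenpairs
    lam, U = vals[idx], vecs[:, idx]
    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(U)

    # Nystrom-style extension: embed an unseen point via its affinities to the
    # training set, then assign it to the nearest centroid found by k-means.
    x_new = np.random.rand(1, 6)
    u_new = rbf_kernel(x_new, X, gamma=gamma) @ U / lam
    print(km.predict(u_new))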

Spark MLlib / K-Means intuition

Submitted by 此生再无相见时 on 2019-12-22 08:42:31
Question: I'm very new to machine learning algorithms and Spark. I'm following the Twitter Streaming Language Classifier found here: http://databricks.gitbooks.io/databricks-spark-reference-applications/content/twitter_classifier/README.html Specifically this code: http://databricks.gitbooks.io/databricks-spark-reference-applications/content/twitter_classifier/scala/src/main/scala/com/databricks/apps/twitter_classifier/ExamineAndTrain.scala Except I'm trying to run it in batch mode on some tweets it pulls
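
For the batch-mode part, a minimal PySpark sketch of the same pipeline shape (hash tweets into term-frequency vectors, train MLlib k-means, predict a cluster for new text); the file name, feature size, and k are made-up placeholders, not the reference app's actual settings:

    from pyspark import SparkContext
    from pyspark.mllib.feature import HashingTF
    from pyspark.mllib.clustering import KMeans

    sc = SparkContext(appName="tweet-kmeans-sketch")

    # "tweets.txt" is a hypothetical local file with one tweet per line,
    # standing in for the batch of tweets pulled down by the application.
    tweets = sc.textFile("tweets.txt")
    htf = HashingTF(numFeatures=1000)
    features = tweets.map(lambda t: htf.transform(t.split(" "))).cache()

    model = KMeans.train(features, k=10, maxIterations=20)
    print(model.clusterCenters[:1])                           # inspect one learned centre
    print(model.predict(htf.transform("hola que tal".split(" "))))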

How to pick the T1 and T2 threshold values for Canopy Clustering?

Submitted by 谁都会走 on 2019-12-22 04:27:12
Question: I am trying to implement the Canopy clustering algorithm along with K-means. I've seen advice online that says to use Canopy clustering to get initial starting points to feed into K-means. The problem is that in Canopy clustering you need to specify two threshold values for the canopy, T1 and T2, where points within the inner threshold are strongly tied to that canopy and points within the wider threshold are less tied to it. How are these thresholds, or distances from the canopy
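
A minimal canopy-clustering sketch (assuming T1 > T2, with made-up threshold values) that shows how the two thresholds interact and how the resulting canopy centres could seed k-means; choosing T1 and T2 themselves is typically done by experimentation on the distance scale of the data:

    import numpy as np

    def canopies(X, t1, t2):
        """Minimal canopy clustering sketch (t1 > t2); the canopy centres it
        returns can be used as initial centres (and a candidate k) for k-means."""
        assert t1 > t2
        remaining = list(range(len(X)))
        centers = []
        while remaining:
            i = remaining[0]                     # pick an arbitrary remaining point
            centers.append(X[i])
            d = np.linalg.norm(X[remaining] - X[i], axis=1)
            # Points within t2 are strongly tied to this canopy and removed from
            # further consideration; points within t1 join it but stay available.
            remaining = [p for p, dist in zip(remaining, d) if dist > t2]
        return np.array(centers)

    X = np.random.rand(300, 2)
    print(len(canopies(X, t1=0.4, t2=0.2)))      # number of canopies -> candidate k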