k-means

OpenCV's clustering function cvKMeans2(): why doesn't it work when I use the centers parameter?

安稳与你 posted on 2019-12-25 12:40:13
Question: I use this code. It should print the cluster labels and then the centroids, but the centers matrix that should hold the centroids seems to be empty, full of zeros. What is wrong?
    #include <iostream>
    #include <stdio.h>
    #include "cxcore.h"
    #include "highgui.h"
    using namespace cv;
    int main( int argc, char** argv ) {
        int i,j;
        CvMat* points = cvCreateMat( 5, 2, CV_32FC1 );
        CvMat* centers2 = cvCreateMat( 5, 2, CV_32FC1 );
        CvMat* clusters = cvCreateMat( 5, 1, CV_32SC1 );
        cvSetReal2D( points, 0, 0,1);

Bag of Visual Words: what is a reasonable word (vector) dimension?

半世苍凉 posted on 2019-12-25 07:59:59
Question: In the Bag of Features/Visual Words paradigm we have a vector V in k dimensions, where V[i]=j if the i-th centroid (obtained by the k-means algorithm) is the closest one among all k centroids for j visual descriptors (e.g. SIFT descriptors). AFAIK, the resulting visual vector is very sparse (most of its entries are 0) since k is really big, but my question is: what is a reasonable value for k (and so the vector size)? Hundreds of dimensions? Thousands? Especially
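The definition of V in the question can be made concrete with a minimal sketch. It assumes Euclidean distance and uses toy 2-D points in place of real 128-D SIFT descriptors; `nearest_centroid` and `bovw_histogram` are hypothetical helper names, not part of any library.

```python
import math

def nearest_centroid(descriptor, centroids):
    """Return the index of the centroid closest to `descriptor` (Euclidean)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(descriptor, centroids[i]))

def bovw_histogram(descriptors, centroids):
    """Build V, where V[i] counts the descriptors whose nearest centroid is i."""
    hist = [0] * len(centroids)
    for d in descriptors:
        hist[nearest_centroid(d, centroids)] += 1
    return hist

# Toy data (assumption): 2-D points stand in for SIFT descriptors, k = 4.
centroids = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0), (0.0, 20.0)]
descriptors = [(1.0, 1.0), (0.5, -0.5), (9.0, 11.0)]

V = bovw_histogram(descriptors, centroids)
# Fraction of zero entries: a rough measure of how sparse V is.
sparsity = V.count(0) / len(V)
```

With far more centroids than descriptors per image, most entries of V stay at zero, which is the sparsity the question refers to.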

Understanding output from k-means clustering in Python

余生长醉 posted on 2019-12-25 07:53:46
Question: I have two distance matrices, each 232×232, where the column and row labels are identical. This is an abridged version of the two, where A, B, C and D are the names of the points between which the distances are measured:
      A B C D ...        A B C D ...
    A 0 1 5 3          A 0 5 3 9
    B 4 0 4 1          B 2 0 7 8
    C 2 6 0 3          C 2 6 0 1
    D 2 7 1 0          D 5 2 5 0
    ...                ...
The two matrices therefore represent the distances between pairs of points in two different networks. I want to identify clusters of pairs that are close

OpenCV kmeans: N>=K exception, error (-215)

我的梦境 posted on 2019-12-25 04:44:07
Question: When I try to use kmeans like this:
    int K = 4;
    Mat labels;
    Mat centers;
    std::vector<float> values;
    // (put a bunch of values into "values" here...)
    kmeans(values, K, labels, TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 10, 1.0), 10, KMEANS_PP_CENTERS, centers);
I get the error: "error: (-215) N >= K in function kmeans". values.size() = 360000, so N is clearly greater than K. What gives? Thanks.
Answer 1: OpenCV weirdly interprets one-dimensional data as a 1-element array. Something like
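The reading described in the answer can be illustrated without OpenCV. The sketch below is not OpenCV code; it only shows, under the assumption that rows are samples, why a flat list read as a single row yields N = 1 while the same values arranged as a column yield N = len(values). All function names are hypothetical.

```python
def as_single_row(values):
    """Treat the flat list as ONE sample with len(values) features."""
    return [list(values)]

def as_column(values):
    """Treat the flat list as len(values) samples with ONE feature each."""
    return [[v] for v in values]

def satisfies_n_ge_k(samples, k):
    """Mimic the kind of precondition behind 'N >= K' (rows are samples)."""
    return len(samples) >= k

# Toy stand-in for the 360000 values in the question.
values = [0.5 * i for i in range(360)]
K = 4

row_ok = satisfies_n_ge_k(as_single_row(values), K)   # N = 1
column_ok = satisfies_n_ge_k(as_column(values), K)    # N = 360
```

Under this reading, the fix is to hand the clustering routine one row per sample rather than one long row.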

Displaying k-means results with specific colors for specific clusters

落爺英雄遲暮 posted on 2019-12-25 02:55:23
Question: I applied k-means clustering to a preprocessed image using the following MATLAB code:
    %B - input image
    C=rgb2gray(B);
    [idx centroids]=kmeans(double(C(:)),4);
    imseg = zeros(size(C,1),size(C,2));
    for i=1:max(idx)
        imseg(idx==i)=i;
    end
    i=mat2gray(imseg); % i - output image
Every time I display the output, the colors assigned to the output image change. How can I give a specific color to cluster1, cluster2, cluster3 and cluster4?
Answer 1: You can use a colormap. Let R1 , B1 and G1 be the RGB values you
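A colormap alone fixes which color belongs to which label, but the labels themselves can be numbered differently from run to run. One possible way to handle both, sketched here in Python rather than MATLAB, is to renumber clusters by centroid intensity before applying a fixed lookup table. The renumbering step, the function names, and the colors are assumptions for illustration; labels are 0-based here, unlike MATLAB's 1-based idx.

```python
def stable_labels(labels, centroids):
    """Renumber clusters so label 0 is the darkest centroid, 1 the next, etc."""
    order = sorted(range(len(centroids)), key=lambda c: centroids[c])
    remap = {old: new for new, old in enumerate(order)}
    return [remap[label] for label in labels]

def colorize(labels, colormap):
    """Look up one fixed RGB triple per (renumbered) label."""
    return [colormap[label] for label in labels]

# Fixed colors, one per cluster (arbitrary choice for illustration).
COLORMAP = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0)]

# Two hypothetical runs that found the same clusters but numbered them differently.
run_a_labels, run_a_centroids = [0, 0, 1, 2, 3], [12.0, 80.0, 150.0, 230.0]
run_b_labels, run_b_centroids = [2, 2, 0, 3, 1], [80.0, 230.0, 12.0, 150.0]

colors_a = colorize(stable_labels(run_a_labels, run_a_centroids), COLORMAP)
colors_b = colorize(stable_labels(run_b_labels, run_b_centroids), COLORMAP)
```

Both runs now map the darkest cluster to the first color, the next darkest to the second, and so on.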

Sklearn MiniBatchKMeans gives confusing results for labels_ attribute

南笙酒味 posted on 2019-12-24 21:59:00
Question: I am using sklearn.cluster.MiniBatchKMeans to train an ML model. I need to get the cluster IDs, and I tried the code below (here model is the MiniBatchKMeans clustering model):
    print("Cluster IDs: ", np.unique(model.labels_))
    print("Number of Clusters: ", model.n_clusters)
I got the following result:
    Cluster IDs: [0]
    Number of Clusters: 2
According to this result, there is only 1 cluster ID for the given dataset even though there are 2 clusters. I found that all

Convert Array[DenseVector] to CSV with Scala

╄→尐↘猪︶ㄣ posted on 2019-12-24 21:06:21
Question: I am using the Spark KMeans function with Scala and I need to save the cluster centers it produces to a CSV file. The value has type Array[DenseVector].
    val clusters = KMeans.train(parsedData, numClusters, numIterations)
    val centers = clusters.clusterCenters
I tried converting centers to an RDD and then from the RDD to a DF, but I ran into a lot of problems (e.g., import spark.implicits._ / SQLContext.implicits._ is not working and I cannot use .toDF ). I was wondering if there is another way to make a
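Because the centers are a small local array rather than distributed data, one option is to write them out row by row without going through an RDD or DataFrame. The sketch below shows that idea in Python with the standard csv module; it is an illustration of the approach, not Spark code, and `centers_to_csv` and the sample values are hypothetical. In Scala the analogous step would be joining each vector's values with commas, which is an assumption on my part rather than something stated in the question.

```python
import csv
import io

def centers_to_csv(centers, out):
    """Write one cluster center per CSV row, one coordinate per column."""
    writer = csv.writer(out, lineterminator="\n")
    for center in centers:
        writer.writerow(center)

# Hypothetical centers: two clusters in two dimensions.
centers = [[1.0, 2.5], [3.0, 4.25]]

# An in-memory buffer keeps the example self-contained;
# open("centers.csv", "w", newline="") would write to disk instead.
buf = io.StringIO()
centers_to_csv(centers, buf)
csv_text = buf.getvalue()
```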

Is sklearn.cluster.KMeans sensitive to data point order?

若如初见. posted on 2019-12-24 19:30:03
Question: As noted in the answer to this post about feature scaling, some (all?) implementations of KMeans are sensitive to the order of data points. Based on the sklearn.cluster.KMeans documentation, n_init only changes the initial position of the centroid. This would mean that one must loop over a few shuffles of the data points to test if this is a problem. My questions are as follows: Is the scikit-learn implementation sensitive to the ordering, as the post suggests? Does n_init take
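The shuffle test mentioned in the question can be sketched with a toy 1-D Lloyd's iteration. This is not scikit-learn's implementation and says nothing about how scikit-learn behaves; it only shows the shape of the experiment. `lloyd_1d` and `order_sensitivity` are hypothetical names, and the initial centers are passed in as fixed values so that only the data order varies.

```python
import random

def lloyd_1d(points, centers, iterations=20):
    """Plain Lloyd's algorithm on 1-D data, starting from the given centers."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return sorted(centers)

def order_sensitivity(points, init, n_shuffles=5, seed=0):
    """Run the clustering on several shuffles; return the distinct results."""
    rng = random.Random(seed)
    results = set()
    for _ in range(n_shuffles):
        shuffled = points[:]
        rng.shuffle(shuffled)
        results.add(tuple(lloyd_1d(shuffled, init)))
    return results

# Integer toy data, so sums are exact regardless of summation order.
points = [1, 2, 3, 10, 11, 12, 20, 21, 22]
distinct = order_sensitivity(points, init=[0.0, 10.0, 25.0])
```

More than one distinct result would indicate order sensitivity. In this toy setup the initial centers are fixed values, so every shuffle converges to the same centroids.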

How to save cluster assignments in output file using Weka clustering XMeans?

|▌冷眼眸甩不掉的悲伤 posted on 2019-12-24 16:04:37
Question:
Context: I want to use the Weka clustering algorithm XMeans . However, I cannot figure out how to obtain cluster assignments from the Weka GUI. At the moment I can only see a list of cluster IDs along with the percentage of entries assigned to each cluster.
Question: Is there any way to save the cluster assignment for each entry in, e.g., CSV format?
Answer 1: Do everything in the "Preprocess Panel". This is one way to do this: Load the data file. Remove any classification attribute or identifiers. Choose Preprocess /

Creating a prediction function for kmeans in R

冷暖自知 posted on 2019-12-24 08:38:52
Question: I want to create a predict function that predicts which cluster an observation belongs to.
    data(iris)
    mydata=iris
    m=mydata[1:4]
    train=head(m,100)
    xNew=head(m,10)
    rownames(train)<-1:nrow(train)
    norm_eucl=function(train) train/apply(train,1,function(x)sum(x^2)^.5)
    m_norm=norm_eucl(train)
    result=kmeans(m_norm,3,30)
    predict.kmean <- function(cluster, newdata) {
        simMat <- m_norm(rbind(cluster, newdata), sel=(1:nrow(newdata)) + nrow(cluster))[1:nrow(cluster), ]
        unname(apply(simMat, 2, which.max))
    }
    ##
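One common way to predict a cluster for a new observation is to assign it to the nearest fitted center. The sketch below shows that idea in Python, assuming Euclidean distance and the same row normalization to unit length that the question applies before clustering. It is not a translation of the question's `predict.kmean`; `normalize_row` and `predict_cluster` are hypothetical names, and rows are assumed to be non-zero.

```python
import math

def normalize_row(row):
    """Scale a row to unit Euclidean length (assumes the row is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in row))
    return [x / norm for x in row]

def predict_cluster(centers, new_rows):
    """For each new row, return the 1-based index of the nearest center."""
    predictions = []
    for row in new_rows:
        unit = normalize_row(row)
        best = min(range(len(centers)),
                   key=lambda i: math.dist(unit, centers[i]))
        predictions.append(best + 1)  # 1-based, matching R's cluster numbering
    return predictions

# Hypothetical fitted centers (already on the unit-length scale) and new rows.
centers = [[1.0, 0.0], [0.0, 1.0]]
new_rows = [[3.0, 0.1], [0.2, 5.0]]

predicted = predict_cluster(centers, new_rows)
```

The key design point is that new observations go through the same preprocessing as the training data before distances to the centers are compared.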