k-means | 易学教程

Cannot handle any class attribute! kmeans java

Question: I want to run a k-means algorithm using Weka in Eclipse. I have this code:
    public class demo {
        public demo() throws Exception {
            // TODO Auto-generated constructor stub
            BufferedReader breader = null;
            breader = new BufferedReader(new FileReader(
                    "D:/logiciels/weka-3-7-12/weka-3-7-12/data/iris.arff"));
            Instances Train = new Instances(breader);
            Train.setClassIndex(Train.numAttributes() - 1);
            SimpleKMeans kMeans = new SimpleKMeans();
            kMeans.setSeed(10);
            kMeans.setPreserveInstancesOrder

compute clustersize automatically for kmeans

Question: I am using scikit-learn and experimenting with KMeans. It's fast, but it requires the number of clusters as an argument. What I would like to try is to compute the number of clusters automatically, based on the population of documents. The hash-based near-neighbor algorithms (ssdeep) I used before can produce similarity clusters based on distance; how can I get the cluster count automatically for k-means? KMeans(init='k-means++', n_clusters=cluster_count, n_init=10), name="k-means++", data=data) I want to calculate that
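One common heuristic for this is to run k-means for several candidate values of k and keep the one with the highest mean silhouette score. The sketch below illustrates the idea in plain Python and is not the asker's code: `farthest_first`, `kmeans`, `silhouette`, and `pick_k` are hypothetical helper names, and the deterministic farthest-point seeding is a stand-in for k-means++ chosen only to keep the example reproducible. In scikit-learn, `sklearn.metrics.silhouette_score` computes the same kind of score.

```python
import math

def farthest_first(points, k):
    """Deterministic seeding (an assumption, not k-means++): start from the
    first point, then repeatedly add the point farthest from chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=100):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient in [-1, 1]; higher means tighter clusters."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0  # undefined for one cluster; treated as worst case here
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def pick_k(points, k_values):
    """Return the candidate k with the highest mean silhouette score."""
    return max(k_values, key=lambda k: silhouette(points, kmeans(points, k)))
```

For example, `pick_k(points, range(2, 6))` tries k = 2 through 5. The silhouette computation needs all pairwise distances, so for large document collections it is usually evaluated on a sample.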

opencv multidimensional kmeans

Question: I'm trying to run the kmeans algorithm on n-dimensional data. I have N points, and each point has n features { x, y, z, ... , n }. My code is the following:
    cv::Mat points(N, n, CV_32F);
    // fill the data points
    cv::Mat labels;
    cv::Mat centers;
    cv::kmeans(points, k, labels, cv::TermCriteria(CV_TERMCRIT_ITER|CV_TERMCRIT_EPS, 1000, 0.001), 10, cv::KMEANS_PP_CENTERS, centers);
The problem is that the kmeans algorithm runs into a segmentation fault. Any help is appreciated. Update: How Miki and Micka

Find Jaccard distance of tweets and cluster in Kmeans

Question: This is a follow-up question to a problem I've been working on for a while. I have two questions. The first concerns an algorithm that works on two tweets, which I revised to handle 10 tweets. I'm wondering what my revision is measuring. I get a result, but I want it to measure the Jaccard distances between several tweets, not just return one value. Since it's returning one value, I think it's just adding everything up. The other question is about my attempt to create a for loop and assign clusters. I'm trying
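To get one Jaccard distance per pair of tweets rather than a single number, the distance can be computed inside a nested loop that fills a matrix. The following is a minimal sketch under assumptions: tweets are tokenized by lowercasing and splitting on whitespace, and the helper names (`jaccard_distance`, `distance_matrix`, `assign_to_medoids`) are hypothetical. Because a "mean" of word sets is not well defined, the assignment step uses representative tweets (medoids) instead of centroids.

```python
def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| over the sets of lowercase tokens."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    union = set_a | set_b
    if not union:
        return 0.0  # two empty tweets are treated as identical
    return 1.0 - len(set_a & set_b) / len(union)

def distance_matrix(tweets):
    """Pairwise distances: entry [i][j] compares tweet i with tweet j."""
    return [[jaccard_distance(s, t) for t in tweets] for s in tweets]

def assign_to_medoids(tweets, medoid_indices):
    """Label each tweet with the position of its nearest representative tweet."""
    labels = []
    for tweet in tweets:
        distances = [jaccard_distance(tweet, tweets[m]) for m in medoid_indices]
        labels.append(distances.index(min(distances)))
    return labels
```

With 10 tweets, `distance_matrix` returns a 10 x 10 symmetric matrix with zeros on the diagonal, which makes it easy to check that the code is not summing everything into one value.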

Drawbacks of K-Medoid (PAM) Algorithm

Question: I have read that the K-medoid algorithm (PAM) is a partition-based clustering algorithm and a variant of the K-means algorithm. It addresses problems of K-means such as producing empty clusters and sensitivity to outliers and noise. However, the time complexity of K-medoid is O(n^2), unlike K-means (Lloyd's algorithm), which has a time complexity of O(n). I would like to ask whether K-medoid has other drawbacks aside from its time complexity. Answer 1: The main disadvantage of K-Medoid
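The trade-off mentioned in the question can be illustrated with a single cluster. This is only a sketch with hypothetical helper names (`mean_point`, `medoid`), not the PAM algorithm itself: the mean is cheap to compute but gets dragged toward an outlier, while the medoid stays on an actual data point at the cost of comparing every pair of points.

```python
import math

def mean_point(points):
    """Centroid as used by k-means: one pass over the points."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def medoid(points):
    """Member with the smallest total distance to all others.
    Needs all pairwise distances, so the cost grows quadratically."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))

cluster = [(1, 1), (2, 1), (1, 2), (2, 2), (100, 100)]  # last point is an outlier
center_mean = mean_point(cluster)   # pulled far away from the four nearby points
center_medoid = medoid(cluster)     # remains one of the four nearby points
```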

k-means for text clustering

Question: I'm trying to implement k-means for text clustering, specifically for English sentences. So far I have a term-frequency matrix with one row per document (sentence). I'm a little confused about the actual implementation of k-means on text data. Here's my guess at how it should work. Figure out the number of unique words across all sentences (a large number, call it n). Create k n-dimensional vectors (clusters) and fill in the values of the k vectors with some random numbers (how do I
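The steps described in the question can be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: tokens come from lowercasing and whitespace splitting, the helper names `tf_vectors` and `kmeans` are hypothetical, and the initial centroids are copies of existing document vectors (selected by `init_indices`) rather than random numbers, which is a common alternative because it keeps the centroids inside the data's range.

```python
import math

def tf_vectors(sentences):
    """Build the vocabulary and one term-frequency vector per sentence."""
    tokenized = [s.lower().split() for s in sentences]
    vocab = sorted({w for words in tokenized for w in words})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for words in tokenized:
        v = [0.0] * len(vocab)  # n-dimensional, n = number of unique words
        for w in words:
            v[index[w]] += 1.0
        vectors.append(v)
    return vocab, vectors

def kmeans(vectors, init_indices, iters=20):
    """Lloyd's algorithm; centroids start as copies of the chosen documents."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = []
    for _ in range(iters):
        # Assignment step: each document goes to its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda c: math.dist(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid becomes the mean of its documents.
        for c in range(len(centroids)):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

For real text, the term-frequency vectors are usually length-normalized or TF-IDF weighted first, since raw Euclidean distance favors sentences of similar length.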

How to replace the appropriate colors with my own palette in MATLAB?

Question: I am using MATLAB 2015. I want to reduce the number of colors in an image. An RGB image will be segmented using the k-means algorithm. Then the mean colors will be replaced with colors from my own palette. There are 10 colors: black - [0, 0, 0], yellow - [255, 255, 0], orange - [255, 128, 0], white - [255, 255, 255], pink - [255, 153, 255], lavender - [120, 102, 255], brown - [153, 51, 0], green - [0, 255, 0], blue - [0, 0, 255], red - [255, 0, 0]. I have succeeded in clustering the image. Clustered images should
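One way to do the replacement step is to map each cluster's mean color to the nearest palette entry by squared Euclidean distance in RGB space. The sketch below shows the idea in Python rather than MATLAB; `PALETTE`, `nearest_palette_color`, and `recolor` are hypothetical names, the palette values are taken from the question, and black is assumed to be (0, 0, 0).

```python
# Palette from the question; black is assumed to be (0, 0, 0).
PALETTE = {
    "black": (0, 0, 0),
    "yellow": (255, 255, 0),
    "orange": (255, 128, 0),
    "white": (255, 255, 255),
    "pink": (255, 153, 255),
    "lavender": (120, 102, 255),
    "brown": (153, 51, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "red": (255, 0, 0),
}

def nearest_palette_color(rgb, palette=PALETTE):
    """Return (name, rgb) of the palette entry closest to the given color."""
    return min(palette.items(),
               key=lambda item: sum((a - b) ** 2 for a, b in zip(rgb, item[1])))

def recolor(cluster_means, palette=PALETTE):
    """Replace every cluster mean with its nearest palette color."""
    return [nearest_palette_color(mean, palette)[1] for mean in cluster_means]
```

Distances in plain RGB do not match perceived color differences very well, so a perceptual color space such as L*a*b* may give more natural matches.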

Weighting k Means Clustering by number of observations

Question: I would like to cluster some data using k-means in R. The data looks as follows.
ADP  NS  CNTR  PP2V  EML  PP1V  ADDPS  FB  PP1D  ADR  ISV  PP2D  ADSEM  SUMALL  CONV
  2   0     0     1    0     0      0   0     0   12    0    12      0      53     0
  2   0     0     1    0     0      0   0     0   14    0    25      0      53     0
  2   0     0     1    0     0      0   0     0   15    0     0      0      53     0
  2   0     0     1    0     0      0   0     0   15    0     4      0      53     0
  2   0     0     1    0     0      0   0     0   17    0     0      0      53     0
  2   0     0     1    0     0      0   0     0   18    0     0      0     106     0
  2   0     0     1    0     0      0   0     0   23    0    10      0      53     0
  2   0     0     1    0     0      1   0     0    0    0     1      0     106     0
  2   0     0     1    0     0      3   0     0    0    0     0      0      53     0
  2   0     0     2    0     0      0   0     0    0    0     0      0    3922     0
  2   0     0     2    0     0      0   0     0    0    0     1      0
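Weighting k-means by the number of observations usually means changing only the update step: each centroid becomes the weighted mean of its rows, which gives the same centroids as repeating every row as many times as its count. The sketch below shows this in Python rather than R. The names `weighted_centroid` and `weighted_kmeans` are hypothetical, `weights` stands in for whichever column holds the observation count (the excerpt does not say which one), and the initial centers are chosen by the caller through `init_indices`.

```python
import math

def weighted_centroid(points, weights):
    """Weighted mean of the points, one coordinate at a time."""
    total = sum(weights)
    dims = len(points[0])
    return tuple(sum(w * p[d] for p, w in zip(points, weights)) / total
                 for d in range(dims))

def weighted_kmeans(points, weights, init_indices, iters=20):
    """Lloyd's algorithm with a weighted update step."""
    centers = [points[i] for i in init_indices]
    labels = []
    for _ in range(iters):
        # Assignment is unchanged: nearest center by Euclidean distance.
        labels = [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update uses the observation counts as weights.
        for c in range(len(centers)):
            idx = [i for i, l in enumerate(labels) if l == c]
            if idx:
                centers[c] = weighted_centroid([points[i] for i in idx],
                                               [weights[i] for i in idx])
    return labels, centers
```

The weight column itself should be left out of the feature vectors, otherwise it influences the distances as well as the centroids.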

how to import logistic regression and kmeans pmml files into r

Question: I am looking for some guidance on importing PMML model files into R. PMML (Predictive Model Markup Language) allows models built in one system to be deployed in another. I have several models that were trained in SPSS and saved in the XML format using PMML. They are logistic regression and k-means models. I have searched exhaustively for R capabilities to import PMML and am finding that there is only a rare function here and there in packages such as arules for

matlab k-means clustering evaluation [duplicate]

Question: This question already has answers here: Evaluating K-means accuracy (2 answers). Closed last year. How can I effectively evaluate the performance of the standard MATLAB k-means implementation? For example, I have a matrix X: X = [1 2; 3 4; 2 5; 83 76; 97 89]. For every point I have a gold-standard clustering. Let's assume that (83,76) and (97,89) form the first cluster and (1,2), (3,4), and (2,5) form the second cluster. Then we run MATLAB's idx = kmeans(X,2) and get the following results: idx = [1; 1; 2; 2; 2]
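Because the cluster numbers returned by k-means are arbitrary, comparing `idx` to the gold labels position by position can be misleading. A pair-counting measure such as the Rand index avoids this by checking, for every pair of points, whether both labelings agree on "same cluster" versus "different cluster". The following Python sketch (the function name `rand_index` is hypothetical) applies it to the example from the question, with the gold labels written in the same point order as X.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree.
    Invariant to renaming clusters; needs at least two points."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)

idx = [1, 1, 2, 2, 2]    # k-means result from the question
gold = [2, 2, 2, 1, 1]   # (1,2), (3,4), (2,5) -> cluster 2; (83,76), (97,89) -> cluster 1
score = rand_index(idx, gold)  # 6 of the 10 pairs agree, so 0.6
```

A score of 1.0 means the two partitions are identical up to renaming of the clusters. The adjusted Rand index additionally corrects for agreement expected by chance.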