cluster-analysis | 易学教程

Performance Analysis of Clustering Algorithms

Question: I have been given two data sets and want to perform cluster analysis on them using KNIME. Once I have completed the clustering, I wish to compare the performance of two different clustering algorithms. With regard to performance analysis of clustering algorithms, would this be a measure of time (algorithm time complexity, the time taken to cluster the data, etc.), the validity of the resulting clusters, or both? Is there any other angle one can look at to

Identifying points by color

Question: I am following the tutorial at https://www.rpubs.com/loveb/som, which shows how to use a Kohonen network (also called a self-organizing map, or SOM, a type of machine learning algorithm) on the iris data. I ran this code from the tutorial:

    library(kohonen)       # fitting SOMs
    library(ggplot2)       # plots
    library(GGally)        # plots
    library(RColorBrewer)  # colors, using predefined palettes

    iris_complete <- iris[complete.cases(iris), ]
    iris_unique <- unique(iris_complete)  # remove duplicates

    # scale data
    iris.sc = scale

Factoextra: How to change color of the average silhouette width in the fviz_silhouette function?

Question: I'm curious whether there is a way to override the color of the default red dashed line for the average silhouette width in the fviz_silhouette function. I just looked at the fviz_silhouette code, and it puzzles me why the author fixed the line color parameter. (Listing from the function source code.)

    p <- ggplot(df, mapping) +
      geom_bar(stat = "identity") +
      labs(y = "Silhouette width Si", x = "",
           title = paste0("Clusters silhouette plot ",
                          "\n Average silhouette width: ",
                          round(mean(df$sil_width), 2)

Clustering multivariate time series - question regarding distance matrix

Question: I am trying to cluster meteorological stations using R. The stations provide data such as temperature, wind speed, humidity, and more at hourly intervals. I can easily cluster univariate time series using the tsclust library, but when I cluster multivariate series I get errors. My data is a list in which each element is a matrix holding the time series of one station (columns are variables and rows are timestamps). If I run: tsclust(data, k = 2, distance = 'Euclidean', seed = 3247,