Assign new data point to cluster in kernel k-means (kernlab package in R)?

Posted by 半城伤御伤魂 on 2019-11-28 07:41:01

Kernel k-means uses a kernel function to measure the similarity between objects. In plain k-means you loop through all centroids and select the one that minimizes the distance (under the chosen metric) to the given data point. In the kernel version (the default kernel function in kkmeans is the radial basis function, RBF), you likewise loop through the centroids and select the one that maximizes the kernel function value (valid for RBF) or minimizes the kernel-induced distance (valid for any kernel). A detailed description of converting a kernel into a distance measure is provided here. In general, the distance induced by a kernel K can be computed as d^2(a,b) = K(a,a) + K(b,b) - 2K(a,b). For RBF, K(x,x) = 1 for all x, so d^2(a,b) = 2 - 2K(a,b), and you can simply maximize K(a,b) instead of minimizing the whole expression K(a,a) + K(b,b) - 2K(a,b).
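The equivalence between maximizing the kernel value and minimizing the kernel-induced distance can be checked numerically. The following is a minimal sketch in base R, assuming an RBF kernel of the form exp(-sigma * ||a - b||^2). The names `rbf` and `kernel_dist2`, the value of `sigma`, and the centroid values are illustrative assumptions; they do not come from kernlab or from a fitted model.

```r
# Hypothetical RBF kernel: K(a, b) = exp(-sigma * ||a - b||^2)
rbf <- function(a, b, sigma = 0.5) exp(-sigma * sum((a - b)^2))

# Squared kernel-induced distance: d^2(a, b) = K(a,a) + K(b,b) - 2 K(a,b)
kernel_dist2 <- function(a, b, K) K(a, a) + K(b, b) - 2 * K(a, b)

# Made-up centroids (one per row) and a point to assign
centroids <- rbind(c(6.8, 3.1, 5.7, 2.1),
                   c(5.9, 2.7, 4.4, 1.4),
                   c(5.0, 3.4, 1.5, 0.2))
x <- c(5.0, 3.6, 1.2, 0.4)

# Kernel similarity and kernel-induced distance from x to every centroid
sims  <- apply(centroids, 1, function(m) rbf(x, m))
dists <- apply(centroids, 1, function(m) kernel_dist2(x, m, rbf))

# For RBF, K(x, x) = 1, so d^2 = 2 - 2 K and both criteria pick the same row
by_similarity <- which.max(sims)
by_distance   <- which.min(dists)
```

Because the distance is a decreasing function of the kernel value here, `by_similarity` and `by_distance` agree. For a kernel where K(x,x) is not constant, only the distance-based criterion is safe.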

To get the kernel function from a kkmeans object, you can use the kernelf function:

> library(kernlab)
> data(iris)
> sc <- kkmeans(as.matrix(iris[,-5]), centers=3)
> K <- kernelf(sc)

So, for your example:

> c=centers(sc)
> x=c(5.0, 3.6, 1.2, 0.4)
> K(x,c[1,])
             [,1]
[1,] 1.303795e-11
> K(x,c[2,])
             [,1]
[1,] 8.038534e-06
> K(x,c[3,])
          [,1]
[1,] 0.8132268
> which.max( c( K(x,c[1,]), K(x,c[2,]), K(x,c[3,]) ) )
[1] 3

So the closest centroid is c[3,] = (5.032692, 3.401923, 1.598077, 0.3115385), in the sense of the kernel function used.
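The manual comparison above can be wrapped in a small helper that works for any kernel by minimizing the kernel-induced distance. This is only a sketch: `assign_cluster` is a hypothetical name, and the `rbf` and `linear` kernels and the centroid values are stand-ins so that the block runs without a fitted model.

```r
# Hypothetical helper: index of the centroid closest to x under kernel K.
# `centroids` holds one centroid per row; K is any function K(a, b).
assign_cluster <- function(x, centroids, K) {
  d2 <- apply(centroids, 1, function(m) {
    # as.numeric() also flattens a 1x1 matrix result into a plain number
    as.numeric(K(x, x) + K(m, m) - 2 * K(x, m))
  })
  which.min(d2)
}

# Stand-in kernels (assumptions for illustration only)
rbf    <- function(a, b, sigma = 0.5) exp(-sigma * sum((a - b)^2))
linear <- function(a, b) sum(a * b)

# Made-up centroids and the point from the example
centroids <- rbind(c(6.8, 3.1, 5.7, 2.1),
                   c(5.9, 2.7, 4.4, 1.4),
                   c(5.0, 3.4, 1.5, 0.2))
x <- c(5.0, 3.6, 1.2, 0.4)

cluster_rbf <- assign_cluster(x, centroids, rbf)
cluster_lin <- assign_cluster(x, centroids, linear)
```

With the fitted model from this answer, the call would presumably be `assign_cluster(x, centers(sc), kernelf(sc))`. Using the distance rather than `which.max` on the kernel value keeps the helper valid for kernels where K(x,x) is not constant.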
