Variation on “How to plot decision boundary of a k-nearest neighbor classifier from Elements of Statistical Learning?”

Anonymous (unverified), submitted 2019-12-03 01:23:02

Question:

This is a question related to https://stats.stackexchange.com/questions/21572/how-to-plot-decision-boundary-of-a-k-nearest-neighbor-classifier-from-elements-o

For completeness, here's the original example from that link:

library(ElemStatLearn)
require(class)
x 0.5, "coral", "cornflowerblue"))
box()

I've been playing with that example, and would like to try to make it work with three classes. I can change some values of g with something like

g[8:16] 

just to pretend that some samples come from a third class. I can't make the plot work, though. I guess I need to change the lines that deal with the proportion of votes for the winning class:

prob 

and also the levels on the contour:

contour(px1, px2, prob15, levels=0.5, labels="", xlab="", ylab="", main= "15-nearest neighbour", axes=FALSE) 

I am also not sure contour is the right tool for this. One alternative that works is to create a matrix of points that covers the region I'm interested in, classify each point of this matrix, and plot the points with a large marker and a different color per class, similar to what is being done with the points(gd...) bit.
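The grid-colouring alternative described above can be sketched in base R. This is only an illustration under stated assumptions: the built-in `iris3` array stands in for a three-class data set (so the sketch does not depend on ElemStatLearn), and the grid step, the colours, and the names `train`, `cl`, `test`, `pred` and `cols` are arbitrary choices, not part of the original example.

```r
library(class)  # provides knn()

# Assumption: iris3 (datasets package) stands in for a three-class problem;
# the first two measurements are used as the two plotting dimensions.
train <- rbind(iris3[1:25, 1:2, 1], iris3[1:25, 1:2, 2], iris3[1:25, 1:2, 3])
cl <- factor(c(rep("s", 25), rep("c", 25), rep("v", 25)))

# Grid of points covering the region of interest (step 0.1 is arbitrary)
test <- expand.grid(x = seq(min(train[, 1]) - 1, max(train[, 1]) + 1, by = 0.1),
                    y = seq(min(train[, 2]) - 1, max(train[, 2]) + 1, by = 0.1))

# Classify every grid point with 15 nearest neighbours
pred <- knn(train, test, cl, k = 15)

# One (arbitrary) colour per class label
cols <- c(c = "coral", s = "cornflowerblue", v = "darkseagreen")

# Small coloured dots show the predicted class across the region,
# larger open circles show the training samples
plot(test$x, test$y, pch = ".", cex = 1.2, col = cols[as.character(pred)],
     xlab = "", ylab = "", main = "15-nearest neighbour, three classes")
points(train, col = cols[as.character(cl)])
```

The boundary is not drawn explicitly here; it appears wherever the dot colour changes, which works for any number of classes.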

The final goal is to show the different decision boundaries generated by different classifiers. Can someone point me in the right direction?

thanks Rafael

Answer 1:

Separating the code into its main parts helps outline how to achieve this:

Training data with 3 classes

 train 
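The code for this step is incomplete here. As a minimal sketch of what it could look like, assuming the built-in `iris3` array as a stand-in data source (an assumption, not necessarily the data the answer used), a two-column matrix `train` and a matching label factor `cl` can be built as follows:

```r
# Assumption: iris3 is a 50 x 4 x 3 array (observations x measurements x species)
# from the datasets package; the first two measurements serve as features.
train <- rbind(iris3[1:25, 1:2, 1],
               iris3[1:25, 1:2, 2],
               iris3[1:25, 1:2, 3])

# One label per row of train, in the same order as the rbind() above.
# The labels "s", "c", "v" are illustrative names for the three species.
cl <- factor(c(rep("s", 25), rep("c", 25), rep("v", 25)))
```

Keeping only two features is what makes it possible to draw the decision regions in a plane.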

Test data covering a grid

 require(MASS)   test 
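The code for this step is incomplete here. One way to build such a grid is `expand.grid()` from base R. The sketch below assumes `train` is a two-column numeric matrix (here filled from `iris3` as a stand-in) and uses an arbitrary margin of one unit and an arbitrary step of 0.1.

```r
# Assumption: stand-in training matrix with two numeric columns
train <- rbind(iris3[1:25, 1:2, 1], iris3[1:25, 1:2, 2], iris3[1:25, 1:2, 3])

# Regular grid extending one unit beyond the data in each direction.
# Column names x and y match the aesthetics used in the plotting code.
test <- expand.grid(x = seq(min(train[, 1]) - 1, max(train[, 1]) + 1, by = 0.1),
                    y = seq(min(train[, 2]) - 1, max(train[, 2]) + 1, by = 0.1))
```

A smaller step gives smoother boundaries at the cost of more points to classify.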

Classification for that grid

3 classes obviously

 require(class)  classif 
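The code for this step is incomplete here. A minimal sketch of classifying the grid with `knn()` from the class package follows; the data (`iris3` as a stand-in) and `k = 3` are assumptions made for illustration.

```r
library(class)  # provides knn()

# Assumptions: stand-in training data, labels and grid
train <- rbind(iris3[1:25, 1:2, 1], iris3[1:25, 1:2, 2], iris3[1:25, 1:2, 3])
cl <- factor(c(rep("s", 25), rep("c", 25), rep("v", 25)))
test <- expand.grid(x = seq(min(train[, 1]) - 1, max(train[, 1]) + 1, by = 0.1),
                    y = seq(min(train[, 2]) - 1, max(train[, 2]) + 1, by = 0.1))

# Predicted class for every grid point. With prob = TRUE, knn() attaches
# the proportion of votes for the winning class as the attribute "prob".
classif <- knn(train, test, cl, k = 3, prob = TRUE)
prob <- attr(classif, "prob")
```

With more than two classes, `prob` only describes the winning class, so `1 - prob` no longer identifies a single "other" class the way it does in the two-class example.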

Data structure for plotting

 require(dplyr)   dataf 
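The code for this step is incomplete here. The plotting code below expects a data frame `dataf` with columns `x`, `y`, `cls` and `prob_cls`. One possible layout, sketched here in base R rather than dplyr, is a long data frame holding one copy of the grid per class, where `prob_cls` is a 0/1 indicator of whether that grid point was assigned to the class. The stand-in data and `k = 3` are assumptions.

```r
library(class)  # provides knn()

# Assumptions: stand-in training data, labels, grid and classification
train <- rbind(iris3[1:25, 1:2, 1], iris3[1:25, 1:2, 2], iris3[1:25, 1:2, 3])
cl <- factor(c(rep("s", 25), rep("c", 25), rep("v", 25)))
test <- expand.grid(x = seq(min(train[, 1]) - 1, max(train[, 1]) + 1, by = 0.1),
                    y = seq(min(train[, 2]) - 1, max(train[, 2]) + 1, by = 0.1))
classif <- knn(train, test, cl, k = 3, prob = TRUE)
prob <- attr(classif, "prob")

# One block of rows per class: the full grid, the winning-vote proportion,
# the class label, and an indicator that is 1 where the grid point was
# classified as that class and 0 elsewhere.
dataf <- do.call(rbind, lapply(levels(cl), function(k) {
  cbind(test, prob = prob, cls = k, prob_cls = as.numeric(classif == k))
}))
```

Contouring each class's 0/1 indicator separately yields one closed boundary per class, which is why this layout extends to any number of classes.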

Plot

 require(ggplot2)

 ggplot(dataf) +
     geom_point(aes(x=x, y=y, col=cls),
                data = mutate(test, cls=classif),
                size=1.2) +
     geom_contour(aes(x=x, y=y, z=prob_cls, group=cls, color=cls),
                  bins=2,
                  data=dataf) +
     geom_point(aes(x=x, y=y, col=cls),
                size=3,
                data=data.frame(x=train[,1], y=train[,2], cls=cl))

We can also be a little fancier and map the probability of class membership to the point size, as an indication of the "confidence".

 ggplot(dataf) +
     geom_point(aes(x=x, y=y, col=cls, size=prob),
                data = mutate(test, cls=classif)) +
     scale_size(range=c(0.8, 2)) +
     geom_contour(aes(x=x, y=y, z=prob_cls, group=cls, color=cls),
                  bins=2,
                  data=dataf) +
     geom_point(aes(x=x, y=y, col=cls),
                size=3,
                data=data.frame(x=train[,1], y=train[,2], cls=cl)) +
     geom_point(aes(x=x, y=y),
                size=3, shape=1,
                data=data.frame(x=train[,1], y=train[,2], cls=cl))


