Creating 2D bins in R

廉价感情. Submitted on 2019-12-03 09:03:35

You can get the binned data using the bin2 function in the ash package.

Regarding the sparsity of data in the region around the red point, one possible solution is the average shifted histogram. It bins your data after shifting the histogram several times and then averages the bin counts, which reduces the dependence on the bin origin. For example, imagine how the number of points in the bin containing the red point changes depending on whether the red point sits at the top left or the bottom right of the bin.

library(ash)
bins <- bin2(cbind(x,y))
f <- ash2(bins, m = c(10,10))

image(f$x,f$y,f$z)
contour(f$x,f$y,f$z,add=TRUE)
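The bin-origin effect described above can be seen in a small base-R sketch. It is only an illustration: the data, the point `pt`, the bin width `h` and the number of shifts `m` are made-up assumptions, and it takes a plain average over shifted grids rather than reproducing what ash2 does internally.

```r
# Hypothetical data; x, y and pt stand in for the question's data and red point.
set.seed(1)
x <- rnorm(500)
y <- rnorm(500)
pt <- c(0.3, 0.3)  # point of interest
h <- 0.5           # bin width in both directions
m <- 5             # number of shifted grids per axis

# Count the points that share a bin with `pt` when the grid origin is (ox, oy).
count_in_bin <- function(ox, oy) {
  same_x <- floor((x - ox) / h) == floor((pt[1] - ox) / h)
  same_y <- floor((y - oy) / h) == floor((pt[2] - oy) / h)
  sum(same_x & same_y)
}

# Shift the origin by multiples of h / m along each axis.
shifts <- (seq_len(m) - 1) * h / m
counts <- outer(shifts, shifts, Vectorize(count_in_bin))

range(counts)  # spread caused purely by the choice of origin
mean(counts)   # averaging over the shifts removes that dependence
```

The spread in `counts` is the origin problem; the mean is the kind of value an averaged estimate reports for the neighbourhood of `pt`.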

If you would like smoother bins, you could try increasing the argument m, which is a vector of length 2 controlling the smoothing parameters along each variable.

f2 <- ash2(bins, m = c(20,20))  # larger m than for f, so the estimate is smoother
image(f2$x, f2$y, f2$z)
contour(f2$x,f2$y,f2$z,add=TRUE)

Compare the plots of f and f2.

The binning algorithm is implemented in Fortran and is very fast.
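For reference, the raw counts that a 2D binning step produces can also be sketched in base R with cut() and table(). This is not how bin2 is implemented; the data, the range and the number of bins below are assumptions chosen for illustration.

```r
# Hypothetical data standing in for the question's x and y.
set.seed(42)
x <- runif(200)
y <- runif(200)

nbin <- c(4, 4)  # number of bins along x and y (an arbitrary choice)

# Equal-width break points over a fixed range [0, 1] on each axis.
bx <- seq(0, 1, length.out = nbin[1] + 1)
by <- seq(0, 1, length.out = nbin[2] + 1)

# Assign each point to a bin, then cross-tabulate into a count matrix.
ix <- cut(x, breaks = bx, include.lowest = TRUE)
iy <- cut(y, breaks = by, include.lowest = TRUE)
counts <- table(ix, iy)

counts
```

Each cell of `counts` is the number of points in one rectangular bin. This is slower than compiled code but can be handy for checking results on small data.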

If you're willing to use ggplot2, there are some nice options.

library(ggplot2)
ggplot(data.frame(x,y), aes(x,y)) + geom_bin2d()

ggplot(data.frame(x,y), aes(x,y)) + stat_density_2d(aes(fill = after_stat(level)), geom = "polygon")

Update: To calculate a smoothed 2D binning, you could use bivariate normal kernel density smoothing:

library(KernSmooth)
bins <- bkde2D(as.matrix(data.frame(x, y)), bandwidth = c(2, 2), gridsize = c(25L, 25L))

which can also be plotted as

library(reshape2)
ggplot(melt(bins$fhat), aes(Var1, Var2, fill = value)) + geom_raster()

The bins object contains the grid points along x and y (x1 and x2) and the normalised density estimate fhat. Play with the gridsize (number of grid points in each direction) and bandwidth (smoothing scale) to get what you're after.
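As a rough sketch of how the bandwidth changes the result, the following fits the same simulated data with two smoothing scales. The data and both bandwidth values are assumptions made for illustration; only bkde2D and its bandwidth and gridsize arguments come from the answer above.

```r
library(KernSmooth)

# Hypothetical data standing in for the question's x and y.
set.seed(123)
x <- rnorm(500)
y <- rnorm(500)
xy <- cbind(x, y)

# Same grid size, two smoothing scales (both values are arbitrary).
fit_small <- bkde2D(xy, bandwidth = c(0.3, 0.3), gridsize = c(25L, 25L))
fit_large <- bkde2D(xy, bandwidth = c(1, 1), gridsize = c(25L, 25L))

dim(fit_small$fhat)  # one density value per grid point
max(fit_small$fhat)  # peak height with the small bandwidth
max(fit_large$fhat)  # peak height with the large bandwidth
```

A larger bandwidth spreads each point's contribution over a wider area, so the surface is flatter and its peak lower; a smaller one follows the data more closely but is noisier.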
