Question
How do I quickly visualize large matrices in R?
I sometimes work with large-ish numeric matrices (e.g. 3000 x 3000), and quickly visualizing them is a very helpful quality control step. This was very easy and fast in Matlab, my previous language of choice. For example, it takes 0.5 seconds to display a 1000x1000 matrix:
rand_matrix = rand(1000,1000);
tic
imagesc(rand_matrix)
toc
>> Elapsed time is 0.463903 seconds.
I'd like the same capability in R, but unfortunately visualizing matrices seems very slow in R. For example, using image.plot(), the same random matrix takes more than 10 seconds to display:
require(tictoc)
require(fields)  # image.plot() is provided by the fields package
mm = 1000
nn = 1000
rand.matrix = matrix(runif(mm*nn), ncol=mm, nrow=nn)
tic("Visualizing matrix")
image.plot(rand.matrix)
toc()
> Visualizing matrix: 11.744 sec elapsed
The problem gets worse as the matrices get bigger. For example, a 3000x3000 matrix takes minutes to visualize in R, compared to seconds in Matlab. This obviously doesn't really work for data exploration. I've tried ggplot, and melting + geom_raster() can still take up to a minute.
What am I doing wrong? Is there a fast way to visualize matrices in R? An ideal solution would take one or two lines.
Answer 1:
I get a plot pretty quickly when using image(m, useRaster = TRUE):
start = Sys.time()
image(rand.matrix, useRaster = TRUE)
print(Sys.time() - start)
# Time difference of 0.326 secs
Without useRaster = TRUE this takes 1.5 seconds. useRaster speeds this up, but I think it only works for simple, evenly spaced points (a regular grid).
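The regular-grid restriction can be checked with a small sketch. It assumes a tiny 4 x 5 matrix and made-up coordinate vectors (x_regular, x_irregular and y are illustrative names, not from the question), and draws to a null PDF device so that it also runs without a display:

```r
# Open a null PDF device so nothing is shown or written to disk.
pdf(NULL)

z <- matrix(runif(20), nrow = 4, ncol = 5)
y <- 1:5

# Evenly spaced x coordinates: the raster path is accepted.
x_regular <- c(1, 2, 3, 4)
regular_ok <- tryCatch({
  image(x_regular, y, z, useRaster = TRUE)
  TRUE
}, error = function(e) FALSE)

# Unevenly spaced x coordinates: image() raises an error with useRaster = TRUE.
x_irregular <- c(1, 2, 4, 8)
irregular_ok <- tryCatch({
  image(x_irregular, y, z, useRaster = TRUE)
  TRUE
}, error = function(e) FALSE)

invisible(dev.off())

cat("regular grid accepted:  ", regular_ok, "\n")
cat("irregular grid accepted:", irregular_ok, "\n")
```

A plain matrix passed as image(m) always sits on a regular grid, so the restriction should only matter when explicit x and y coordinates are supplied.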
If your ultimate goal is to produce an image file with this plot, then I think it might be most efficient to output directly to a raster format like png, although it's a little tricky to measure exactly how long R is taking to save the image file, e.g.:
png("image_plot.png", width = 1000, height = 1000)
image(rand.matrix, useRaster = TRUE)
dev.off()
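One way to get a rough timing for the file output is to wrap the whole sequence, including dev.off(), in system.time(). This is only a sketch: the temporary file path is chosen for illustration, the matrix is regenerated so the snippet is self-contained, and the result will vary with the machine and the PNG backend R was built with.

```r
rand.matrix <- matrix(runif(1000 * 1000), nrow = 1000, ncol = 1000)

# Temporary output path, used here only so the sketch does not clutter the working directory.
out_file <- tempfile(fileext = ".png")

timing <- system.time({
  png(out_file, width = 1000, height = 1000)
  image(rand.matrix, useRaster = TRUE)
  dev.off()  # the file is only complete once the device is closed
})

cat("elapsed seconds:", timing[["elapsed"]], "\n")
cat("file written:", file.exists(out_file), "\n")
```

Including dev.off() inside the timed block matters because the PNG is not finalized until the device is closed.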
Source: https://stackoverflow.com/questions/50381307/data-exploration-in-r-display-heatmap-of-large-matrix-quickly