Plotting a large number of time series using ggplot. Is it possible to speed up?

旧街凉风, posted on 2019-12-01 05:05:55

Part of your question asks for a "better way to plot these data".

In that spirit, you seem to have two problems. First, you expect to plot >35,000 points along the x-axis, which, as some of the comments point out, will result in pixel overlap on anything but an extremely large, high-resolution monitor (a typical display is only a few thousand pixels wide, so many points end up sharing each pixel column). Second, and more important IMO, you are trying to plot 69 time series (stations) on the same plot. In this type of situation a heatmap might be a better approach.

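library(data.table)
library(ggplot2)
library(RColorBrewer)      # for brewer.pal(...)
# melt(...) below is data.table's own method (data.table >= 1.9.6), so reshape2 is not needed
url <-  "http://dl.dropboxusercontent.com/s/bxioonfzqa4np6y/timeSeries.txt"
dt  <- fread(url)
dt[,Year:=year(as.Date(date))]     # add a Year column derived from the date column

# wide -> long: drop the first column (assumed to be the date), one row per Year/Station/value
dt.melt  <- melt(dt[,-1,with=FALSE],id.vars="Year",variable.name="Station")
# total precipitation per station per year
dt.agg   <- dt.melt[,list(y=sum(value)),by=list(Year,Station)]
# reverse the factor levels so the first station is drawn at the top of the y-axis
dt.agg[,Station:=factor(Station,levels=rev(levels(Station)))]
ggplot(dt.agg,aes(x=Year,y=Station)) + 
  geom_tile(aes(fill=y)) +
  scale_fill_gradientn("Annual\nPrecip. [mm]",
                       colours=rev(brewer.pal(9,"Spectral")))+
  scale_x_continuous(expand=c(0,0))+
  coord_fixed()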
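
If the URL above is not reachable, the same melt / aggregate / geom_tile steps can be tried on made-up data. The following is only a sketch: the station names (S1 to S5), the date range, and the random "precipitation" values are all invented for illustration and have nothing to do with your actual dataset.

```r
library(data.table)
library(ggplot2)

set.seed(1)                                   # reproducible fake data
dates <- seq(as.Date("2000-01-01"), as.Date("2004-12-31"), by = "day")
toy   <- data.table(date = dates)
for (s in paste0("S", 1:5)) {                 # five hypothetical stations
  toy[, (s) := rexp(length(dates), rate = 0.5)]   # invented daily values
}

toy[, Year := year(date)]                     # year of each observation
# wide -> long, dropping the date column (column 1)
toy.melt <- melt(toy[, -1, with = FALSE], id.vars = "Year",
                 variable.name = "Station")
# one total per station per year
toy.agg  <- toy.melt[, list(y = sum(value)), by = list(Year, Station)]

p <- ggplot(toy.agg, aes(x = Year, y = Station)) +
  geom_tile(aes(fill = y)) +
  scale_fill_gradientn("Annual\ntotal",
                       colours = rev(RColorBrewer::brewer.pal(9, "Spectral")))
print(p)
```

With 5 years and 5 stations the aggregated table has one row per tile, which is what makes the heatmap cheap to draw compared with plotting every daily point.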

Note the use of data.table. Your dataset is fairly large, mainly because of the number of columns (35,000 rows on its own is not all that large). In this situation data.table will speed up processing substantially, especially fread(...), which is much faster than the text import functions in base R.
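
To check the import speed on your own machine, something like the following rough comparison could be used. It is a sketch under assumptions: the table is random numbers written to a temporary file, and its size (2,000 rows by 70 columns) is arbitrary. On a file this small the difference may be negligible; it tends to show up as the file grows.

```r
library(data.table)

set.seed(1)
# Hypothetical wide table of made-up numbers; columns are named V1..V70.
wide <- as.data.table(matrix(runif(2000 * 70), nrow = 2000))
tmp  <- tempfile(fileext = ".csv")
fwrite(wide, tmp)                              # write it out as CSV

t.base  <- system.time(df  <- read.csv(tmp))   # base R import
t.fread <- system.time(dt2 <- fread(tmp))      # data.table import

# Elapsed seconds for each import; the values depend on the machine.
print(c(base = t.base[["elapsed"]], fread = t.fread[["elapsed"]]))

# Both imports should produce a table of the same shape.
stopifnot(identical(dim(df), dim(dt2)))
unlink(tmp)                                    # clean up the temporary file
```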
