Speed up plot() function for large dataset

旧时难觅i 2020-11-29 07:15

I am using plot() for over 1 million data points and it turns out to be very slow.

Is there any way to improve the speed, whether through programming or hardware?

5 Answers
  •  一向 (OP)
     2020-11-29 08:00

    This question was asked at a time when the reticulate package to run Python commands from R didn't yet exist.

    It's now possible to call the highly efficient matplotlib Python library to plot a large dataset.

    matplotlib setup in R is described here.

    Plotting 1 million points with matplotlib takes around 1.5 seconds:

    library(reticulate)
    library(png)
    
    mpl <- import("matplotlib")
    mpl$use("Agg") # Stable non-interactive back-end
    plt <- import("matplotlib.pyplot")
    mpl$rcParams['agg.path.chunksize'] = 0 # Disable error check on too many points
    
    # Generate a point cloud
    a <- rnorm(1E6, 1, 1)
    b <- rnorm(1E6, 1, 1)
    
    system.time({
      plt$figure()
      plt$plot(a, b, '.', markersize = 1)
      # Save figure
      f <- tempfile(fileext = '.png')
      plt$savefig(f)
      # Close figure
      plt$close(plt$gcf())
      # Show image
      img <- readPNG(f)
      grid::grid.raster(img)
      # Delete temporary file
      unlink(f)
    })
    

    #>        User      System       Total 
    #>        1.29        0.15        1.49
    

    Created on 2020-07-26 by the reprex package (v0.3.0)
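
    For reference, the reticulate calls above map one-to-one onto plain matplotlib calls. The following is a minimal Python sketch of the same workflow (it assumes matplotlib and numpy are installed; the seed and temp-file handling are illustrative choices, not part of the original answer):

    ```python
    # Plain-Python equivalent of the reticulate/matplotlib workflow above:
    # Agg back-end, 1 million random points, plot saved to a temporary PNG.
    import tempfile
    import time

    import matplotlib
    matplotlib.use("Agg")  # stable non-interactive back-end, as in the R example
    import matplotlib.pyplot as plt
    import numpy as np

    # Generate a point cloud (seed fixed only for reproducibility of this sketch)
    rng = np.random.default_rng(0)
    a = rng.normal(1, 1, 1_000_000)
    b = rng.normal(1, 1, 1_000_000)

    start = time.perf_counter()
    fig = plt.figure()
    plt.plot(a, b, ".", markersize=1)  # tiny markers keep rendering fast
    out = tempfile.NamedTemporaryFile(suffix=".png", delete=False).name
    plt.savefig(out)
    plt.close(fig)
    print(f"saved {out} in {time.perf_counter() - start:.2f}s")
    ```

    In both versions the speed comes from the same two choices: the non-interactive Agg raster back-end and single-pixel markers, so the renderer never has to lay out a million vector marker glyphs on screen.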
