Download a file from HTTPS using download.file()

庸人自扰 2020-11-28 07:38

I would like to read online data into R using download.file(), as shown below.

URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fs

9 answers
  •  南方客
    南方客 (original poster)
    2020-11-28 08:20

    I'm offering the curl package as an alternative that I found reliable when extracting large files from an online database. In a recent project, I had to download 120 files from an online database and found that it halved the transfer times and was much more reliable than download.file.

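    # install.packages("curl")
    library(curl)
    # install.packages("RCurl")
    library(RCurl)

    URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"

    # 1. RCurl::getURL() reads the response body into memory as a string
    ptm <- proc.time()
    x <- getURL(URL)
    proc.time() - ptm

    # 2. curl::curl_download() writes the file straight to disk
    ptm1 <- proc.time()
    curl_download(url = URL, destfile = "TEST.CSV", quiet = FALSE, mode = "wb")
    proc.time() - ptm1

    # 3. Base download.file() calling the external curl command-line tool
    dir.create("data", showWarnings = FALSE)  # the destfile directory must exist
    ptm2 <- proc.time()
    y <- download.file(URL, destfile = "./data/data.csv", method = "curl")
    proc.time() - ptm2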
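
    The timings above can also be taken with system.time(), which avoids the manual proc.time() bookkeeping. This is only a minimal sketch: time_it is a hypothetical helper name introduced here for illustration, and the commented usage assumes the same URL, network access, and the curl package.

    ```r
    # Hypothetical helper (not part of any package): run a zero-argument
    # function and return the elapsed wall-clock time in seconds.
    time_it <- function(f) {
      unname(system.time(f())[["elapsed"]])
    }

    # Usage (needs network access and the curl package, so it is commented out):
    # URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"
    # time_it(function() curl::curl_download(URL, destfile = "TEST.CSV"))
    # time_it(function() download.file(URL, destfile = "TEST.CSV", method = "curl"))
    ```

    Wrapping each call in a function keeps the three methods comparable, although a single run per method is still only a rough measurement.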
    

    In this case, rough timing on your URL showed no consistent difference in transfer times. In my application, using curl_download in a script to select and download 120 files from a website cut my transfer times from 2000 seconds per file to 1000 seconds and improved reliability from a 50% success rate to 2 failures in 120 files. The script is posted in my answer to a question I asked earlier.
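
    For a batch of many files, one way to handle the occasional failed transfer is to retry each download a few times and record which files succeeded. The following is a rough sketch and not the script mentioned above: download_with_retry and download_all are hypothetical helper names, and max_tries, wait, and the injectable fetch argument are assumptions made for illustration. By default it calls curl::curl_download, so the curl package must be installed.

    ```r
    # Hypothetical helper: try one download up to `max_tries` times, pausing
    # `wait` seconds between attempts. Returns TRUE on success, FALSE otherwise.
    download_with_retry <- function(url, destfile, max_tries = 3, wait = 2,
                                    fetch = curl::curl_download) {
      for (attempt in seq_len(max_tries)) {
        ok <- tryCatch({
          fetch(url, destfile, mode = "wb")
          TRUE
        }, error = function(e) {
          message("Attempt ", attempt, " failed for ", url, ": ",
                  conditionMessage(e))
          FALSE
        })
        if (ok) return(TRUE)
        if (attempt < max_tries) Sys.sleep(wait)
      }
      FALSE
    }

    # Hypothetical helper: download every URL to its matching destination and
    # return a data frame with one row per file and a success flag.
    download_all <- function(urls, destfiles, ...) {
      stopifnot(length(urls) == length(destfiles))
      ok <- vapply(seq_along(urls), function(i) {
        download_with_retry(urls[i], destfiles[i], ...)
      }, logical(1))
      data.frame(url = urls, destfile = destfiles, ok = ok,
                 stringsAsFactors = FALSE)
    }

    # Usage (needs network access, so it is commented out):
    # urls <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"
    # res <- download_all(urls, "TEST.CSV")
    # res[!res$ok, ]   # files that still failed after all retries
    ```

    Returning a success flag per file, rather than stopping at the first error, lets a long batch run to completion and makes it easy to rerun only the failures.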
