Download a file from HTTPS using download.file()

庸人自扰 2020-11-28 07:38

I would like to read online data into R using download.file(), as shown below.

URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fs

9 answers
  • 2020-11-28 08:10

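    You can set the download method globally and then retry:

    options(download.file.method = "curl")
    # Leave `method` unset so the option above is used; an explicit
    # method = "auto" would override it.
    download.file(URL, destfile = "./data/data.csv")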
    

    For background on this issue, see https://stat.ethz.ch/pipermail/bioconductor/2011-February/037723.html
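
To make the option-based approach a little more portable, here is a minimal sketch that picks a download method based on what the local R installation offers. The helper name `pick_download_method` is hypothetical (it is not part of R or of the answer above), and the preference order is an assumption rather than a recommendation from the R documentation.

```r
# Hypothetical helper: choose a download method likely to handle https.
pick_download_method <- function() {
  if (isTRUE(unname(capabilities("libcurl")))) {
    "libcurl"                       # R was built with libcurl support
  } else if (nzchar(Sys.which("curl"))) {
    "curl"                          # external curl binary found on the PATH
  } else if (nzchar(Sys.which("wget"))) {
    "wget"                          # external wget binary found on the PATH
  } else {
    "auto"                          # fall back to R's own default choice
  }
}

# Store the choice globally; download.file() uses it when `method` is omitted.
method <- pick_download_method()
options(download.file.method = method)
```

After this runs, a plain `download.file(URL, destfile = "./data/data.csv")` would use the selected method, provided the `./data` directory already exists.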

  • 2020-11-28 08:12

    I succeeded with the following code:

    url = "http://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"
    x = read.csv(file=url)
    

    Note that I changed the protocol from https to http, since https did not seem to be supported by the default download method in my R installation.
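
If switching to http is not an option, another possibility is to download the file to a temporary location first and parse it afterwards. The sketch below assumes that the download method configured on your machine can handle https (for example, an R build with libcurl). The helper name `read_remote_csv` is hypothetical.

```r
# Hypothetical helper: download a CSV to a temporary file, then parse it.
read_remote_csv <- function(url, ...) {
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp), add = TRUE)   # remove the temporary file afterwards
  # mode = "wb" avoids line-ending translation on Windows
  status <- download.file(url, destfile = tmp, mode = "wb", quiet = TRUE)
  if (status != 0) stop("download failed with status ", status)
  read.csv(tmp, ...)                 # extra arguments are passed to read.csv()
}

# Usage (requires network access, so it is left commented out):
# x <- read_remote_csv("https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv")
```

Separating the download from the parsing also makes it easier to see which of the two steps fails.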

  • 2020-11-28 08:20

    I'm offering the curl package as an alternative that I found to be reliable when extracting large files from an online database. In a recent project, I had to download 120 files from an online database and found that it halved the transfer times and was much more reliable than download.file.

    # install.packages("curl")
    library(curl)
    # install.packages("RCurl")
    library(RCurl)
    
    URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"
    
    # RCurl: fetch the file contents into a character string
    ptm <- proc.time()
    x <- getURL(URL)
    proc.time() - ptm   # elapsed time
    
    # curl: download straight to disk
    ptm1 <- proc.time()
    curl_download(url = URL, destfile = "TEST.CSV", quiet = FALSE, mode = "wb")
    proc.time() - ptm1
    
    # base R: download.file() via the external curl binary (./data must exist)
    ptm2 <- proc.time()
    y <- download.file(URL, destfile = "./data/data.csv", method = "curl")
    proc.time() - ptm2
    

    In this case, rough timing on your URL showed no consistent difference in transfer times. In my application, using curl_download in a script to select and download 120 files from a website cut my transfer times from 2000 seconds per file to 1000 seconds and improved reliability from a 50% success rate to only 2 failures in 120 files. The script is posted in my answer to a question I asked earlier.
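
For batch downloads like the 120-file job described above, a simple retry wrapper is one way to cope with occasional failures. This is a minimal sketch, not the script mentioned in the answer: `download_with_retry` and its parameters are hypothetical, and it only treats an R error as a failure (some `download.file` methods report failure through a non-zero return value instead).

```r
# Hypothetical helper: retry a download a few times before giving up.
# `fetch` is any function taking (url, destfile, ...), e.g. utils::download.file
# or curl::curl_download.
download_with_retry <- function(url, destfile, attempts = 3, wait = 1,
                                fetch = utils::download.file, ...) {
  for (i in seq_len(attempts)) {
    ok <- tryCatch({
      fetch(url, destfile, ...)
      TRUE
    }, error = function(e) {
      message("attempt ", i, " failed: ", conditionMessage(e))
      FALSE
    })
    if (ok) return(invisible(TRUE))
    if (i < attempts) Sys.sleep(wait)  # pause before the next attempt
  }
  invisible(FALSE)                     # every attempt failed
}

# Usage (requires network access and the curl package, so left commented out):
# download_with_retry(URL, "TEST.CSV", fetch = curl::curl_download)
```

Passing the download function as an argument keeps the retry logic independent of whether `download.file` or `curl_download` does the actual transfer.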
