I would like to read online data into R using download.file(),
as shown below.
URL <- \"https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fs
I'd like to offer the curl package as an alternative that I have found reliable for extracting large files from an online database. In a recent project I had to download 120 files from an online database, and curl roughly halved the transfer times and was much more reliable than download.file().
# install.packages("curl")
library(curl)
# install.packages("RCurl")
library(RCurl)

URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"

# 1. RCurl::getURL() reads the file contents into memory as a string
ptm <- proc.time()
x <- getURL(URL)
proc.time() - ptm

# 2. curl::curl_download() streams the file straight to disk
ptm1 <- proc.time()
curl_download(url = URL, destfile = "TEST.CSV", quiet = FALSE, mode = "wb")
proc.time() - ptm1

# 3. base download.file() with the curl method; the target directory
# must exist first, or the call fails
if (!dir.exists("./data")) dir.create("./data")
ptm2 <- proc.time()
y <- download.file(URL, destfile = "./data/data.csv", method = "curl")
proc.time() - ptm2
In this case, rough timing on your URL showed no consistent difference in transfer times between the three approaches. In my application, using curl_download() in a script to select and download 120 files from a website cut my transfer times from about 2000 seconds per file to 1000 seconds, and improved reliability from roughly a 50% failure rate to 2 failures in 120 files. The script is posted in my answer to an earlier question I asked.
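For illustration, here is a minimal sketch of that kind of batch loop. The urls vector, destination folder, and retry count are placeholders I made up for this example, not the original script:

library(curl)

# Hypothetical list of files to fetch; the real script built its
# list from the database's index
urls <- c("https://example.com/file1.csv",
          "https://example.com/file2.csv")

dir.create("data", showWarnings = FALSE)

for (u in urls) {
  dest <- file.path("data", basename(u))
  # curl_download() signals an error on a failed transfer, so wrap it
  # in tryCatch() and retry up to 3 times before moving on
  for (attempt in 1:3) {
    ok <- tryCatch({
      curl_download(url = u, destfile = dest, quiet = FALSE, mode = "wb")
      TRUE
    }, error = function(e) FALSE)
    if (ok) break
  }
}

Wrapping each transfer in tryCatch() keeps one failed download from aborting the whole batch, which is the kind of failure handling that matters when pulling a hundred-plus files in one run.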