Downloading a file from the internet using R is easy and has been addressed previously.
My question regards how to get past a popup message that seems to prevent my download from executing. Specifically,
download.file(url = "https://www.chicagofed.org/applications/bhc_data/bhcdata_index.cfm?DYR=2012&DQIR=4", destfile = "data/test.zip")
gives me a little file of garbage instead of the desired 18-megabyte file that you would get if you went to the website and entered the year 2012 and the quarter 4 manually. I suspect that the issue is that, as can be seen when you do it manually, a popup window interrupts the download process, asking whether to save the file or open it. Is there any way to get past the popup automatically (i.e., via download.file)?
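A quick way to confirm that the saved file is not a real archive is to look at its first bytes: a non-empty ZIP file starts with the signature PK\x03\x04, whereas an HTML page returned in its place usually starts with "<". A minimal sketch; is_zip is a hypothetical helper name, not part of base R:

```r
# Hypothetical helper: TRUE if the file starts with the ZIP
# local-file-header signature (bytes 50 4B 03 04).
is_zip <- function(path) {
  if (!file.exists(path)) return(FALSE)
  con <- file(path, open = "rb")
  on.exit(close(con))
  magic <- readBin(con, what = "raw", n = 4L)
  identical(magic, as.raw(c(0x50, 0x4b, 0x03, 0x04)))
}

# Usage (path taken from the download.file call above):
# is_zip("data/test.zip")
```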
This can be done with Selenium; see the RSelenium package at https://github.com/ropensci/RSelenium.
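library(wdman)      # provides selenium() to start a Selenium server
library(RSelenium)  # provides makeFirefoxProfile() and remoteDriver()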
selPort <- 4444L
fprof <- makeFirefoxProfile(list(browser.download.dir = "C:\\temp"
, browser.download.folderList = 2L
, browser.download.manager.showWhenStarting = FALSE
, browser.helperApps.neverAsk.saveToDisk = "application/zip"))
selServ <- selenium(port = selPort)
remDr <- remoteDriver(extraCapabilities = fprof, port = selPort)
remDr$open(silent = TRUE)
remDr$navigate("https://www.chicagofed.org/applications/bhc_data/bhcdata_index.cfm")
# click year 2012
webElem <- remDr$findElement("name", "SelectedYear")
webElems <- webElem$findChildElements("css selector", "option")
webElems[[which(sapply(webElems, function(x){x$getElementText()}) == "2012" )]]$clickElement()
# click required quarter
webElem <- remDr$findElement("name", "SelectedQuarter")
Sys.sleep(1)
webElems <- webElem$findChildElements("css selector", "option")
webElems[[which(sapply(webElems, function(x){x$getElementText()}) == "4th Quarter" )]]$clickElement()
# click button
webElem <- remDr$findElement("id", "downloadDataFile")
webElem$clickElement()
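The click only starts the download, so the script may need to wait before using the file. A rough sketch of one way to do that, assuming the download directory holds no older .zip files and that Firefox keeps a ".part" temporary file next to the target while the transfer is running; wait_for_download is a hypothetical helper, not part of RSelenium:

```r
# Hypothetical helper: poll `dir` until a .zip file is present and no
# ".part" temporary file remains, or stop after `timeout` seconds.
wait_for_download <- function(dir, timeout = 120, interval = 1) {
  deadline <- Sys.time() + timeout
  repeat {
    partial <- list.files(dir, pattern = "\\.part$")
    done <- list.files(dir, pattern = "\\.zip$", full.names = TRUE)
    if (length(done) > 0 && length(partial) == 0) return(done)
    if (Sys.time() > deadline) stop("download did not finish in time")
    Sys.sleep(interval)
  }
}

# Usage, assuming the objects created in the script above:
# zip_path <- wait_for_download("C:\\temp")
# remDr$close()    # end the browser session
# selServ$stop()   # shut down the Selenium server started by wdman
```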
Please install the Firebug add-on in Firefox and watch what happens when you visit the page and configure the web request. IMO the request for 2013 1st Quarter is much more complex and needs a detailed analysis. It uses cookies and starts some scripting actions...
Source: https://stackoverflow.com/questions/21944016/download-file-from-internet-via-r-despite-the-popup