Can't download data from Yahoo Finance using Quantmod in R

最后都变了- 提交于 2019-11-26 22:25:07

The price history csv URL's appear to have changed

Old https://chart.finance.yahoo.com/table.csv?s=AAPL&a=2&b=17&c=2017&d=3&e=17&f=2017&g=d&ignore=.csv

New: https://query1.finance.yahoo.com/v7/finance/download/AAPL?period1=1492438581&period2=1495030581&interval=1d&events=history&crumb=XXXXXXX

The new version appends a "crumb" field which appears to reflect cookie information in the user's browser. It seems they are intentionally blocking automated downloads of price histories and forcing queries to provide information to validate cookies in a web browser

The fix is detailed at https://github.com/joshuaulrich/quantmod/issues/157

Essentialy

remotes::install_github("joshuaulrich/quantmod", ref="157_yahoo_502")
# or
devtools::install_github("joshuaulrich/quantmod", ref="157_yahoo_502")

Version 0.4-9 of quantmod fixes this issue, and is now available on CRAN.

flemingcra

I've always wondered why Yahoo was so nice as to provide data downloads and how screwed I would be if they stopped doing it. Fortunately, help is on the way courtesy Joshua Ulrich.

Superfluous as it now may be, I coded a fix that shows one approach to get around the download problem.

library(xts)
getSymbols.yahoo.fix <- function (symbol, 
                                  from       = "2007-01-01", 
                                  to         = Sys.Date(), 
                                  period     = c("daily","weekly","monthly"),
                                  envir      = globalenv(),
                                  crumb      = "YourCrumb",
                                  DLdir      = "~/Downloads/") { #1
     # build yahoo query
     query1    <- paste("https://query1.finance.yahoo.com/v7/finance/download/",symbol,"?",sep="")
     fromPosix <- as.numeric(as.POSIXlt(from))
     toPosix   <- as.numeric(as.POSIXlt(to))
     query2    <- paste("period1=", fromPosix, "&period2=", toPosix, sep = "")
     interval  <- switch(period[1], daily = "1d", weekly = "1wk", monthly = "1mo")
     query3    <- paste("&interval=", interval, "&events=history&crumb=", crumb, sep = "")
     yahooURL  <- paste(query1, query2, query3, sep = "")
     #' requires browser to be open
     utils::browseURL("https://www.google.com")
     #' run the query - downloads the security as a csv file
     #' DLdir defaults to download directory in browser preferences
     utils::browseURL(yahooURL)
     #' wait 500 msec for download to complete - mileage may vary
     Sys.sleep(time = 0.5)
     yahooCSV  <- paste(DLdir, symbol, ".csv", sep = "")
     yahooDF   <- utils::read.csv(yahooCSV, header = TRUE)
     #' ------- 
     #' if you get: Error in file(file, "rt") : cannot open the connection
     #' it's because the csv file has not completed downloading
     #' try increasing the time for Sys.sleep(time = x)
     #' ------- 
     #' delete the csv file
     file.remove(yahooCSV)
     # convert date as character to date format
     yahooDF$Date <- as.Date(yahooDF$Date)
     # convert to xts
     yahoo.xts    <- xts(yahooDF[,-1],order.by=yahooDF$Date)
     # assign the xts file to the specified environment
     # default is globalenv()
     assign(symbol, yahoo.xts, envir = as.environment(envir))
     print(symbol)
} #1

It works like this:

  • Go to https://finance.yahoo.com/quote/AAPL/history?p=AAPL
  • Right click on "download data" and copy the link
  • Copy the crumb after "&crumb=" and use it in the function call
  • Set DLdir to the default download directory in your browser preferences
  • Set envir = as.environment("yourEnvir") - defaults to globalenv()
  • After downloading, the csv file is removed from your download directory to avoid clutter
  • Note that this will leave an "untitled" window open in the browser
  • As a simple test: getSymbols.yahoo.fix("AAPL")
  • -

You can also use getSymbols.yahoo.fix with lapply to get a list of asset data

from       <- "2016-04-01"
to         <- Sys.Date()
period     <- "daily"
envir      <- globalenv()
crumb      <- "yourCrumb"
DLdir      <- "~/Downloads/"
assetList  <- c("AAPL", "ADBE", "AMAT")
lapply(assetList, getSymbols.yahoo.fix, from, to, envir = globalenv(), crumb = crumb, DLdir)}

Coded in RStudio on Mac OSX 10.11 using Safari as my default browser. It also appears to work with Chrome, but you will need to use the cookie crumb for Chrome. I use a cookie blocker but had to whitelist finance.yahoo.com to retain the cookie for future browser sessions.

getSymbols.yahoo.fix might be useful. qauantmod::getSymbols of necessity, has more code built in for options and exception-handling. I'm coding for personal work, so I often lift those pieces of code I need from package functions. I haven't benchmarked getSymbols.yahoo.fix because, of course, I don't have a working version of GetSymbol for comparison. Besides, I couldn't pass up the opportunity to enter my first stackoverflow answer.

I too am encountering this error. A user on mrexcel fourm (jonathanwang003) explains that the new URL uses Unix Timecoding for dates. The updated VBA code would look something like this:

qurl = "https://query1.finance.yahoo.com/v7/finance/download/" & Symbol
qurl = qurl & "?period1=" & (StartDate - DateSerial(1970, 1, 1)) * 86400 & _
       "&period2=" & (EndDate - DateSerial(1970, 1, 1)) * 86400 & _
       "&interval=1d&events=history&crumb=" & **Crumb**

QueryQuote:
With Sheets(Symbol).QueryTables.Add(Connection:="URL;" & qurl, Destination:=Sheets(Symbol).Range("a1"))
    .BackgroundQuery = True
    .TablesOnlyFromHTML = False
    .Refresh BackgroundQuery:=False
    .SaveData = True
End With

The missing piece here is how to retrieve the "Crumb" field that contains cookie information from the browser. Anyone have any ideas. I found this post, which may help: https://www.mrexcel.com/forum/excel-questions/1001259-when-using-querytables-what-posttext-syntax-click-button-webpage.html (look at last post by john_w).

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!