Yahoo login using rvest

Submitted by 帅比萌擦擦* on 2020-01-01 07:07:14

Question


Recently, Yahoo changed its authentication mechanism to a two-step one. Now, when I log in to a Yahoo site, I enter my username, and it then asks me to open my Yahoo mobile app to get a code. Alternatively, you can have Yahoo email or text you another way in. As a result, code that used to log in to Yahoo sites programmatically no longer works: the code below just redirects back to the login form. I've tried with and without a user agent string, and with and without countrycode=1 in the form values. I'm fine with entering a code after looking at my mobile app, but the session is never forwarded to the page where that code is entered. How do we log in to Yahoo these days using R?

url <- "http://mail.yahoo.com"
uastring <- "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36"

s <- rvest::html_session(url, httr::user_agent(uastring))
s_form <- rvest::html_form(s)[[1]]
filled_form <- rvest::set_values(s_form, username="myusername", 
                                 passwd="mypassword")
out <- rvest::submit_form(session=s, filled_form, submit="signin",
                          httr::add_headers("Content-Length"=0))

Answer 1:


Okay, I've stumbled upon the answer here. I was adding httr::add_headers("Content-Length"=0) in response to a warning that rvest throws: Warning message: In request_POST(session, url = url, body = request$values, encode = request$encode, : Length Required (HTTP 411).

As it turns out, everything works fine despite the warning, and the login actually fails if I add the Content-Length header. So my code to log in to Yahoo ends up looking like this:

  # The wrapper name yahoo_login is illustrative; it is only there so that
  # return(NULL) has a function to return from. The steps are unchanged.
  yahoo_login <- function(username, league_id) {
    uastring <- "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36"
    url <- paste0("http://football.fantasysports.yahoo.com/f1/", league_id)

    # Step 1: open the page and submit the username only (no password).
    s <- rvest::html_session(url, httr::user_agent(uastring))
    myform <- rvest::html_form(s)[[1]]
    myform <- rvest::set_values(myform, username=username)
    # The HTTP 411 warning is expected here; do not add a Content-Length header.
    s <- suppressWarnings(rvest::submit_form(s, myform, submit="signin"))
    s <- rvest::jump_to(s, s$response$url)

    # Step 2: the next form should have a "code" field for the app code.
    myform <- rvest::html_form(s)[[1]]
    if(!("code" %in% names(myform$fields))){
      print("Unable to log in")
      return(NULL)
    }
    code <- readline(prompt="In your Yahoo app, find and click on the Account Key icon.\nGet the 8 character code and\nenter it here: ")
    myform <- rvest::set_values(myform, code=code)
    s <- suppressWarnings(rvest::submit_form(s, myform, submit="verify"))

    # Still being on the verify page means the code was rejected.
    if(grepl("authorize\\/verify", s$url)){
      print("Wrong code entered, unable to log in")
      return(NULL)
    }
    print("Login successful")
    rvest::jump_to(s, s$response$url)
  }

  s <- yahoo_login("some_username@yahoo.com",
                   "some league id to complete the fantasy football url")
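
The check on the URL after submitting the code can be pulled out into a small helper, which makes it easy to try against sample URLs without a live session. This is only a sketch: the name login_rejected and the sample URLs are made up for illustration, and it relies on the same assumption as the code above, namely that remaining on a URL containing "authorize/verify" means the code was not accepted.

```r
# Hypothetical helper isolating the URL test from the login code above.
# Assumption: a URL containing "authorize/verify" means the code was rejected.
login_rejected <- function(url) {
  # fixed = TRUE matches the text literally, so the slash needs no escaping
  grepl("authorize/verify", url, fixed = TRUE)
}

# Made-up URLs, for illustration only
login_rejected("https://login.yahoo.com/account/authorize/verify?step=2")
login_rejected("http://football.fantasysports.yahoo.com/f1/12345")
```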

It's a two-step process: submit your username, then go to your Yahoo app to get the login code. No Yahoo password is needed. I use readline to get the login code. It seems to work well, and I'm able to scrape my fantasy football data after completing the login. It's just very curious that the warning asking for a Content-Length header leads you down a path that doesn't work. By the way, the same situation applies when trying to log in to Google: you have to ignore the warning, and it works fine.
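
Because the code is typed by hand at the readline prompt, a small sanity check before submitting can save a failed round trip. The following is only a sketch: the helper names clean_account_code and read_account_code are made up here, and it assumes the code is exactly 8 letters or digits, which is a guess based on the prompt text rather than anything Yahoo documents.

```r
# Hypothetical helper: strip whitespace and check the shape of the typed code.
# Assumption: a valid code is exactly 8 letters or digits.
clean_account_code <- function(raw) {
  code <- gsub("[[:space:]]", "", raw)   # drop spaces and stray newlines
  if (!grepl("^[[:alnum:]]{8}$", code)) {
    return(NA_character_)
  }
  code
}

# Hypothetical helper: re-prompt until the input looks plausible.
# `ask` defaults to readline but can be replaced by any function that
# takes a prompt and returns a string, which helps when testing.
read_account_code <- function(ask = readline, max_tries = 3) {
  for (i in seq_len(max_tries)) {
    code <- clean_account_code(ask("Enter the 8 character code: "))
    if (!is.na(code)) {
      return(code)
    }
    message("That does not look like an 8 character code, try again.")
  }
  NULL   # give up after max_tries bad inputs
}
```

The result of read_account_code() could then be passed as the code value to rvest::set_values in place of the bare readline call; a NULL result means no plausible code was entered.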



Source: https://stackoverflow.com/questions/36269476/yahoo-login-using-rvest
