httr

posting data using xml with R

孤街浪徒 submitted on 2020-02-23 08:30:06
Question: I want to post XML with R. The code in Python is:

    import urllib2
    url = 'http://www.rcsb.org/pdb/rest/search'
    queryText = """
    <?xml version="1.0" encoding="UTF-8"?>
    <orgPdbQuery>
    <version>B0907</version>
    <queryType>org.pdb.query.simple.ExpTypeQuery</queryType>
    <description>Experimental Method Search : Experimental Method=SOLID-STATE NMR</description>
    <mvStructure.expMethod.value>SOLID-STATE NMR</mvStructure.expMethod.value>
    </orgPdbQuery>
    """
    print "query:\n", queryText
    print "querying PDB...\n
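As a hedged illustration of the request described in the question, the sketch below sends the same XML text as a raw request body with httr. The helper name `post_xml` is hypothetical, and the content type is an assumption chosen to mirror what Python's urllib2 sends by default when given a data payload. The endpoint in the question may no longer respond, so treat the URL as illustrative.

```r
library(httr)

# XML query text taken from the question, assembled line by line.
query_text <- paste(
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<orgPdbQuery>',
  '<version>B0907</version>',
  '<queryType>org.pdb.query.simple.ExpTypeQuery</queryType>',
  '<description>Experimental Method Search : Experimental Method=SOLID-STATE NMR</description>',
  '<mvStructure.expMethod.value>SOLID-STATE NMR</mvStructure.expMethod.value>',
  '</orgPdbQuery>',
  sep = "\n"
)

# Hypothetical helper: POST a length-one character string unchanged as the body.
# The content type is an assumption; a server may expect "application/xml" instead.
post_xml <- function(url, xml_text) {
  POST(url,
       body = xml_text,
       content_type("application/x-www-form-urlencoded"))
}
```

Calling `post_xml("http://www.rcsb.org/pdb/rest/search", query_text)` would return an httr response object whose body can be read with `content(resp, as = "text")`.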

Web-Scraping with Login and Redirect using R and rvest/httr

江枫思渺然 submitted on 2020-02-23 05:44:11
Question: I would like to scrape information from a webpage. There is a login screen, and once I am logged in, I can access all kinds of pages from which I would like to scrape information (such as the last name of a player, the object .lastName). I am using R and the packages rvest and httr. Somehow, the login seems to work, but I am clueless about how to be redirected to the page I need to get the info from. The login form can be accessed at http://kickbase.sky.de/anmelden and the relevant pages have

How to convert an HTML R object to character?

谁说胖子不能爱 submitted on 2020-01-23 02:36:12
Question: Here's my reproducible example:

    library(rvest)
    page <- html("http://google.com")
    class(page)
    page
    > as.character(page)
    Error in as.vector(x, "character") : cannot coerce type 'externalptr' to vector of type 'character'

How can I convert page from an html class to a character vector so I can store it somewhere? The html functions like html_text or html_attr don't give me the whole source. I would like to store it so I can later re-load it with html(). Thanks.

Answer 1: To save directly to a text
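For the conversion asked about above, here is a minimal sketch that assumes a current xml2 installation, where `read_html()` replaces the deprecated `html()` and `as.character()` works on a parsed document. It is not necessarily the approach the truncated answer takes. The inline HTML string is a stand-in for a downloaded page so the example runs offline.

```r
library(xml2)

# Parse a small literal document (a stand-in for a fetched page).
page <- read_html("<html><body><p>Hello</p></body></html>")

# Serialise the whole document to a single character string.
page_text <- as.character(page)

# Re-load the stored string later by parsing it again.
reloaded <- read_html(page_text)
```

To store the source on disk instead, `write_xml(page, "page.html")` followed later by `read_html("page.html")` should behave similarly.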

Handling error response to empty webpage from read_html

偶尔善良 submitted on 2020-01-13 05:32:10
Question: I am trying to scrape a web page title but am running into a problem with a website called "tweg.com":

    library(httr)
    library(rvest)
    page.url <- "tweg.com"
    page.get <- GET(page.url)  # from httr
    pg <- read_html(page.get)  # from rvest
    page.title <- html_nodes(pg, "title") %>% html_text()  # from rvest

read_html stops with an error message: "Error: Failed to parse text". Looking into page.get$content, I find that it is empty (raw(0)). Certainly, I can write a simple check to take this into account and avoid
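The "simple check" mentioned in the question could look roughly like the sketch below. The helper name `page_title_or_na` is hypothetical; it takes the raw body of a response (such as `page.get$content`) and returns `NA` rather than raising an error when the body is empty, cannot be parsed, or has no title.

```r
library(xml2)

# Hypothetical helper: return the page title, or NA when none can be extracted.
page_title_or_na <- function(content_raw) {
  # An empty body (raw(0)) cannot be parsed, so bail out early.
  if (length(content_raw) == 0) {
    return(NA_character_)
  }
  # Guard against any other parse failure as well.
  pg <- tryCatch(read_html(content_raw), error = function(e) NULL)
  if (is.null(pg)) {
    return(NA_character_)
  }
  title_node <- xml_find_first(pg, "//title")
  if (inherits(title_node, "xml_missing")) {
    return(NA_character_)
  }
  xml_text(title_node)
}
```

With the objects from the question, the call would be `page_title_or_na(page.get$content)`.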

r language support for AWS DynamoDB [duplicate]

删除回忆录丶 submitted on 2020-01-12 05:43:49
Question: This question already has answers here: AWS dynamodb support for “R” programming language (3 answers). Closed 2 years ago. This is a follow-up / updated question to this one: AWS dynamodb support for "R" programming language. I am looking for examples or documentation on how to read a table from DynamoDB into R. This question pointed me in the right direction: R + httr and EC2 api authentication issues (answered by the great @hadley himself!). It's OK if I have to use httr and then parse a

Scrape “aspx” page with R

你。 submitted on 2020-01-06 01:58:07
Question: Can someone help me or suggest how to scrape the table from this URL: https://www.promet.si/portal/sl/stevci-prometa.aspx. I tried following instructions and using the packages rvest, httr and html, but without any success for this particular site. Thank you.

Answer 1: This ought to help get you started:

    library(RSelenium)
    library(wdman)
    library(seleniumPipes)
    library(rvest)
    library(tidyverse)
    selServ <- selenium(verbose = FALSE)
    selServ$log() # find the port
    remDr <- remoteDr(browserName = "chrome", port

identify the correct CSS selector of a url for an R script

会有一股神秘感。 submitted on 2020-01-05 17:47:11
Question: I am trying to obtain data from a website, and thanks to a helper I could get to the following script:

    require(httr)
    require(rvest)
    res <- httr::POST(url = "http://apps.kew.org/wcsp/advsearch.do",
                      body = list(page = "advancedSearch", AttachmentExist = "", family = "", placeOfPub = "", genus = "Arctodupontia", yearPublished = "", species = "scleroclada", author = "", infraRank = "", infraEpithet = "", selectedLevel = "cont"),
                      encode = "form")
    pg <- content(res, as="parsed")
    lnks <- html_attr

authentication to github private repositories with httr

一曲冷凌霜 submitted on 2020-01-03 09:08:33
Question: I am trying to access a private repository on GitHub using httr. I am able to do so with no problem if I add my GitHub token (stored as an environment variable in GITHUB_TOKEN):

    httr::GET("https://api.github.com/repos/aammd/miniature-meme/releases/assets/2859674",
              httr::write_disk("test.rds", overwrite = TRUE),
              httr::progress("down"),
              httr::add_headers(Authorization = paste("token", Sys.getenv("GITHUB_TOKEN"))))

However, if I try to specify another header, I get an error. In this case, I

415 code using httr and RCurl, but not just curl

僤鯓⒐⒋嵵緔 submitted on 2020-01-02 10:22:34
Question: I'm trying to write a function that handles some of the authentication for Spotify's API. I can get it to work with a fairly simple curl command, but when I try to use httr or RCurl, I get 415 Unsupported Media Type responses. I'm somewhat at a loss at this point. I've gotten POST() and GET() to work with this API already, but this endpoint is not working. Using httr:

    response <- POST('https://accounts.spotify.com/api/token', accept_json(), add_headers('Authorization'=paste('Basic',base64
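The question's code is cut off, so the body it sent is unknown. One plausible cause of a 415 here is that httr encodes a list body as multipart by default, whereas token endpoints of this kind commonly expect a URL-encoded form. The sketch below assumes the client-credentials flow and uses a hypothetical helper name, `request_token`. `authenticate()` builds the Basic authorization header from the two credentials.

```r
library(httr)

# Hypothetical helper, assuming the client-credentials grant.
# encode = "form" sends the body as application/x-www-form-urlencoded
# instead of httr's default multipart encoding for list bodies.
request_token <- function(client_id, client_secret,
                          url = "https://accounts.spotify.com/api/token") {
  POST(url,
       accept_json(),
       authenticate(client_id, client_secret),
       body = list(grant_type = "client_credentials"),
       encode = "form")
}
```

If the request succeeds, `content(request_token(id, secret))` would return the parsed JSON response as a list.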

Yahoo login using rvest

帅比萌擦擦* submitted on 2020-01-01 07:07:14
Question: Recently, Yahoo changed their authentication mechanism to a two-step one. So now, when I log in to a Yahoo site, I put in my username, and then it asks me to open my Yahoo mobile app to give it a code. Alternatively, you can have it email or text you some other way around this. The result is that code that used to log in to Yahoo sites programmatically no longer works. This code just redirects to the login form. I've tried with and without a useragent string and with and without