Posting data to web forms using submit buttons from R

Submitted by 人走茶凉 on 2020-01-13 07:23:08

Question


I would like to post data to a web form from R, and retrieve the result. Is this possible at all?

In particular, I would like to pass a text file to this website http://ionspectra.org/aristo/batchmode/ and retrieve the result.

The post method the website uses is

<form action="../batchreport/" method="post" enctype="multipart/form-data"><div style='display:none'><input type='hidden' name='csrfmiddlewaretoken' value='d9c49e206913e5d8b515bc9705ca2e09' />

First I would like to check the radio button "format" to Tab-delimited:

<input type="radio" name="format" value="tsv" /> Tab-delimited <br/>

Then I would like to upload a given file:

<input type="file" name="batchfile" size="20"><br/>

Then have the submit button clicked:

<input type="submit" value="Ontologize!" />

And finally have the resulting text file be retrieved.

Question is, can this be scripted from R, and if so, using what package? Can it be done using RCurl's postForm perhaps? But if so, what would be the syntax in this case?

Any advice welcome!

cheers, Tom


Answer 1:


This is a little trickier than normal since it's a Django website, and we need to deal with Django's Cross-site Request Forgery protection by generating a CSRF token.

Here's how to do it with httr, using the example file provided on the site, saved locally as example.txt:

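library(httr)
# GET the form page so that the server sets the csrftoken cookie
page <- GET(url='http://ionspectra.org/aristo/batchmode/')
# cookies() returns a data frame (name, value, ...) in current httr versions
ck <- cookies(page)
csrf <- ck$value[ck$name == 'csrftoken']
# POST the file, the output format, and the token as multipart form data
res <- POST(url='http://ionspectra.org/aristo/batchreport/', 
            body=list(batchfile=upload_file('example.txt'),
                      format='tsv',
                      csrfmiddlewaretoken=csrf))
# parse the tab-delimited response text into a data frame
out <- read.delim(text=content(res, as='text'), 
                  stringsAsFactors=FALSE)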

The GET call generates the CSRF token, which is needed for the subsequent POST call.
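
The form markup quoted in the question also carries a token in a hidden csrfmiddlewaretoken input, so the value could in principle be read from the page HTML rather than from the cookie. The following is a minimal base-R sketch of that idea, assuming the markup looks like the snippet in the question; extract_csrf_token is a hypothetical helper name, not part of httr or RCurl.

```r
# Hypothetical helper: pull the csrfmiddlewaretoken value out of form HTML.
# Assumes `html` is a single string and that the name attribute precedes
# the value attribute, as in the snippet from the question.
extract_csrf_token <- function(html) {
  # Capture the value attribute of the hidden input (single or double quotes)
  pattern <- "name=['\"]csrfmiddlewaretoken['\"][^>]*value=['\"]([^'\"]+)['\"]"
  m <- regmatches(html, regexec(pattern, html))[[1]]
  if (length(m) < 2) stop("csrfmiddlewaretoken not found in HTML")
  m[2]
}

# Hidden input copied from the form in the question
html <- "<input type='hidden' name='csrfmiddlewaretoken' value='d9c49e206913e5d8b515bc9705ca2e09' />"
token <- extract_csrf_token(html)
```

With httr, the page HTML would typically come from content(GET(...), as='text'). Whether the server accepts the form token alongside the cookie from the same GET depends on the site's Django version, so treat this as a fallback to try rather than a guaranteed route.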




Answer 2:


This also works and does not require the GET request. Basically, the ionspectra website sets a cookie when you access the web form, and the browser sends the same token back to the server in a hidden form field when you submit. The server then compares the two. The code below spoofs the cookie using set_cookies(csrftoken=...) so that it matches csrfmiddlewaretoken in the body of the POST. As you can see, the token can be just about anything.

library(httr)
# download example dataset and save as file "example.txt"
data <- readLines("http://ionspectra.org/static/aristo/example.txt")
writeLines(data,"example.txt")
# POST request; z$content is the returned content, in raw format
z <- POST(url="http://ionspectra.org/aristo/batchreport/",
       set_cookies(csrftoken="arbitrarytoken"),
       body=list(csrfmiddlewaretoken="arbitrarytoken",
                 format="tsv",
                 filter="filter",
                 submit="Ontologize!",
                 batchfile=upload_file("example.txt")))

df <- read.csv(text=rawToChar(z$content),header=TRUE,sep="\t")
head(df)
#   scan.       title score    ChEBI_ID         ChEBI_Name  N   AUC Est..Precision Correct.
# 1     8 CHEBI:34205 0.781 CHEBI:53156 polychlorobiphenyl 24 0.990              1     True
# 2     8 CHEBI:34205 0.755 CHEBI:35446     chlorobiphenyl 27 0.990              1     True
# 3     8 CHEBI:34205 0.708 CHEBI:22888          biphenyls 38 0.943              1     True
# 4     8 CHEBI:34205 0.698 CHEBI:36686        chloroarene 40 0.966              1     True
# 5     8 CHEBI:34205 0.694 CHEBI:36820      ring assembly 49 0.827              1     True
# 6     8 CHEBI:34205 0.681 CHEBI:50887          haloarene 44 0.955              1     True
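
The cookie-spoofing approach can be wrapped in a reusable function that generates a fresh token for each call. This is a sketch only: make_token and submit_batch are hypothetical names, the 32-character alphanumeric token is an assumption chosen to resemble the token in the question's form, and the form fields are copied from the POST call in this answer.

```r
# Hypothetical helper: build a random alphanumeric token of length n.
make_token <- function(n = 32) {
  paste(sample(c(letters, LETTERS, 0:9), n, replace = TRUE), collapse = "")
}

# Hypothetical wrapper: upload `path` and return the result as a data frame.
# Needs the httr package installed; nothing is sent until it is called.
submit_batch <- function(path, token = make_token()) {
  res <- httr::POST(url = "http://ionspectra.org/aristo/batchreport/",
                    # the cookie and the form field must carry the same token
                    httr::set_cookies(csrftoken = token),
                    body = list(csrfmiddlewaretoken = token,
                                format = "tsv",
                                filter = "filter",
                                submit = "Ontologize!",
                                batchfile = httr::upload_file(path)))
  httr::stop_for_status(res)  # raise an R error on HTTP 4xx/5xx responses
  read.delim(text = httr::content(res, as = "text", encoding = "UTF-8"),
             stringsAsFactors = FALSE)
}
```

A call would look like df <- submit_batch("example.txt"). Checking the status before parsing means a rejected token surfaces as an error instead of being parsed as if it were the result table.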


Source: https://stackoverflow.com/questions/21416542/posting-data-to-web-forms-using-submit-buttons-from-r
