rcurl

Scraping password protected forum in R

☆樱花仙子☆ submitted on 2019-11-29 01:38:50
Question: I have a problem logging in from my script. Despite the many good answers I found on Stack Overflow, none of the solutions worked for me. I am scraping a web forum for my PhD research; its URL is http://forum.axishistory.com. The page I want to scrape is the memberlist, a page that lists the links to all member profiles. The memberlist is only accessible when logged in; if you try to access it without logging in, the site shows the login form instead. The URL of the memberlist

How to URL Encode a Backslash with R/RCurl

半城伤御伤魂 submitted on 2019-11-28 10:29:07
Question: I'm currently trying to encode a string for insertion into a URL. My issue is that this seems to fail when my string contains a backslash. I've tried 4 approaches so far using the URLencode, curlEscape (from RCurl), and curlPercentEncode (from RCurl) functions, but none of them have been successful.
    > URLencode("hello\hello")
    Error: '\h' is an unrecognized escape in character string starting ""hello\h"
    > curlEscape("hello\hello")
    Error: '\h' is an unrecognized escape in character string
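The errors shown in this question come from the R parser rather than from the encoding functions: inside an R string literal a backslash starts an escape sequence, so a literal backslash has to be typed as `\\`. A minimal sketch using base R's `URLencode` (the variable names are illustrative, not from the original post):

```r
# In R source code a literal backslash is written as "\\";
# "hello\hello" is rejected at parse time, before any function runs.
raw_string <- "hello\\hello"

# The string holds one backslash character, so its length is 11.
print(nchar(raw_string))

# utils::URLencode() percent-encodes the backslash as %5C.
encoded <- URLencode(raw_string, reserved = TRUE)
print(encoded)  # "hello%5Chello"
```

The same doubled backslash should apply when passing the string to `curlEscape` or `curlPercentEncode`, since the parse error happens before those functions are called.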

Attempting to download files from SFTP using R

牧云@^-^@ submitted on 2019-11-28 09:54:33
Question: I'm trying to introduce R in the workplace to save some of the time we spend on data churning. Many of the files we receive are sent to us via SFTP because they contain sensitive information. I've looked around on Stack Overflow and Google, but nothing seems to work for me. I tried using the RCurl library, following an example I found online, but it doesn't allow me to include the port (22) as part of the login details.
    library(RCurl)
    protocol <- "sftp"
    server <- "hostname"
    userpwd <- "user:password"
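One option is to put the port into the URL itself rather than into the login details. A minimal sketch, assuming the installed libcurl was built with SFTP support (`RCurl::curlVersion()$protocols` lists what is available); `build_sftp_url`, `fetch_sftp_text`, and the file path are hypothetical names used only for illustration:

```r
# Build an SFTP URL with the port placed after the host name.
# (Hypothetical helper; leading slashes on the path are dropped.)
build_sftp_url <- function(server, path, port = 22L) {
  sprintf("sftp://%s:%d/%s", server, as.integer(port), sub("^/+", "", path))
}

# Fetch a text file over SFTP with RCurl.
# Not called here: it needs RCurl, SFTP support in libcurl, and a real server.
fetch_sftp_text <- function(server, path, userpwd, port = 22L) {
  RCurl::getURL(build_sftp_url(server, path, port), userpwd = userpwd)
}

# "inbound/report.csv" is a made-up path for the example.
url <- build_sftp_url("hostname", "inbound/report.csv")
print(url)  # "sftp://hostname:22/inbound/report.csv"
```

Usage would look like `fetch_sftp_text("hostname", "inbound/report.csv", "user:password")`.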

Oauth with Twitter Streaming API in R (using RCurl)

烂漫一生 submitted on 2019-11-28 07:06:09
I would like to connect to Twitter's Streaming API using RCurl in R, and also be able to filter keywords. However, the new authorization restrictions in Twitter API v1.1 are making RCurl difficult to use. Before, the code could look something like this (taken from this page):
    getURL("https://stream.twitter.com/1/statuses/filter.json", userpwd="Username:Password", cainfo = "cacert.pem", write=my.function, postfields="track=bruins")
But now Twitter's new API requires users to authorize with OAuth. I have a token and secret; I just need to place them in this code for authorization. Thanks!
Simon O'Hanlon: You

How can I screenshot a website using R?

心已入冬 submitted on 2019-11-28 07:01:00
I'm not 100% sure this is possible, but I found a good solution in Ruby and in Python, so I was wondering if something similar might work in R. Basically, given a URL, I want to render that URL, take a screenshot of the rendering as a .png, and save the screenshot to a specified folder. I'd like to do all of this on a headless Linux server. Is my best solution here going to be running system calls to a tool like CutyCapt, or is there an R-based toolset that will help me solve this problem? You can take screenshots using Selenium: library(RSelenium) rD <- rsDriver(browser =

Using RCurl with HTTPS

五迷三道 submitted on 2019-11-28 06:35:44
I tried the following code in R on Windows:
    library(RCurl)
    postForm("https://www.google.com/accounts/ClientLogin/", "email" = "me@gmail.com", "Passwd" = "abcd", "service" = "finance", "source" = "Test-1" )
but got the following error:
    Error in postForm() SSL certificate problem, verify that the CA cert is OK. Details: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
How do I set up RCurl to allow use of HTTPS?
Mischa Vreeburg: You need to install an SSL library. For Windows you can get one here: Download "OpenSSL for Windows" version 0.9.8k Unzip to a temporary
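A different approach from the one in the answer excerpt, and one often used with RCurl, is to point libcurl at a CA bundle explicitly through the `cainfo` option. A minimal sketch, assuming the `cacert.pem` file distributed inside the RCurl package is present on the machine; `find_rcurl_ca_bundle` and `post_with_ca_bundle` are hypothetical helpers:

```r
# Locate the CA bundle that ships with RCurl.
# system.file() returns "" when the package or file is missing.
find_rcurl_ca_bundle <- function() {
  system.file("CurlSSL", "cacert.pem", package = "RCurl")
}

# Submit a form over HTTPS while verifying the server against that bundle.
# Not called here: it needs RCurl and network access.
post_with_ca_bundle <- function(url, ...) {
  ca_bundle <- find_rcurl_ca_bundle()
  if (!nzchar(ca_bundle)) {
    stop("RCurl CA bundle not found; supply a path to a PEM file instead")
  }
  RCurl::postForm(url, ..., .opts = list(cainfo = ca_bundle))
}
```

This keeps certificate verification switched on, which is generally preferable to disabling it.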

Scrape website with R by navigating doPostBack

随声附和 submitted on 2019-11-28 04:32:15
Question: I want to extract a table periodically from the site below. The price list changes when the building block names (BLOK 16 A, BLOK 16 B, BLOK 16 C, ...) are clicked. The URL doesn't change; the page changes by triggering javascript:__doPostBack('ctl00$ContentPlaceHolder1$DataList2$ctl04$lnk_blok',''). I've tried 3 ways after searching Google and Stack Overflow. What I've tried, no. 1: this doesn't trigger the doPostBack event. postForm( "http://www.kentkonut.com.tr/tr/modul/projeler/daire_fiyatlari.aspx?id=44", ctl00
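In ASP.NET WebForms pages, `__doPostBack(target, argument)` typically fills the hidden `__EVENTTARGET` and `__EVENTARGUMENT` fields and submits the form together with the other hidden fields such as `__VIEWSTATE`. A rough sketch of replaying that with RCurl and XML, assuming this site follows the usual pattern (not verified against it); `build_postback_fields` and `do_postback` are hypothetical helpers:

```r
# Combine the page's hidden fields with the postback target.
# hidden_fields: named character vector scraped from <input type="hidden">.
build_postback_fields <- function(hidden_fields, event_target, event_argument = "") {
  fields <- as.list(hidden_fields)
  fields[["__EVENTTARGET"]] <- event_target
  fields[["__EVENTARGUMENT"]] <- event_argument
  fields
}

# Fetch the page, collect hidden inputs, and post them back.
# Not called here: it needs RCurl, XML, and network access.
do_postback <- function(url, event_target) {
  curl <- RCurl::getCurlHandle(cookiefile = "", followlocation = TRUE)
  page <- RCurl::getURL(url, curl = curl)
  doc <- XML::htmlParse(page, asText = TRUE)
  inputs <- XML::getNodeSet(doc, "//input[@type='hidden']")
  field_names <- vapply(inputs, XML::xmlGetAttr, character(1), name = "name", default = "")
  field_values <- vapply(inputs, XML::xmlGetAttr, character(1), name = "value", default = "")
  hidden <- stats::setNames(field_values, field_names)
  fields <- build_postback_fields(hidden, event_target)
  RCurl::postForm(url, .params = fields, curl = curl, style = "POST")
}

# Pure example of the field merge, with made-up hidden values.
example_fields <- build_postback_fields(
  c("__VIEWSTATE" = "abc", "__EVENTTARGET" = ""),
  "ctl00$ContentPlaceHolder1$DataList2$ctl04$lnk_blok"
)
print(names(example_fields))
```

The event target passed to `do_postback` would be the first argument of the `__doPostBack` call quoted in the question.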

SOAP request in R

扶醉桌前 submitted on 2019-11-28 01:51:17
Does anyone know how to formulate the following SOAP request with R?
    POST /API/v201010/AdvertiserService.asmx HTTP/1.1
    Host: advertising.criteo.com
    Content-Type: text/xml; charset=utf-8
    Content-Length: length
    SOAPAction: "https://advertising.criteo.com/API/v201010/clientLogin"
    <?xml version="1.0" encoding="utf-8"?>
    <soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
    <soap:Body>
    <clientLogin xmlns="https://advertising.criteo.com/API/v201010">
    <username>string</username>
    <password
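A request of this shape can be sent with RCurl by setting the two headers and posting the envelope as the body. A minimal sketch, assuming the envelope is already available as a single string; `build_soap_headers` and `send_soap_request` are hypothetical helpers, and the endpoint scheme (http or https) is left to the caller because the excerpt does not state it:

```r
# Headers for a SOAP 1.1 call; the SOAPAction value is sent inside quotes,
# as in the raw request shown in the question.
build_soap_headers <- function(soap_action) {
  c("Content-Type" = "text/xml; charset=utf-8",
    "SOAPAction" = sprintf('"%s"', soap_action))
}

# POST the envelope and return the response body as text.
# Not called here: it needs RCurl and network access.
send_soap_request <- function(url, soap_action, body_xml) {
  reader <- RCurl::basicTextGatherer()
  RCurl::curlPerform(
    url = url,
    httpheader = build_soap_headers(soap_action),
    postfields = body_xml,
    writefunction = reader$update
  )
  reader$value()
}

headers <- build_soap_headers("https://advertising.criteo.com/API/v201010/clientLogin")
print(headers)
```

libcurl computes `Content-Length` from `postfields`, so that header does not need to be set by hand.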

RCurl: HTTP Authentication When Site Responds With HTTP 401 Code Without WWW-Authenticate

若如初见. submitted on 2019-11-27 23:24:24
Question: I'm implementing an R wrapper around PiCloud's REST API using the RCurl package to make HTTP(S) requests to the API server. The API uses Basic HTTP authentication to verify that users have sufficient permissions. The PiCloud documentation gives an example of using the API and authenticating with curl:
    $ curl -u 'key:secret_key' https://api.picloud.com/job/?jids=12
This works perfectly. Translating this to an equivalent RCurl command: getURL("https://api.picloud.com/job/?jids=12", userpwd=
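One workaround commonly suggested for this situation is to tell libcurl to use Basic authentication explicitly through the `httpauth` option instead of waiting for a challenge from the server. A minimal sketch, assuming the value `1L` (libcurl's `CURLAUTH_BASIC` bit); `make_userpwd` and `get_with_basic_auth` are hypothetical helpers:

```r
# Join key and secret in the "user:password" form that libcurl expects.
make_userpwd <- function(key, secret) {
  paste(key, secret, sep = ":")
}

# GET a URL with Basic authentication forced on.
# Not called here: it needs RCurl and network access.
get_with_basic_auth <- function(url, key, secret) {
  RCurl::getURL(url, userpwd = make_userpwd(key, secret), httpauth = 1L)
}

print(make_userpwd("key", "secret_key"))  # "key:secret_key"
```

Usage would look like `get_with_basic_auth("https://api.picloud.com/job/?jids=12", "key", "secret_key")`.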

devtools::install_github() - Ignore SSL cert verification failure

杀马特。学长 韩版系。学妹 submitted on 2019-11-27 19:26:36
I'm trying to get devtools::install_github() working behind my corporate proxy on Windows 7. So far I've had to do the following:
    > library(httr)
    > library(devtools)
    > set_config(use_proxy("123.123.123.123",8080))
    > devtools::install_github("rstudio/ggvis")
    Installing github repo ggvis/master from rstudio
    Downloading master.zip from https://github.com/rstudio/ggvis/archive/master.zip
    Error in function (type, msg, asError = TRUE) : SSL certificate problem, verify that the CA cert is OK. Details: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
Apparently we have
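Since the session above already configures httr with `set_config`, peer verification can be switched off the same way. A minimal sketch, assuming the installed devtools version downloads through httr (newer versions may not) and that disabling verification is acceptable on the network in question; `disable_ssl_verification` is a hypothetical helper:

```r
# Turn off SSL peer verification for subsequent httr requests in this session.
# This removes protection against man-in-the-middle attacks, so it is best
# limited to trusted networks. Not called here: it needs the httr package.
disable_ssl_verification <- function() {
  httr::set_config(httr::config(ssl_verifypeer = 0L))
}
```

After calling `disable_ssl_verification()`, the `devtools::install_github("rstudio/ggvis")` call from the question can be retried; `httr::reset_config()` restores the defaults.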