RSelenium

Check if it's possible to scroll down with RSelenium

不羁岁月 submitted on 2019-12-04 02:09:25
Question: I'm using RSelenium to automatically scroll down a social media website and save posts. Sometimes I reach the bottom of the page and no more posts load, because no more data is available. I want to detect that case so I can stop trying to scroll. How can I tell in RSelenium whether it is possible to continue scrolling? The code below illustrates what I'm trying to do - I think I just need help with the "if" statement. For reference, there is a known solution for doing this in Python.
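The usual Python approach carries over to RSelenium directly: compare `document.body.scrollHeight` before and after each scroll, and stop when it no longer grows. A minimal sketch, assuming a Selenium server is running and `remDr` is an already-open `remoteDriver` session (the setup itself is not shown):

```r
library(RSelenium)

# Assumes remDr is an open remoteDriver session
last_height <- remDr$executeScript("return document.body.scrollHeight;")[[1]]

repeat {
  # Scroll to the bottom of the page
  remDr$executeScript("window.scrollTo(0, document.body.scrollHeight);")
  Sys.sleep(2)  # give the page time to load more posts

  new_height <- remDr$executeScript("return document.body.scrollHeight;")[[1]]
  if (new_height == last_height) {
    # Height did not change, so no new content was loaded -- stop scrolling
    break
  }
  last_height <- new_height
}
```

The two-second pause is a guess at the site's load time; a sturdier version would poll for a "no more data" marker instead of sleeping a fixed interval.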

R How to make a for loop without knowing the length?

落爺英雄遲暮 submitted on 2019-12-04 01:58:03
Question: I am currently scraping shot chart information from this site. To scrape the data I need a for loop over however many shots there are. Right now I find the number of shots by clicking "Team Stats" and reading off the number of field goal attempts, but I would like to write the loop without having to look up the shot count first. What I am currently doing: shotchart <- data.frame(shot=as.vector(0), class=as.vector(0), data_homeaway=as.vector(0), data_period=as.vector(0), player_id=as
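One way to avoid counting shots up front is to select all the shot nodes at once and let the loop size itself from that node set. A sketch with rvest, assuming the shots are list items in the chart markup (the `ul.shot-chart li` selector and `url` variable are placeholders to adapt to the actual page):

```r
library(rvest)

# url: the game page being scraped (assumed already defined)
page  <- read_html(url)
shots <- html_nodes(page, "ul.shot-chart li")   # hypothetical selector for shot markers

# seq_along() sizes the loop to whatever the page contains,
# so the number of shots never has to be known in advance
shotchart <- do.call(rbind, lapply(seq_along(shots), function(i) {
  shot <- shots[[i]]
  data.frame(shot          = i,
             class         = html_attr(shot, "class"),
             data_homeaway = html_attr(shot, "data-homeaway"),
             data_period   = html_attr(shot, "data-period"),
             stringsAsFactors = FALSE)
}))
```

Building the rows with `lapply()` and one `do.call(rbind, ...)` also avoids growing a data frame inside a loop, which gets slow as the row count climbs.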

R Change IP Address programmatically

纵饮孤独 submitted on 2019-12-03 08:02:04
Question: I'm currently changing the user_agent by passing different strings to html_session(). Is there also a way to change your IP address on a timer when scraping a website? Answer 1: You can use a proxy (which changes your IP) via use_proxy as follows: html_session("your-url", use_proxy("proxy-ip", port)) For more details see ?httr::use_proxy. To check whether it is working you can do the following: require(httr) content(GET("https://ifconfig.co/json"), "parsed") content(GET("https://ifconfig.co/json", use_proxy("138.201.63.123", 31288)), "parsed")

How to fill in an online form and get results back in R

本秂侑毒 submitted on 2019-12-03 07:42:15
Question: Has anyone ever filled in a web form remotely from R? I'd like to do some archery statistics in R using my scores. There is a very handy web page that gives you the classification and handicap, http://www.archersmate.co.uk/, which I naturally want to include in my stats sheet. Is it possible to fill in this form remotely and get the results back into R? Otherwise I would have to get all the handicap tables and put them into a database myself. UPDATE: We've narrowed the problem down to
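rvest can fill in and submit an HTML form without driving a browser. A hedged sketch using the pre-1.0 rvest API that was current when this question was asked - the form index and field names below are guesses; inspect `html_form(session)` and `names(form$fields)` to find the real ones:

```r
library(rvest)

session <- html_session("http://www.archersmate.co.uk/")
form    <- html_form(session)[[1]]   # assume the score form is the first on the page

# Field names here are hypothetical -- check names(form$fields) for the real ones
filled  <- set_values(form, round = "Portsmouth", score = 550)
result  <- submit_form(session, filled)

# The classification/handicap should be somewhere in the response; the
# ".result" selector is likewise a placeholder
result %>% html_nodes(".result") %>% html_text()
```

In rvest 1.0 and later, `html_session()`, `set_values()`, and `submit_form()` were renamed `session()`, `html_form_set()`, and `session_submit()`. If the form is built or submitted by JavaScript, this approach will not work and RSelenium is the fallback.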

Scraping data from TripAdvisor using R

流过昼夜 submitted on 2019-12-03 02:59:08
I want to create a crawler that will scrape some data from TripAdvisor. Ideally it will (a) identify the links to all locations to crawl, (b) collect links to all attractions in each location, and (c) collect the destination names, dates, and ratings for all reviews. I'd like to focus on part (a) for now. Here is the website I'm starting off with: http://www.tripadvisor.co.nz/Tourism-g255104-New_Zealand-Vacations.html There is a problem here: the link gives the top 10 destinations to begin with, and if you then click "See more popular destinations" the list expands. It appears as
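Because the list only grows in response to a click, part (a) is a natural fit for RSelenium: keep clicking the expander until it disappears, then harvest the links. A sketch assuming an open `remDr` session; the `.popularCities a` selector is a guess and should be replaced after inspecting the page:

```r
library(RSelenium)

remDr$navigate("http://www.tripadvisor.co.nz/Tourism-g255104-New_Zealand-Vacations.html")

repeat {
  more <- remDr$findElements(using = "partial link text",
                             "See more popular destinations")
  if (length(more) == 0) break   # nothing left to expand
  more[[1]]$clickElement()
  Sys.sleep(1)                   # wait for the list to grow
}

# Collect every destination link now that the list is fully expanded
links <- remDr$findElements(using = "css selector", ".popularCities a")
urls  <- sapply(links, function(el) el$getElementAttribute("href")[[1]])
```

Using `findElements()` (plural) for the expander is deliberate: it returns an empty list rather than throwing an error when the link is gone, which makes a clean loop exit.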

R Change IP Address programmatically

感情迁移 submitted on 2019-12-02 21:30:26
I'm currently changing the user_agent by passing different strings to html_session(). Is there also a way to change your IP address on a timer when scraping a website? You can use a proxy (which changes your IP) via use_proxy as follows: html_session("your-url", use_proxy("proxy-ip", port)) For more details see ?httr::use_proxy. To check whether it is working you can do the following: require(httr) content(GET("https://ifconfig.co/json"), "parsed") content(GET("https://ifconfig.co/json", use_proxy("138.201.63.123", 31288)), "parsed") The first call will return your IP. The second call should return the proxy's IP.
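To get the "on a timer" part as well, the proxy idea can be wrapped in a small helper that cycles through a pool between requests. A sketch - the proxy addresses are placeholders, not working proxies:

```r
library(httr)
library(rvest)

# Hypothetical proxy pool -- replace with proxies you actually control
proxies <- list(use_proxy("138.201.63.123", 31288),
                use_proxy("94.23.58.1", 8080))

scrape_with_rotation <- function(urls, proxies, delay = 5) {
  lapply(seq_along(urls), function(i) {
    prx <- proxies[[(i - 1) %% length(proxies) + 1]]  # cycle through the pool
    Sys.sleep(delay)                                  # the "on a timer" part
    html_session(urls[[i]], prx)
  })
}
```

`html_session()` accepts httr config objects such as `use_proxy()` as extra arguments, so each request goes out through a different proxy. In rvest 1.0+ the function is called `session()`.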

running Rselenium with rsDriver

十年热恋 submitted on 2019-12-02 11:43:37
Question: I'm trying to run RSelenium using the wdman package. library(RSelenium) library(wdman) rd <- rsDriver(verbose = TRUE, browser = 'phantomjs') This gives me an error: [1] "Connecting to remote server" Selenium message:org.openqa.selenium.os.CommandLine.find(Ljava/lang/String;)Ljava/lang/String; Error: Summary: UnknownError Detail: An unknown server-side error occurred while processing the command. class: java.lang.NoSuchMethodError Further Details: run errorDetails method I'm running Linux Mint 18
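A `java.lang.NoSuchMethodError` from the Selenium server usually points to a version mismatch between the server jar wdman downloaded and the driver it is launching. One commonly suggested workaround - not a guaranteed fix, since the compatible combination depends on the machine - is to pin an explicit Selenium server version through `rsDriver()`'s `version` argument:

```r
library(RSelenium)

# "3.0.1" is illustrative -- check binman::list_versions("seleniumserver")
# to see which server versions are cached locally
rd <- rsDriver(browser = "phantomjs",
               version = "3.0.1",
               verbose = TRUE)
remDr <- rd$client
remDr$navigate("http://example.com")
```

If no pinned version helps, running a known-good Selenium server in Docker and connecting with `remoteDriver()` sidesteps wdman's version resolution entirely.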

RSelenium: hangs in navigate to direct pdf download

旧街凉风 submitted on 2019-12-02 04:23:20
I'm using RSelenium via Docker Toolbox for Windows with the selenium/standalone-firefox-debug container - all working fine: docker run -d -v //c/test/://home/seluser/Downloads -p 4445:4444 -p 5901:5900 selenium/standalone-firefox-debug I have set up a Firefox profile to download PDFs directly: fprof <- makeFirefoxProfile(list(browser.startup.homepage = "about:blank" , startup.homepage_override_url = "about:blank" , startup.homepage_welcome_url = "about:blank" , startup.homepage_welcome_url.additional = "about:blank" , browser.download.dir = "/home/seluser/Downloads" , browser.download.folderList = 2L ,
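The hang on `navigate()` typically happens because Firefox still tries to open the PDF in its built-in viewer or show a download dialog. A fuller profile sketch that tells Firefox to save PDFs silently - the MIME list, paths, and Docker Toolbox IP are assumptions to adapt:

```r
library(RSelenium)

fprof <- makeFirefoxProfile(list(
  browser.download.dir                      = "/home/seluser/Downloads",
  browser.download.folderList               = 2L,
  browser.download.manager.showWhenStarting = FALSE,
  # Save these MIME types without asking; extend the list as needed
  browser.helperApps.neverAsk.saveToDisk    = "application/pdf",
  # Stop the built-in pdf.js viewer from intercepting the file
  pdfjs.disabled                            = TRUE
))

remDr <- remoteDriver(remoteServerAddr = "192.168.99.100",  # Docker Toolbox default (assumed)
                      port = 4445L,
                      browserName = "firefox",
                      extraCapabilities = fprof)
```

The two additions that matter most are `browser.helperApps.neverAsk.saveToDisk` and `pdfjs.disabled`; without them the navigation blocks waiting on a viewer or dialog that Selenium cannot see.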

How can I scrape data from a website within a frame using R?

你。 submitted on 2019-12-02 04:20:51
The following link contains the results of the Paris marathon: http://www.schneiderelectricparismarathon.com/us/the-race/results/results-marathon . I want to scrape these results, but the information lies within a frame. I know the basics of scraping with rvest and RSelenium, but I am clueless about how to retrieve the data within such a frame. To give an idea, one of the things I tried was: url = "http://www.schneiderelectricparismarathon.com/us/the-race/results/results-marathon" site = read_html(url) ParisResults = site %>% html_node("iframe") %>% html_table() ParisResults = as.data.frame
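`read_html()` only sees the outer page; an iframe's content lives in a separate document. The usual fix is to pull the frame's `src` attribute and parse that URL instead. A sketch (the `html_node("table")` step assumes the frame document contains a plain HTML table, which may not hold if it is rendered by JavaScript):

```r
library(rvest)
library(xml2)

url  <- "http://www.schneiderelectricparismarathon.com/us/the-race/results/results-marathon"
site <- read_html(url)

# The frame is a separate document: grab its src and parse that instead
frame_src <- site %>% html_node("iframe") %>% html_attr("src")
frame_src <- url_absolute(frame_src, url)   # resolve a relative src against the page URL
frame_doc <- read_html(frame_src)

ParisResults <- frame_doc %>% html_node("table") %>% html_table()
```

If the frame's table is built by JavaScript, rvest will see an empty shell; in that case RSelenium can switch into the frame with `remDr$switchToFrame()` before reading the page source.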

Log In To Website Using RSelenium & phantomjs in R, Multiple Instances Of Class exist

痞子三分冷 submitted on 2019-12-01 21:51:27
Question: I am trying to log in to this page: https://www.optionslam.com/accounts/login/ using the code in this post as a starting point: Scrape password-protected website in R. I have been able to populate the login fields but cannot click the log-in button. If you look at the page source, the class of the login button is "red-button": <input type="submit" value="Log in" class="red-button"/> However, there is another form at the top of the page that uses the same class, and the clickElement()
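When a class matches several elements, `findElement()` returns only the first one, which is why the click lands on the wrong form. Two ways around it, assuming an open `remDr` session (the index in option 1 is a guess; count the matches on the real page):

```r
library(RSelenium)

# Option 1: take all matches and pick by position
buttons <- remDr$findElements(using = "class name", "red-button")
buttons[[2]]$clickElement()   # assumes the login button is the second match

# Option 2 (usually sturdier): match the submit input by its value attribute,
# which is unique to the login form
login <- remDr$findElement(using = "xpath",
                           "//input[@type='submit' and @value='Log in']")
login$clickElement()
```

The XPath route survives page reordering, whereas the positional index breaks as soon as another "red-button" element is added above the login form.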