httr

How to scrape JavaScript rendered Website by R?

谁都会走 submitted on 2021-02-18 17:48:07
Question: Is there a good approach to scraping the website below? https://list.jd.com/list.html?cat=737,794,798&page=1&sort=sort_rank_asc&trans=1&JL=6_0_0#J_main Basically, I want to get the name and price of all products. However, the price info is stored in some jQuery scripts. Is Selenium the only solution? I thought of using V8 / jsonlite, but they do not seem to be applicable. It'd be great if you could offer some alternatives in R. (Access to exe files is blocked on my computer, I

httr POST authentication error

一笑奈何 submitted on 2021-02-11 08:58:25
Question: I am trying to structure a POST JSON request using httr. The API documentation gives the following curl request: curl -X POST -H "Authorization:Token XXXXXXXXX" -H "Content-Type: application/json" --data "{\"texts\":[\"A simple string\"]}" https://api.uclassify.com/v1/uclassify/topics/classify My R httr implementation is the following: POST("https://api.uclassify.com/v1/uClassify/Topics/classify", encode="json", add_headers('Authorization:Token'="XXXXXXXXX"), body=("A simple string"))
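For comparison with the curl command quoted above, here is a minimal sketch of how the same request might be expressed in httr. It assumes the header name is `Authorization` with the value `Token <key>`, and that the body is a named list whose `texts` entry is a JSON array; both are read off the curl command, not confirmed against the API. The helper `classify_topics` and its `token` argument are hypothetical names used only for illustration, and the helper is defined but not called, so nothing is sent over the network.

```r
library(httr)
library(jsonlite)

# Body shaped like the curl --data payload: {"texts":["A simple string"]}.
# The inner unnamed list() keeps "texts" a JSON array even with one element.
payload <- list(texts = list("A simple string"))

# Preview the JSON that encode = "json" should produce (httr serialises the
# body with jsonlite using auto_unbox = TRUE).
payload_json <- toJSON(payload, auto_unbox = TRUE)
print(payload_json)

# Hypothetical helper (not from the original post); `token` is a placeholder
# for the real API key.
classify_topics <- function(token, body = payload) {
  POST(
    "https://api.uclassify.com/v1/uclassify/topics/classify",
    add_headers(Authorization = paste("Token", token)),
    body = body,
    encode = "json"  # also sets Content-Type: application/json
  )
}
```

Calling `classify_topics("XXXXXXXXX")` would send the request; inspecting the result with `status_code()` and `content()` should show whether the authentication header was accepted.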

Cannot GET cookie?

人盡茶涼 submitted on 2021-01-29 09:56:40
Question: If we visit this URL in Chrome with DevTools open, we can clearly see a cookie appear (in Chrome DevTools -> 'Application' -> 'Cookies'). If we attempt the same thing using httr::GET(), we expect to see the cookie, but we do not: library(httr) r <- GET("https://aps.dac.gov.in/LUS/Public/Reports.aspx") r$cookies # [1] domain flag path secure expiration name value # <0 rows> (or 0-length row.names) Why is this, and how can we retrieve the cookie (along with the page html) preferably

Cannot install httr package in R 3.6.2 in Linux Mint 19.3

岁酱吖の submitted on 2021-01-28 21:17:47
Question: I am totally new to R. I tried to install the httr package. I first installed pacman, and then tried to load httr through it by running pacman::p_load(httr). It wasn't successful, and it showed the following message in the terminal: Installing package into ‘/home/|username|/R/x86_64-pc-linux-gnu-library/3.6’ (as ‘lib’ is unspecified) also installing the dependencies ‘curl’, ‘openssl’ trying URL 'https://cloud.r-project.org/src/contrib/curl_4.3.tar.gz' Content type 'application/x-gzip' length