Trouble reaching a CSS node

雨燕双飞 posted on 2019-12-12 00:59:56

Question


From this page: http://www.beta.inegi.org.mx/app/buscador/default.html?q=e15a61a

I'm trying to retrieve this URL: http://www.beta.inegi.org.mx/app/biblioteca/ficha.html?upc=702825720599

I've tried to reach it both through the CSS selector and through the XPath (each copied with right-click in the browser's web developer tab). However, I only get {xml_nodeset (0)}.

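library(rvest)
url <- "http://www.beta.inegi.org.mx/app/buscador/default.html?q=e15a62b"
# Parse the page first: the selector functions need a document, not a URL string
page <- read_html(url)
page %>% html_nodes("#snippet_row-tag_a_0")
page %>% html_nodes(xpath = '//*[@id="snippet_row-tag_a_0"]')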

Answer 1:


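The items you want to scrape are rendered with JavaScript, so they are not present in the HTML that rvest downloads. That is why both selectors come back empty. You can query the hidden API that the page itself calls instead.

Try this URL: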
http://www.beta.inegi.org.mx/app/api/buscador/busquedaTodos/E15A61A_A/RANKING/es

This returns a JSON string, which you can parse into a list in R and then extract the information you want.
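
A minimal sketch of that approach, assuming the jsonlite package is installed. The helper names parse_results and fetch_results are hypothetical, the endpoint is the one quoted above and may no longer respond, and the structure of the response is not documented here, so the sketch only inspects it rather than assuming any field names.

```r
library(jsonlite)

# Endpoint taken from the answer above; it may no longer be reachable.
api_url <- "http://www.beta.inegi.org.mx/app/api/buscador/busquedaTodos/E15A61A_A/RANKING/es"

# Hypothetical helper: parse a JSON string into nested R lists
# (simplifyVector = FALSE keeps arrays as lists instead of vectors/data frames).
parse_results <- function(json_text) {
  fromJSON(json_text, simplifyVector = FALSE)
}

# Hypothetical helper: download the raw JSON text and parse it.
# Returns NULL instead of stopping if the request or the parsing fails.
fetch_results <- function(u) {
  tryCatch(
    parse_results(paste(readLines(u, warn = FALSE), collapse = "\n")),
    error = function(e) NULL
  )
}

results <- fetch_results(api_url)

# Inspect the top two levels to locate the field that holds the link you need.
str(results, max.level = 2)
```

Once str() shows where the record lives, you can index into the list with the usual $ or [[ ]] operators to pull out the value you are after.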



Source: https://stackoverflow.com/questions/50997094/trouble-reaching-a-css-node
