Google Play review scraping changes

Posted by 依然范特西╮ on 2019-12-10 12:18:05

Question


Over the past year or so I have created a number of scripts to scrape Android app reviews from Google Play. Until recently this worked fine: the scripts mimicked the Google Play interface, called https://play.google.com/store/getreviews with the necessary parameters, and parsed the HTML results.

The recent update to the Google Play interface changed the HTML structure, and it also seems to implement some kind of protection against scraping. There is now a "token" parameter that changes, presumably some kind of session ID, and I have not been able to generate it because I'm not sure what seeds it. It also seems to block clients that make multiple calls that don't conform to the interface: after an unsuccessful call I can't even load the Google Play interface in any browser. After a while the block seems to time out. I'm not certain of this, but it's what I've concluded from what I'm seeing.

Has anyone run into a similar problem and found a way around it?

Thanks


Answer 1:


Give this a try: www.scrape4me.com

It does show an error, but it still outputs the content:

http://scrape4me.com/api?url=https%3A%2F%2Fplay.google.com%2Fstore%2Fapps%2Fdetails%3Fid%3Dcom.com2us.golfstarworldtour.normal.freefull.google.global.android.common&elm=&ch=ch
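
If you want to build that kind of URL for other apps, here is a minimal Python sketch. It only reproduces the query string seen in the example above; the helper names are made up for illustration, and what the `elm` and `ch` parameters actually mean is not documented here, so the defaults are simply copied from that URL. The sketch makes no network request, and there is no guarantee the service still responds.

```python
from urllib.parse import urlencode

# Endpoint copied from the example URL above; not verified to still be live.
SCRAPE4ME_API = "http://scrape4me.com/api"


def play_details_url(package_id):
    """Build the Google Play details page URL for an app package ID."""
    return "https://play.google.com/store/apps/details?" + urlencode({"id": package_id})


def build_scrape4me_url(target_url, elm="", ch="ch"):
    """Wrap target_url in a scrape4me API call (hypothetical helper).

    The meaning of `elm` and `ch` is an assumption-free copy: the defaults
    are just the values that appear in the example URL.
    """
    # urlencode percent-encodes the target URL so it survives as one parameter.
    query = urlencode({"url": target_url, "elm": elm, "ch": ch})
    return SCRAPE4ME_API + "?" + query


if __name__ == "__main__":
    package = "com.com2us.golfstarworldtour.normal.freefull.google.global.android.common"
    print(build_scrape4me_url(play_details_url(package)))
```

The important detail is that the Google Play URL has to be percent-encoded before it is passed as the `url` parameter; otherwise its own `?` and `=` characters would be read as part of the outer query string.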


Source: https://stackoverflow.com/questions/18482660/google-play-review-scraping-changes
