python - HTTP Error 503 Service Unavailable

Submitted by 半城伤御伤魂 on 2021-01-27 18:28:47

Question


I am trying to scrape data from Google and LinkedIn, but the requests fail with this error:

*** httperror_seek_wrapper: HTTP Error 503: Service Unavailable

Can someone advise me on how to solve this?


Answer 1:


Google is simply detecting your query as automated. You would need a captcha solver to get unlimited results. The following link might be helpful.

https://support.google.com/websearch/answer/86640?hl=en

Bypassing Captcha using an OCR Engine:

http://www.debasish.in/2012/01/bypass-captcha-using-python-and.html

Simple Approach:

An even simpler approach is to call sleep() between requests and to vary the queries randomly. This makes it less likely that Google will flag the traffic as automated, but the scraper becomes far slower ...
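As a rough illustration of the sleep() idea, here is a minimal sketch that waits a random interval between requests. The helper names `polite_delay` and `fetch_all`, the `fetch` callback, and the 2 to 6 second range are assumptions made for this example, not values from the answer; tune them for your own case.

```python
import random
import time


def polite_delay(min_seconds=2.0, max_seconds=6.0, sleep=time.sleep):
    """Sleep for a random duration in [min_seconds, max_seconds] and return it."""
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay


def fetch_all(urls, fetch, min_seconds=2.0, max_seconds=6.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing a random interval between calls."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:  # no need to wait before the very first request
            polite_delay(min_seconds, max_seconds, sleep)
        results.append(fetch(url))
    return results
```

Here `fetch` is whatever function performs one request in your scraper. A randomized delay avoids the perfectly regular timing of a fixed sleep, though it is no guarantee that the server will not still rate-limit the client.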

Error Handling:

To simply suppress the error message, wrap the request in try and except.
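A minimal sketch of the try/except approach follows. The question does not say which HTTP client is in use, so this example assumes Python 3's standard-library `urllib.request`; the function name `fetch_or_none` and the `opener` parameter are hypothetical and exist only to keep the example self-contained.

```python
import urllib.error
import urllib.request


def fetch_or_none(url, opener=urllib.request.urlopen):
    """Return the response body as bytes, or None if the server answers 503."""
    try:
        response = opener(url)
        try:
            return response.read()
        finally:
            response.close()  # always release the connection
    except urllib.error.HTTPError as err:
        if err.code == 503:
            # The request still failed; we only report it instead of crashing.
            print("503 Service Unavailable for", url)
            return None
        raise  # other HTTP errors are not swallowed
```

Catching the exception only hides the failure; the data for that URL is still missing, so the caller has to decide whether to retry later or skip it.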




Answer 2:


I encountered the same situation and tried using the sleep() function before every request to spread the requests a little. It looked like it was working fine but failed soon enough even with a delay of 2 seconds. What solved it finally was using:

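import contextlib
import urllib.request  # Python 3; in Python 2 the call was urllib.urlopen
with contextlib.closing(urllib.request.urlopen(urlToOpen)) as x:
    data = x.read()  # do stuff with x; it is closed when the block exits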

I did this because I suspected that each request was leaving its connection open and that the connections had to be closed explicitly. Whatever the reason, it worked quite consistently with a delay as short as 0.5 s.
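Putting the two pieces of this answer together, a loop that closes every response and then pauses briefly might look like the sketch below. It is written for Python 3; the function name `fetch_pages` and the `opener` and `sleep` parameters are assumptions added for illustration, and the 0.5 s default simply mirrors the delay mentioned above.

```python
import contextlib
import time
import urllib.request


def fetch_pages(urls, delay_seconds=0.5, opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch each URL in turn, closing every response before the next request."""
    pages = []
    for url in urls:
        # closing() guarantees response.close() runs, even if read() raises.
        with contextlib.closing(opener(url)) as response:
            pages.append(response.read())
        sleep(delay_seconds)  # short pause after each request
    return pages
```

In Python 3 the object returned by `urllib.request.urlopen` is itself a context manager, so `contextlib.closing` is optional there; it is kept here to stay close to the original answer.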



Source: https://stackoverflow.com/questions/25344610/python-http-error-503-service-unavailable
