How to make a script wait within an iteration until the Internet connection is reestablished?

Submitted by 孤者浪人 on 2019-12-09 23:43:44

Question


I have scraping code inside a for loop. It takes several hours to complete, and the program stops when my Internet connection breaks. What I (think I) need is a condition at the start of the scraper that tells Python to keep retrying at that point. I tried to use the answer from here:

for w in wordlist:

#some text processing, works fine, returns 'textresult'

    if textresult == '___':  #if there's nothing in the offline resources
        bufferlist = list()
        str1=str()
        mlist=list()  # I use these in scraping

        br = mechanize.Browser()

        tried=0
        while True:
            try:
                br.open("http://the_site_to_scrape/")

                # scraping, with several ifs. Each 'for w' iteration results with scrape_result string.


            except (mechanize.HTTPError, mechanize.URLError) as e:
                tried += 1
                if isinstance(e,mechanize.HTTPError):
                    print e.code
                else:
                    print e.reason.args
            if tried > 4:
                    exit()
                    time.sleep(120)
                    continue
            break

This works while I'm online. When the connection breaks, Python prints the 403 code, skips that word in wordlist, moves on to the next one and does the same. How can I tell Python to wait for the connection within the iteration?

EDIT: I would appreciate it if you could write at least some of the necessary commands and tell me where they should be placed in my code, because I've never dealt with exception loops.

EDIT - SOLUTION: I applied Abhishek Jebaraj's modified solution. I just added a very simple exception handler:

except:
    print "connection interrupted"
    time.sleep(30)

Also, Jebaraj's getcode command will raise an error. Before r.getcode, I used this:

import urllib

r = urllib.urlopen("http: the site ")

The top answer to this question helped me as well.


Answer 1:


Write another while loop inside the for loop that keeps trying to connect to the Internet.

It exits only when it receives a 200 status code, and then you can continue with the rest of your program.

Something like this:

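import time

retry = True
while retry:
    try:
        r = br.open("http://the_site_to_scrape/")  # your site
        if r.getcode() // 10 == 20:  # status 200-209 counts as success
            retry = False
    except Exception:
        # handle the exception here, e.g. wait before trying again
        time.sleep(30)

# rest of your code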
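
As a rough sketch of the retry idea described above, the loop can be wrapped in a small helper so that each word in the for loop waits at the same spot until the request succeeds. This is only an illustration: fetch_with_retry and its parameters (fetch, delay, max_tries, sleep) are hypothetical names, not part of mechanize or urllib, and the code is written for Python 3 with the standard library only, whereas the question uses Python 2.

```python
import time


def fetch_with_retry(fetch, delay=30, max_tries=None, sleep=time.sleep):
    """Call fetch() until it succeeds, pausing `delay` seconds after each failure.

    fetch     -- zero-argument callable that raises when the request fails
    max_tries -- re-raise after this many failures (None means retry forever)
    sleep     -- injectable, so the loop can be exercised without real waiting
    """
    tries = 0
    while True:
        try:
            # Success: hand the result back and leave the loop.
            return fetch()
        except Exception as exc:  # assumption: narrow this to your library's network errors
            tries += 1
            if max_tries is not None and tries >= max_tries:
                raise
            print("connection interrupted (%s), attempt %d" % (exc, tries))
            # Wait, then fall through to the next iteration of the while loop.
            sleep(delay)
```

Inside the question's loop, the call would presumably look like fetch_with_retry(lambda: br.open("http://the_site_to_scrape/")), placed where br.open is called now. Because the helper returns only on success, the word is never skipped; the script just stays in that iteration until the connection comes back.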


Source: https://stackoverflow.com/questions/42344303/how-to-make-a-script-wait-within-an-iteration-until-the-internet-connection-is-r
