Reading a stream made by urllib2 never recovers when the connection is interrupted

丶灬走出姿态 submitted on 2019-12-10 12:55:02

Question


While trying to make one of my Python applications a bit more robust against connection interruptions, I discovered that calling the read function of an HTTP stream made by urllib2 may block the script forever.

I thought that the read function would time out and eventually raise an exception, but this does not seem to be the case when the connection is interrupted during a read call.

Here is the code that will cause the problem:

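import urllib2

while True:
    try:
        stream = urllib2.urlopen('http://www.google.de/images/nav_logo4.png')
        # Drain the response; read() is the call that can hang forever
        while stream.read():
            pass
        print "Done"
    except Exception:
        # Not a bare except, so Ctrl-C can still stop the loop
        print "Error"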

(If you try out the script, you will probably need to interrupt the connection several times before you reach the state from which the script never recovers.)

I watched the script with Winpdb and took a screenshot of the state from which the script never recovers (even when the network becomes available again).

Winpdb screenshot: http://img10.imageshack.us/img10/6716/urllib2.jpg

Is there a way to create a Python script that continues to work reliably even if the network connection is interrupted? (I would prefer to avoid doing this in an extra thread.)
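
For context on why the read can block forever: as far as I know, Python sockets have no timeout by default (socket.getdefaulttimeout() returns None unless something has set it), and urllib2 falls back to that default when no timeout is given. The sketch below illustrates the difference at the socket level. The local socket pair is only a stand-in for a connection that went silent, and the 0.2-second value is an arbitrary assumption, not something from the original script.

```python
import socket

# None means "block forever" unless setdefaulttimeout() was called earlier
print(socket.getdefaulttimeout())

# A connected pair of local sockets; nothing is ever written to `peer`,
# so `conn` behaves like a connection that went silent mid-transfer.
conn, peer = socket.socketpair()
conn.settimeout(0.2)  # without this line, recv() below would block forever

try:
    conn.recv(1024)
    timed_out = False
except socket.timeout:
    # Raised once the timeout elapses with no data available
    timed_out = True
finally:
    conn.close()
    peer.close()

print(timed_out)
```

With a timeout set, the stalled read turns into an exception that the surrounding loop can handle instead of hanging.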


Answer 1:


Try something like:

import socket

# Default timeout (in seconds) for every socket created after this call
socket.setdefaulttimeout(5.0)
...
try:
    ...
except socket.timeout:
    # it timed out, retry
    pass
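
A slightly fuller sketch of the same idea, in case it helps. Instead of changing the global default, it passes a timeout to urlopen directly (urllib2.urlopen has accepted a timeout argument since Python 2.6) and retries a limited number of times. The names fetch, attempts and chunk_size are made up for this illustration, and the import fallback to urllib.request is only there so the snippet also runs on Python 3. Treat it as a starting point under those assumptions, not a definitive implementation.

```python
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2, as in the question


def fetch(url, timeout=5.0, attempts=3, chunk_size=8192):
    """Return the body of `url`, retrying when the connection stalls or fails.

    `fetch`, `attempts` and `chunk_size` are hypothetical names for this sketch.
    """
    for attempt in range(1, attempts + 1):
        try:
            # The timeout applies to connecting and to each blocking read
            stream = urlopen(url, timeout=timeout)
            try:
                chunks = []
                while True:
                    chunk = stream.read(chunk_size)
                    if not chunk:
                        return b"".join(chunks)
                    chunks.append(chunk)
            finally:
                stream.close()
        except IOError:
            # socket.timeout and URLError are both IOError subclasses;
            # give up only after the last attempt
            if attempt == attempts:
                raise
```

Usage would look like data = fetch('http://www.google.de/images/nav_logo4.png'). Each retry starts the download from scratch, which is fine for a small image but probably not what you want for large files.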



Answer 2:


Good question, I would be really interested in finding an answer. The only workaround I can think of is the signal trick explained in the Python docs. In your case it would look more like this:

import signal
import urllib2

def read(url):
    stream = urllib2.urlopen(url)
    return stream.read()

def handler(signum, frame):
    raise IOError("The page is taking too long to read")

# Set the signal handler and a 5-second alarm (SIGALRM is Unix-only)
signal.signal(signal.SIGALRM, handler)
signal.alarm(5)

# This read() may hang indefinitely
try:
    output = read('http://www.google.de/images/nav_logo4.png')
except IOError:
    # try to read again or print an error
    pass
finally:
    signal.alarm(0)          # Disable the alarm, whatever happened above
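
If you need this in more than one place, the alarm handling could be wrapped in a context manager. This is only a sketch: time_limit and ReadTimeout are names invented here, it works on Unix only, and it must run in the main thread because that is where Python delivers signals. It uses signal.setitimer so that fractional seconds work, and time.sleep stands in for a read that never returns.

```python
import signal
import time
from contextlib import contextmanager


class ReadTimeout(IOError):
    """Hypothetical exception type; subclasses IOError like the handler above."""


@contextmanager
def time_limit(seconds):
    """Raise ReadTimeout if the with-block runs longer than `seconds`."""
    def handler(signum, frame):
        raise ReadTimeout("operation exceeded %s seconds" % seconds)

    previous = signal.signal(signal.SIGALRM, handler)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        # Cancel the timer and restore whatever handler was installed before
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)


try:
    with time_limit(0.2):
        time.sleep(2)  # stand-in for a read() that never returns
    result = "finished"
except ReadTimeout:
    result = "timed out"

print(result)
```

Restoring the previous handler in the finally clause keeps the helper from interfering with other code that also uses SIGALRM.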


Source: https://stackoverflow.com/questions/811446/reading-a-stream-made-by-urllib2-never-recovers-when-connection-got-interrupted
