urllib

How to unquote a urlencoded unicode string in python?

空扰寡人 submitted on 2019-11-26 09:22:49
Question: I have a unicode string like "Tanım" which is encoded as "Tan%u0131m" somehow. How can I convert this encoded string back to the original unicode? Apparently urllib.unquote does not support unicode.
Answer 1: %uXXXX is a non-standard encoding scheme that has been rejected by the W3C, despite the fact that an implementation continues to live on in JavaScript land. The more common technique seems to be to UTF-8 encode the string and then %-escape the resulting bytes using %XX. This scheme is
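A minimal sketch of decoding the non-standard %uXXXX form, assuming Python 3 (where unquote lives in urllib.parse). The helper name unquote_percent_u is hypothetical, not a library function. It first replaces each %uXXXX sequence with the matching code point, then hands the rest to the standard unquote, which covers the UTF-8 plus %XX scheme described above. It does not recombine surrogate pairs, so characters outside the Basic Multilingual Plane would need extra handling.

```python
import re
from urllib.parse import unquote

# Matches the non-standard %uXXXX form (exactly four hex digits).
_PERCENT_U = re.compile(r"%u([0-9a-fA-F]{4})")


def unquote_percent_u(text):
    """Decode %uXXXX escapes, then ordinary %XX escapes (hypothetical helper)."""
    # Step 1: turn each %uXXXX into the character with that code point.
    replaced = _PERCENT_U.sub(lambda m: chr(int(m.group(1), 16)), text)
    # Step 2: let the standard library decode any remaining %XX bytes as UTF-8.
    return unquote(replaced)


decoded = unquote_percent_u("Tan%u0131m")
```

Here decoded holds the original string from the question, "Tanım".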

Python: Get HTTP headers from urllib2.urlopen call?

て烟熏妆下的殇ゞ submitted on 2019-11-26 07:07:14
Question: Does urllib2 fetch the whole page when a urlopen call is made? I'd like to just read the HTTP response header without getting the page. It looks like urllib2 opens the HTTP connection and then subsequently gets the actual HTML page... or does it just start buffering the page with the urlopen call?
    import urllib2
    myurl = 'http://www.kidsidebyside.org/2009/05/come-and-draw-the-circle-of-unity-with-us/'
    page = urllib2.urlopen(myurl)  # open connection, get headers
    html = page.readlines()
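One way to read only the headers is to send a HEAD request, sketched here under the assumption of Python 3, where urllib2 became urllib.request and Request accepts a method argument. The function names build_head_request and fetch_headers are hypothetical, and not every server answers HEAD requests correctly, so treat this as a starting point rather than a guaranteed fix.

```python
import urllib.request


def build_head_request(url):
    """Build a request that asks the server for headers only."""
    return urllib.request.Request(url, method="HEAD")


def fetch_headers(url, timeout=10):
    """Send the HEAD request and return the response headers as a dict.

    Repeated header names collapse to their last value in the dict.
    """
    request = build_head_request(url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items())
```

Calling fetch_headers(myurl) performs the network round trip; build_head_request alone does not touch the network.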

Handling urllib2's timeout? - Python

点点圈 submitted on 2019-11-26 06:20:00
Question: I'm using the timeout parameter within urllib2's urlopen.
    urllib2.urlopen('http://www.example.org', timeout=1)
How do I tell Python that if the timeout expires a custom error should be raised? Any ideas?
Answer 1: There are very few cases where you want to use a bare except:. Doing this captures any exception, which can be hard to debug, and it captures exceptions including SystemExit and KeyboardInterrupt, which can make your program annoying to use. At the very simplest, you would catch
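A sketch of turning a timeout into a custom error while catching only the relevant exceptions, assuming Python 3 (urllib.request in place of urllib2). FetchTimeout and fetch are hypothetical names. A timeout can surface either wrapped in a URLError (typically while connecting) or as a bare socket.timeout (typically while reading the response), so both paths are handled and everything else is left to propagate.

```python
import socket
import urllib.error
import urllib.request


class FetchTimeout(Exception):
    """Hypothetical custom error raised when the request times out."""


def fetch(url, timeout=1.0):
    """Return the response body, raising FetchTimeout if the timeout expires."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except urllib.error.URLError as exc:
        # Timeouts during connection setup arrive wrapped in URLError.
        if isinstance(exc.reason, socket.timeout):
            raise FetchTimeout("timed out fetching %s" % url) from exc
        raise  # any other URL error is not ours to reinterpret
    except socket.timeout as exc:
        # Timeouts while waiting for the response arrive unwrapped.
        raise FetchTimeout("timed out fetching %s" % url) from exc
```

Other failures, such as HTTP error statuses or DNS errors, still raise their original exceptions.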

Python URLLib / URLLib2 POST

隐身守侯 submitted on 2019-11-26 05:58:13
Question: I'm trying to create a super-simplistic Virtual In / Out Board using wx/Python. I've got the following code in place for one of my requests to the server where I'll be storing the data:
    data = urllib.urlencode({'q': 'Status'})
    u = urllib2.urlopen('http://myserver/inout-tracker', data)
    for line in u.readlines():
        print line
Nothing special going on there. The problem I'm having is that, based on how I read the docs, this should perform a POST request because I've provided the data
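To check which HTTP method a request will use before sending it, one can build a Request object and inspect it. This sketch assumes Python 3, where urlencode lives in urllib.parse and the body must be bytes; build_post_request is a hypothetical helper name, and the URL is the one from the question.

```python
import urllib.parse
import urllib.request


def build_post_request(url, fields):
    """Build a request whose form-encoded body makes urllib send a POST."""
    # urlencode returns str; the request body has to be bytes in Python 3.
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(url, data=body)


request = build_post_request("http://myserver/inout-tracker", {"q": "Status"})
method = request.get_method()  # "POST", because a data payload is attached
```

Passing request to urllib.request.urlopen would then send it; without the data argument the same Request reports "GET".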

How to retrieve the values of dynamic html content using Python

浪尽此生 submitted on 2019-11-26 04:25:55
Question: I'm using Python 3 and I'm trying to retrieve data from a website. However, this data is dynamically loaded and the code I have right now doesn't work:
    url = eveCentralBaseURL + str(mineral)
    print("URL : %s" % url)
    response = request.urlopen(url)
    data = str(response.read(10000))
    data = data.replace("\\n", "\n")
    print(data)
Where I'm trying to find a particular value, I'm finding a template instead, e.g. "{{formatPrice median}}" instead of "4.48". How can I make it so that I
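Two small points about the snippet above can be illustrated offline. First, calling str() on bytes in Python 3 yields the repr (with a b'...' wrapper and escaped newlines), which is why the replace call was needed; decoding the bytes avoids that. Second, a leftover {{...}} placeholder suggests the value is filled in by client-side JavaScript, so the raw HTML will not contain it. The helper names below are hypothetical, and this sketch only detects the situation; it does not render the page.

```python
import re

# Matches template placeholders such as {{formatPrice median}}.
_PLACEHOLDER = re.compile(r"\{\{.*?\}\}")


def body_to_text(raw, encoding="utf-8"):
    """Decode a response body instead of calling str() on the bytes."""
    return raw.decode(encoding, errors="replace")


def has_unrendered_template(text):
    """Report whether the text still contains {{...}} placeholders."""
    return _PLACEHOLDER.search(text) is not None


raw = b"line one\nline two"
as_repr = str(raw)            # the repr, including b'...' and a literal \n
as_text = body_to_text(raw)   # real text with a real newline
```

If has_unrendered_template returns True for a fetched page, the usual options are to find the data endpoint the page's script calls or to use a tool that executes JavaScript; which one applies depends on the site.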

AttributeError: 'module' object has no attribute 'urlopen'

梦想的初衷 submitted on 2019-11-26 03:49:57
Question: I'm trying to use Python to download the HTML source code of a website but I'm receiving this error.
    Traceback (most recent call last):
      File "C:\Users\Sergio.Tapia\Documents\NetBeansProjects\DICParser\src\WebDownload.py", line 3, in <module>
        file = urllib.urlopen("http://www.python.org")
    AttributeError: 'module' object has no attribute 'urlopen'
I'm following the guide here: http://www.boddie.org.uk/python/HTML.html
    import urllib
    file = urllib.urlopen("http://www.python.org
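One likely cause, assuming the script is running under Python 3, is that urlopen moved from the top-level urllib module into urllib.request. A version-tolerant import can be sketched as follows; this does not cover other possible causes, such as a local file named urllib.py shadowing the standard module.

```python
try:
    # Python 3: urlopen lives in the urllib.request submodule.
    from urllib.request import urlopen
except ImportError:
    # Python 2: urlopen is an attribute of the top-level urllib module.
    from urllib import urlopen
```

After this import, urlopen("http://www.python.org") can be called the same way on either version.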

Django: add image in an ImageField from image url

南笙酒味 submitted on 2019-11-26 03:48:01
Question: Please excuse my poor English ;-) Imagine this very simple model:
    class Photo(models.Model):
        image = models.ImageField('Label', upload_to='path/')
I would like to create a Photo from an image URL (i.e., not by hand in the Django admin site). I think that I need to do something like this:
    from myapp.models import Photo
    import urllib
    img_url = 'http://www.site.com/image.jpg'
    img = urllib.urlopen(img_url)
    # Here I need to retrieve the image (as the same way that if I put it in an
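Part of this task needs no Django at all: choosing a file name for the downloaded image. The sketch below, assuming Python 3, derives it from the URL path; filename_from_url and its default value are hypothetical. The comments outline how the result could be handed to the ImageField through Django's FieldFile.save and ContentFile, which is an illustration rather than tested code.

```python
import posixpath
from urllib.parse import unquote, urlparse


def filename_from_url(url, default="download"):
    """Return the last path segment of the URL, or a default if there is none."""
    path = unquote(urlparse(url).path)
    return posixpath.basename(path) or default


name = filename_from_url("http://www.site.com/image.jpg")

# Outline only, inside a configured Django project:
#   from django.core.files.base import ContentFile
#   photo = Photo()
#   photo.image.save(name, ContentFile(image_bytes), save=True)
# where image_bytes is the body read from the image URL.
```

The query string is ignored, so a URL like .../image.jpg?size=large still yields image.jpg.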

Python: URLError: <urlopen error [Errno 10060]

对着背影说爱祢 submitted on 2019-11-26 03:44:03
Question: OS: Windows 7; Python 2.7.3, using the Python GUI Shell. I'm trying to read a website through Python, and several authors use the urllib and urllib2 libraries. To store the site in a variable, I've seen a similar approach proposed:
    import urllib
    import urllib2
    g = "http://www.google.com/"
    read = urllib2.urlopen(g)
The last line generates an error after 120+ seconds:
    Traceback (most recent call last):
      File "<pyshell#27>", line 1, in <module>
        r = urllib2.urlopen(o)
      File "C:\

should I call close() after urllib.urlopen()?

跟風遠走 submitted on 2019-11-26 03:38:20
Question: I'm new to Python and reading someone else's code: should urllib.urlopen() be followed by urllib.close()? Otherwise, one would leak connections, correct?
Answer 1: The close method must be called on the result of urllib.urlopen, not on the urllib module itself as you seem to be thinking (you mention urllib.close, which doesn't exist). The best approach: instead of x = urllib.urlopen(u) etc., use:
    import contextlib
    with contextlib.closing(urllib.urlopen(u)) as x:
        ...use x at will here...
The
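The contextlib.closing pattern from the answer can be sketched as a runnable function, assuming Python 3 (urllib.request.urlopen in place of urllib.urlopen). The name read_all is hypothetical. In Python 3 the response object also works directly in a with statement, so closing is optional there, but it behaves the same way.

```python
import contextlib
import urllib.request


def read_all(url):
    """Read the whole response and close it, even if reading raises."""
    with contextlib.closing(urllib.request.urlopen(url)) as response:
        return response.read()


# A data: URL is handled locally by urllib, so this line needs no network.
payload = read_all("data:text/plain,hello")
```

The close call happens when the with block exits, whether it ends normally or through an exception.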

urllib2.HTTPError: HTTP Error 403: Forbidden

拟墨画扇 submitted on 2019-11-26 03:15:54
Question: I am trying to automate the download of historic stock data using Python. The URL I am trying to open responds with a CSV file, but I am unable to open it using urllib2. I have tried changing the user agent as specified in a few earlier questions, and I even tried to accept response cookies, with no luck. Can you please help? Note: The same method works for Yahoo Finance. Code:
    import urllib2,cookielib
    site= "http://www.nseindia.com/live_market/dynaContent/live_watch/get_quote/getHistoricalData.jsp?symbol
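For reference, attaching custom headers to a request looks like the sketch below, assuming Python 3 (urllib.request in place of urllib2). The function name and the header values are hypothetical placeholders, and there is no guarantee that any particular set of headers will satisfy a server that answers 403; some sites check more than the user agent.

```python
import urllib.request

# Hypothetical header values; adjust to whatever the target site expects.
EXAMPLE_HEADERS = {
    "User-Agent": "Mozilla/5.0 (compatible; example-client)",
    "Accept": "text/html,application/xhtml+xml,*/*;q=0.8",
}


def build_request_with_headers(url, headers=None):
    """Build a request carrying explicit headers instead of urllib's defaults."""
    return urllib.request.Request(url, headers=dict(headers or EXAMPLE_HEADERS))


request = build_request_with_headers("http://www.example.org/data.csv")
# Request normalises header names, so the lookup key is "User-agent".
user_agent = request.get_header("User-agent")
```

Passing the request to urllib.request.urlopen sends it; an HTTPError raised there exposes the status through its code attribute, which helps confirm whether the response is still 403.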