Can't open Unicode URL with Python

慢半拍i 2020-12-09 20:02

Using Python 2.5.2 on Debian Linux, I'm trying to get the content from a Spanish URL that contains the Spanish character 'í':

import urllib
url = u         


        
5 answers
  •  半阙折子戏
    2020-12-09 20:51

    Encoding the URL as UTF-8 should have worked. I wonder whether your source file is properly encoded, and whether the interpreter knows which encoding it uses. If your Python source file is saved as UTF-8, for example, then you should have

    # coding=UTF-8
    

    as the first or second line.

    import urllib
    url = u'http://mydomain.es/índice.html'
    content = urllib.urlopen(url.encode('utf-8')).read()
    

    works for me.
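A related approach, not part of the answer above, is to percent-encode the non-ASCII path characters as UTF-8 before opening the URL, so that only ASCII goes on the wire. The following is a minimal sketch in Python 3 (the question targets Python 2.5, where the comparable helper is `urllib.quote`). The function name `to_ascii_url` is hypothetical, and the sketch assumes the host name is already ASCII.

```python
# Python 3 sketch: percent-encode the path and query of a URL as UTF-8.
# to_ascii_url is a hypothetical helper name, not part of any library.
from urllib.parse import urlsplit, urlunsplit, quote


def to_ascii_url(url):
    """Return url with non-ASCII path/query characters percent-encoded."""
    parts = urlsplit(url)
    # Keep "/" and "%" untouched so existing escapes are not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    # The host (netloc) is passed through unchanged; assumed to be ASCII.
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


encoded = to_ascii_url("http://mydomain.es/\u00edndice.html")
print(encoded)  # http://mydomain.es/%C3%ADndice.html
```

The encoded string can then be passed to `urllib.request.urlopen` in Python 3. Whether a given server expects UTF-8 percent-encoding is an assumption; some older sites expect Latin-1 instead.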

    Edit: also, be aware that Unicode text typed into an interactive Python session (whether in IDLE or a console) is fraught with encoding-related difficulty. In those cases, you should write the character as a Unicode escape (like \u00ED in your case) instead of typing it directly.
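To illustrate the escape form, here is a small sketch written for Python 3, where string literals are Unicode by default; in Python 2 both literals would need the `u` prefix. It only checks that the escaped and directly typed spellings denote the same text, assuming the source file is saved as UTF-8.

```python
# Python 3 sketch; assumes this file is saved as UTF-8.
escaped = "http://mydomain.es/\u00edndice.html"  # í written as an escape
literal = "http://mydomain.es/índice.html"       # í typed directly

# Both spellings produce the same string when the source encoding is right.
same_text = escaped == literal

# Encoded as UTF-8, í becomes the two bytes C3 AD.
utf8_bytes = "\u00ed".encode("utf-8")
print(same_text, utf8_bytes)
```

The escape form does not depend on how the terminal or editor encodes typed characters, which is why it is the safer choice in an interactive session.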
