Question
I've been struggling with the following question for the past half day, and although I've found some info about similar problems, nothing really hits the spot.
I'm trying to send a PUT request using urllib2 with data that contains some Unicode characters:
body = u'{ "bbb" : "asdf\xd7\xa9\xd7\x93\xd7\x92"}'
conn = urllib2.Request(request_url, body, headers)
conn.get_method = lambda: 'PUT'
response = urllib2.urlopen(conn)
I've tried body = body.encode('utf-8') and other variations, but whatever I do I get the following error:
UnicodeEncodeError at ...
'ascii' codec can't decode byte 0xc3 in position 15: ordinal not in range(128)
With one of the following call stacks:
File "..." in ...
195. response = urllib2.urlopen(conn)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in urlopen
126. return _opener.open(url, data, timeout)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in open
394. response = self._open(req, data)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in _open
412. '_open', req)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in _call_chain
372. result = func(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in http_open
1199. return self.do_open(httplib.HTTPConnection, req)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in do_open
1168. h.request(req.get_method(), req.get_selector(), req.data, headers)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in request
955. self._send_request(method, url, body, headers)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in _send_request
989. self.endheaders(body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in endheaders
951. self._send_output(message_body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in _send_output
815. self.send(message_body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in send
787. self.sock.sendall(data)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py" in meth
224. return getattr(self._sock,name)(*args)
Or the following call stack (when I do body = body.encode('utf-8')):
File "..." in ...
195. response = urllib2.urlopen(conn)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in urlopen
126. return _opener.open(url, data, timeout)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in open
394. response = self._open(req, data)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in _open
412. '_open', req)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in _call_chain
372. result = func(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in http_open
1199. return self.do_open(httplib.HTTPConnection, req)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in do_open
1168. h.request(req.get_method(), req.get_selector(), req.data, headers)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in request
955. self._send_request(method, url, body, headers)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in _send_request
989. self.endheaders(body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in endheaders
951. self._send_output(message_body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in _send_output
809. msg += message_body
What am I doing wrong? How can I send a body with Unicode characters via urllib2? If there are no Unicode characters, everything works fine.
Also note that my Content-Type header is set to application/json;charset=utf-8.
If it's relevant in any way, the context is this: a request arrives at my Django server, and I delegate it to another Django server. I don't redirect; I send the request from my own server, get the response, and send it back. So body is the request.body in the Django view.
Edit:
My headers are:
{
'Origin': 'http://10.0.0.146:8000',
'Accept-Language': 'en-US,en;q=0.8',
'Accept-Encoding': 'gzip,deflate,sdch',
'Host': 'localhost:5000',
'Accept': 'application/json, text/plain, */*',
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.65 Safari/537.31',
'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
'Connection': 'keep-alive',
'X-Requested-With': 'XMLHttpRequest',
'Pragma': 'no-cache',
'Cache-Control': 'no-cache',
'Referer': 'http://localhost:5000/',
'Content-Type': 'application/json;charset=UTF-8',
'Authorization': 'ApiKey ogkLPgSESNyTOgIdbSLDhJjvyVJcbg:0d5897b5204c2f2527f532c6a97ba18a7f06acdc',
'Cookie': 'username=ogkLPgSESNyTOgIdbSLDhJjvyVJcbg; _we_wk_ls_=%7B%22time%22%3A1369123506709%7D; __jwpusr=39e63770-ec5c-4b96-9f7f-b199703d0d36; sessionid=0d741a7560258b301979a1c853b42a81; api_key=0d5897b5204c2f2527f532c6a97ba18a7f06acdc'
}
Answer 1:
You need to pass only byte strings to Request. This applies to the headers, the URL, and the body.
If any of those three inputs contains Unicode values, Python 2 will automatically convert between unicode and byte strings when concatenating them, which will invariably lead to grief.
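As a minimal sketch of that advice, the snippet below encodes the URL, the body, and every header name and value to byte strings before they are handed to urllib2. The helper names to_bytes and encode_request_parts are hypothetical (they are not part of urllib2), the URL and the single header are placeholders, and the URL is assumed to contain only ASCII characters. The explicit decode('ascii') call only imitates the implicit conversion Python 2 performs when a unicode value is concatenated with a byte string.

```python
# -*- coding: utf-8 -*-
# Sketch only: to_bytes and encode_request_parts are hypothetical helper
# names introduced for illustration, not urllib2 functions.


def to_bytes(value, encoding='utf-8'):
    """Return value as a byte string, encoding text if necessary."""
    if isinstance(value, bytes):  # on Python 2, bytes is an alias of str
        return value
    return value.encode(encoding)


def encode_request_parts(url, body, headers, encoding='utf-8'):
    """Encode the URL, the body, and every header name and value."""
    byte_headers = dict(
        (to_bytes(name, encoding), to_bytes(value, encoding))
        for name, value in headers.items()
    )
    return to_bytes(url, encoding), to_bytes(body, encoding), byte_headers


# The body from the question, with a placeholder URL and a trimmed header set.
body = u'{ "bbb" : "asdf\xd7\xa9\xd7\x93\xd7\x92"}'
url, data, headers = encode_request_parts(
    u'http://localhost:5000/',
    body,
    {u'Content-Type': u'application/json;charset=UTF-8'},
)

# Imitate what Python 2 does implicitly when a unicode value is joined with
# a byte string: the byte string is decoded as ASCII, which fails on the
# first non-ASCII byte.
try:
    data.decode('ascii')
except UnicodeDecodeError as exc:
    failure_position = exc.start  # index of the first non-ASCII byte

# On Python 2 the encoded parts can then be passed on unchanged:
#   conn = urllib2.Request(url, data, headers)
#   conn.get_method = lambda: 'PUT'
#   response = urllib2.urlopen(conn)
```

Here failure_position comes out as 15, the same position reported in the error message: the UTF-8 encoded body has the byte 0xc3 at that index. That is consistent with the second call stack, where an already encoded body is appended to a message that has become unicode because the URL or a header was unicode.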
Source: https://stackoverflow.com/questions/16670140/how-to-send-utf-8-content-in-a-urllib2-request