Question
Update:
I found the answer here: Python UnicodeDecodeError - Am I misunderstanding encode?
I needed to explicitly decode my incoming file into Unicode when I read it, because it contained bytes that were not valid ASCII. The encode step was failing when it hit those bytes.
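A minimal sketch of that fix, assuming the file happens to be UTF-8 encoded (the sample bytes here are made up; under Python 2 the same `.decode()` call works on `str` byte strings read from a file):

```python
import json

# Raw bytes as read from a file opened in binary mode; this sample is
# hypothetical: UTF-8 encoded "café".
raw = b"caf\xc3\xa9"

# Explicitly decode to Unicode before serializing, so json.dumps never
# has to fall back on an implicit ASCII decode.
text = raw.decode("utf-8")
print(json.dumps([text], ensure_ascii=False))  # ["café"]
```
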
Original Question
So, I know there's something I'm just not getting here.
I have an array of Unicode strings, some of which contain non-ASCII characters.
I want to encode that as JSON with
json.dumps(myList)
It throws an error
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb4 in position 13: ordinal not in range(128)
How am I supposed to do this? I've tried setting the ensure_ascii parameter to both True and False, but neither fixes this problem.
I know I'm passing Unicode strings to json.dumps. I understand that a JSON string is meant to be Unicode. Why isn't it just sorting this out for me?
What am I doing wrong?
Update: Don Question sensibly suggests I provide a stack trace. Here it is:
Traceback (most recent call last):
File "importFiles.py", line 69, in <module>
x = u"%s" % conv
File "importFiles.py", line 62, in __str__
return self.page.__str__()
File "importFiles.py", line 37, in __str__
return json.dumps(self.page(),ensure_ascii=False)
File "/usr/lib/python2.7/json/__init__.py", line 238, in dumps
**kw).encode(obj)
File "/usr/lib/python2.7/json/encoder.py", line 204, in encode
return ''.join(chunks)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb4 in position 17: ordinal not in range(128)
Note it's Python 2.7, and the error still occurs with ensure_ascii=False.
Update 2: Andrew Walker's useful link (in the comments) leads me to think I can coerce my data into a convenient byte format before trying to JSON-encode it, by doing something like:
data.encode("ascii","ignore")
Unfortunately that is throwing the same error.
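That is expected: on Python 2, calling .encode() on a byte string first implicitly decodes it with the ascii codec, which raises the very same UnicodeDecodeError. The fix is to decode with the source's actual encoding rather than encode; a sketch (the sample byte string and the latin-1 encoding are assumptions for illustration):

```python
import json

# A byte string containing 0xb4 (ACUTE ACCENT in Latin-1), the byte
# named in the traceback; the surrounding text is made up.
data = b"some text \xb4"

# On Python 2, data.encode("ascii", "ignore") implicitly performs
# data.decode("ascii") first, which fails on 0xb4. Decoding with the
# real source encoding avoids the implicit ASCII step entirely.
text = data.decode("latin-1")
print(json.dumps(text, ensure_ascii=False))
```
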
Answer 1:
Try adding the argument ensure_ascii=False. Also, especially when asking Unicode-related questions, it's very helpful to include a complete traceback and to state which Python version you are using.
Citing the Python documentation for version 2.6.7:
"If ensure_ascii is False (default: True), then some chunks written to fp may be unicode instances, subject to normal Python str to unicode coercion rules. Unless fp.write() explicitly understands unicode (as in codecs.getwriter()) this is likely to cause an error."
So this proposal may cause new problems, but it fixed a similar problem I had: I fed the resulting Unicode string into a StringIO object and wrote that to a file.
Because Python 2.7's sys.getdefaultencoding() is ascii, the implicit conversion in the ''.join(chunks) statement of the json standard library will blow up if chunks contains byte strings that are not ASCII-encoded. You must ensure that any contained strings are converted to an ASCII-compatible representation beforehand. You may try UTF-8 encoded byte strings, but mixing them with Unicode strings won't work, if I'm not mistaken.
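The codecs.getwriter() route mentioned in the quoted documentation can be sketched like this (an in-memory buffer stands in for a real file; the sample data is made up):

```python
import codecs
import io
import json

data = [u"caf\u00e9"]  # ["café"]

# With ensure_ascii=False, json.dump may emit Unicode chunks; wrapping
# the byte stream in a codecs writer makes write() accept Unicode and
# encode each chunk to UTF-8 on the way out.
buf = io.BytesIO()
writer = codecs.getwriter("utf-8")(buf)
json.dump(data, writer, ensure_ascii=False)
print(buf.getvalue())  # b'["caf\xc3\xa9"]'
```
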
Source: https://stackoverflow.com/questions/9693699/python-json-and-unicode