Question
This question already has an answer here:
- UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128) 27 answers
I have this code:
printinfo = title + "\t" + old_vendor_id + "\t" + apple_id + '\n'
# Write file
f.write (printinfo + '\n')
But I get this error when running it:
f.write(printinfo + '\n')
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 7: ordinal not in range(128)
It's having trouble writing out this:
Identité secrète (Abduction) [VF]
Any ideas please, not sure how to fix.
Cheers.
UPDATE: This is the bulk of my code, so you can see what I am doing:
def runLookupEdit(self, event):
    newpath1 = pathindir + "/"
    errorFileOut = newpath1 + "REPORT.csv"
    f = open(errorFileOut, 'w')
    global old_vendor_id
    for old_vendor_id in vendorIdsIn.splitlines():
        writeErrorFile = 0
        from lxml import etree
        parser = etree.XMLParser(remove_blank_text=True) # makes pretty print work
        path1 = os.path.join(pathindir, old_vendor_id)
        path2 = path1 + ".itmsp"
        path3 = os.path.join(path2, 'metadata.xml')
        # Open and parse the xml file
        cantFindError = 0
        try:
            with open(path3): pass
        except IOError:
            cantFindError = 1
            errorMessage = old_vendor_id
            self.Error(errorMessage)
            break
        tree = etree.parse(path3, parser)
        root = tree.getroot()
        for element in tree.xpath('//video/title'):
            title = element.text
            while '\n' in title:
                title = title.replace('\n', ' ')
            while '\t' in title:
                title = title.replace('\t', ' ')
            # Collapse runs of spaces into a single space
            while '  ' in title:
                title = title.replace('  ', ' ')
            title = title.strip()
            element.text = title
            print title
        #########################################
        ######## REMOVE UNWANTED TAGS ########
        #########################################
        # Remove the comment tags
        comments = tree.xpath('//comment()')
        q = 1
        for c in comments:
            p = c.getparent()
            if q == 3:
                apple_id = c.text
            p.remove(c)
            q = q+1
        apple_id = apple_id.split(':',1)[1]
        apple_id = apple_id.strip()
        printinfo = title + "\t" + old_vendor_id + "\t" + apple_id
        # Write file
        # f.write (printinfo + '\n')
        f.write(printinfo.encode('utf8') + '\n')
    f.close()
Answer 1:
You need to encode Unicode explicitly before writing to a file; otherwise Python 2 encodes it for you with the default ASCII codec, which fails on any non-ASCII character such as the é in your title.
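The implicit step can be reproduced in isolation. The sketch below uses the title from the question as sample data: it encodes the text with the ASCII codec by hand (roughly what Python 2 attempts behind the scenes on write) and then with UTF-8. The variable names are illustrative only.

```python
# Title from the question, with the accented letters written as escapes
title = u"Identit\xe9 secr\xe8te (Abduction) [VF]"

# Encoding with ASCII fails on the first non-ASCII character (index 7)
try:
    title.encode('ascii')
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

# Encoding with UTF-8 succeeds; each accented letter becomes two bytes
utf8_bytes = title.encode('utf8')
```

The `start` attribute of the caught exception matches the "position 7" reported in the traceback.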
Pick an encoding and stick with it:
f.write(printinfo.encode('utf8') + '\n')
or use io.open() to create a file object that'll encode for you as you write to the file:
import io
f = io.open(filename, 'w', encoding='utf8')
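As a rough sketch of the `io.open()` route, the snippet below writes one tab-separated line per record and reads the file back. The helper `write_report`, the sample row, and the temporary path are assumptions made for illustration, not part of the original code. Note that a file opened this way expects Unicode text, so no `.encode()` call is needed on write.

```python
import io
import os
import tempfile


def write_report(path, rows):
    """Write each row as a tab-separated line, encoded as UTF-8."""
    with io.open(path, 'w', encoding='utf8') as f:
        for row in rows:
            # Every field must be Unicode text; the file object encodes it
            f.write(u"\t".join(row) + u"\n")


# Hypothetical sample record: title, vendor id, Apple id
rows = [(u"Identit\xe9 secr\xe8te (Abduction) [VF]", u"vendor_001", u"123456789")]

report_path = os.path.join(tempfile.mkdtemp(), "REPORT.csv")
write_report(report_path, rows)

# Read the file back with the same encoding to check the round trip
with io.open(report_path, 'r', encoding='utf8') as f:
    content = f.read()
```

Using a `with` block also closes the file if an exception is raised partway through the loop.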
Before continuing, you may want to read:
- The Python Unicode HOWTO
- Pragmatic Unicode by Ned Batchelder
- The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky
Source: https://stackoverflow.com/questions/19833440/unicodeencodeerror-ascii-codec-cant-encode-character-u-xe9-in-position-7