I copied this script from the [Python web site][1]. This is a new question; the problem is now with encoding:
import sqlite3
import csv
import codecs
import cS
Then I converted all integers to strings,
You converted both integers and strings to byte strings. For strings, this uses the default character encoding, which happens to be ASCII, and it fails when you have non-ASCII characters. You want `unicode` instead of `str`.
self.writer.writerow([unicode(s).encode("utf-8") for s in row])
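As a small illustration of why the ASCII default fails while UTF-8 works, here is a sketch using a made-up sample value (`text` is not from the original script). On Python 2, `str(text)` performs the same ASCII encoding implicitly, which is where the error comes from.

```python
text = u"caf\u00e9"  # sample value containing one non-ASCII character

try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    # ASCII has no byte for U+00E9, so the conversion is rejected.
    ascii_ok = False

# UTF-8 can represent every code point; U+00E9 becomes two bytes.
utf8_bytes = text.encode("utf-8")
```

Here `ascii_ok` ends up `False`, while the UTF-8 encoding succeeds.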
It might be better to convert everything to unicode before calling that method. The class is designed specifically for parsing Unicode strings. It was not designed to support other data types.
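One way to do that conversion up front is sketched below. The helper name `to_text` and the alias `text_type` are hypothetical, not part of the script from the docs, and the sketch assumes any byte strings in the row are UTF-8.

```python
text_type = type(u"")  # `unicode` on Python 2, `str` on Python 3


def to_text(value, encoding="utf-8"):
    """Return `value` as a text (Unicode) string."""
    if isinstance(value, bytes):
        # Decode byte strings explicitly instead of relying on the
        # implicit ASCII default.
        return value.decode(encoding)
    # Integers, floats, and text strings all convert cleanly.
    return text_type(value)


row = [1, u"caf\u00e9", b"plain"]
text_row = [to_text(value) for value in row]

# This is the shape of data the writer's encode step expects.
encoded_row = [cell.encode("utf-8") for cell in text_row]
```

With the row converted this way, the writer only ever sees text strings, so its own `.encode("utf-8")` call cannot hit the ASCII default.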
If you are using Python 2, do the encoding as `str(s.encode("utf-8"))`, i.e.:
def writerow(self, row):
    self.writer.writerow([str(s.encode("utf-8")) for s in row])
    # Fetch UTF-8 output from the queue ...
    data = self.queue.getvalue()
    data = data.decode("utf-8")
    # ... and reencode it into the target encoding
    data = self.encoder.encode(data)
    # write to the target stream
    self.stream.write(data)
    # empty queue
    self.queue.truncate(0)
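The fetch, decode, re-encode, and empty cycle in that method can be tried in isolation. This is only a sketch under stated assumptions: `io.BytesIO` stands in for the in-memory queue, `stream` is a hypothetical in-memory target, Latin-1 is an arbitrary target encoding, and `flush_queue` is a made-up name.

```python
import codecs
import io

queue = io.BytesIO()   # stand-in for the writer's in-memory queue
stream = io.BytesIO()  # hypothetical target stream
encoder = codecs.getincrementalencoder("latin-1")()


def flush_queue():
    # Fetch the UTF-8 output from the queue and decode it to text ...
    data = queue.getvalue().decode("utf-8")
    # ... re-encode it into the target encoding and write it out.
    stream.write(encoder.encode(data))
    # Empty the queue. io.BytesIO.truncate() does not move the
    # position, so rewind first.
    queue.seek(0)
    queue.truncate()


# Simulate one CSV line that the csv writer placed in the queue.
queue.write(u"caf\u00e9,1\r\n".encode("utf-8"))
flush_queue()
```

After the call, `stream` holds the line in the target encoding and `queue` is empty again, ready for the next row.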
From the `cStringIO` documentation:
Unlike the StringIO module, this module is not able to accept Unicode strings that cannot be encoded as plain ASCII strings.
I.e. only 7-bit clean strings can be stored.