A simple program for reading a CSV file inside a zip file works in Python 2.7, but not in Python 3.2
$ cat test_zip_file_py3k.py
import csv, sys, zipfile
z
Lennart's answer is on the right track (Thanks, Lennart, I voted up your answer) and it almost works:
$ cat test_zip_file_py3k.py
import csv, io, sys, zipfile
zip_file = zipfile.ZipFile(sys.argv[1])
items_file = zip_file.open('items.csv', 'rU')
items_file = io.TextIOWrapper(items_file, encoding='iso-8859-1', newline='')
for idx, row in enumerate(csv.DictReader(items_file)):
print('Processing row {0}'.format(idx))
$ python3.1 test_zip_file_py3k.py ~/data.zip
Traceback (most recent call last):
File "test_zip_file_py3k.py", line 7, in
items_file = io.TextIOWrapper(items_file,
encoding='iso-8859-1',
newline='')
AttributeError: readable
The problem appears to be that io.TextWrapper's first required parameter is a buffer; not a file object.
This appears to work:
items_file = io.TextIOWrapper(io.BytesIO(items_file.read()))
This seems a little complex and also it seems annoying to have to read in a whole (perhaps huge) zip file into memory. Any better way?
Here it is in action:
$ cat test_zip_file_py3k.py
import csv, io, sys, zipfile
zip_file = zipfile.ZipFile(sys.argv[1])
items_file = zip_file.open('items.csv', 'rU')
items_file = io.TextIOWrapper(io.BytesIO(items_file.read()))
for idx, row in enumerate(csv.DictReader(items_file)):
print('Processing row {0}'.format(idx))
$ python3.1 test_zip_file_py3k.py ~/data.zip
Processing row 0
Processing row 1
Processing row 2
...
Processing row 250