I have a Python script that generates a bunch of data in a while loop. I need to write this data to a CSV file by column rather than by row.
For example in
Let's assume that (1) you don't have a lot of memory, (2) you have the row headings in a list, and (3) all the data values are floats; if they're all integers that fit in 32 or 64 bits, that's even better.
On a 32-bit Python, storing a float in a list takes 16 bytes for the float object and 4 bytes for a pointer in the list; total 20. Storing a float in an array.array('d') takes only 8 bytes. Increasingly spectacular savings are available if all your data are ints (any negatives?) that will fit in 8, 4, 2 or 1 byte(s) -- especially on a recent Python where all ints are longs.
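To see the difference on your own interpreter, here is a rough measurement sketch. It assumes CPython (where sys.getsizeof reports per-object sizes); the element count and variable names are arbitrary, and the exact figures depend on the build, so treat them as indicative only.

```python
import array
import sys

n = 100_000  # arbitrary element count, for illustration only
values = [float(i) for i in range(n)]
packed = array.array('d', values)

# List: the list's own pointer storage plus one float object per element.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# Array: a single contiguous buffer of raw 8-byte doubles.
array_bytes = sys.getsizeof(packed)

print("bytes per element in array('d'):", packed.itemsize)
print("list total bytes: ", list_bytes)
print("array total bytes:", array_bytes)
```

For integer data, the same check with other type codes (for example 'b' for signed 1-byte values) shows the itemsize you would actually get on your platform.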
The following pseudocode assumes floats stored in array.array('d'). In case you don't really have a memory problem, you can still use this method; I've put in comments to indicate the changes needed if you want to use a list.
# Preliminary:
import array # list: delete
hlist = []
dlist = []
for each row:
    hlist.append(some_heading_string)
    dlist.append(array.array('d')) # list: dlist.append([])
# generate data
col_index = -1
for each column:
    col_index += 1
    for row_index in range(len(hlist)): # use xrange on Python 2
        v = calculated_data_value(row_index, col_index)
        dlist[row_index].append(v)
# write to csv file (csv_writer is a csv.writer wrapping the output file)
for row_index in range(len(hlist)):
    row = [hlist[row_index]]
    row.extend(dlist[row_index])
    csv_writer.writerow(row)
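For reference, here is one way the pseudocode might look as runnable Python 3. This is only a sketch under assumptions: the function name write_column_data_as_rows, the value_at callback and the demo_value stand-in are all hypothetical, and the number of columns is assumed to be known up front rather than discovered inside a while loop.

```python
import array
import csv
import io


def write_column_data_as_rows(out_file, headings, n_columns, value_at):
    """Collect values column by column, then write one CSV row per heading."""
    # One compact array of doubles per row (use [] here for a plain list).
    data = [array.array('d') for _ in headings]

    # Generate the data one column at a time.
    for col_index in range(n_columns):
        for row_index in range(len(headings)):
            data[row_index].append(value_at(row_index, col_index))

    # Write each heading followed by that row's values.
    writer = csv.writer(out_file)
    for heading, values in zip(headings, data):
        writer.writerow([heading, *values])


# Hypothetical stand-in for the real calculation.
def demo_value(row_index, col_index):
    return row_index * 10.0 + col_index


# An in-memory buffer keeps the example self-contained.
buffer = io.StringIO(newline='')
write_column_data_as_rows(buffer, ['alpha', 'beta'], 3, demo_value)
print(buffer.getvalue())
```

To write a real file instead, pass a file opened with open(path, 'w', newline='') as out_file; the csv module documentation recommends newline='' so that the writer controls line endings itself.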