csv reader behavior with None and empty string

夕颜 2020-12-01 13:59

I'd like to distinguish between None and empty strings when going back and forth between Python data structures and their csv representation using Python's csv module.
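
To make the problem concrete, here is a minimal sketch (Python 3, with an in-memory `io.StringIO` buffer standing in for a real file, which is an assumption made for illustration). `csv.writer` writes None as an empty field, so after a round trip it can no longer be told apart from an empty string:

```python
import csv
import io

# Write one row containing None and one containing an empty string.
buf = io.StringIO()
csv.writer(buf).writerows([["NULL/None value", None],
                           ["empty string", ""]])

# Read the same data back: both values come back as ''.
buf.seek(0)
rows = list(csv.reader(buf))
print(rows)  # [['NULL/None value', ''], ['empty string', '']]
```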

7 answers
  •  -上瘾入骨i
    2020-12-01 14:41

    As others have pointed out, you can't really do this via csv.Dialect or parameters to csv.writer and/or csv.reader. However, as I said in one comment, you can implement it by effectively subclassing the latter two (you apparently can't really subclass them because they're built-in). On writing, the "subclasses" simply intercept None values and change them into a unique string; on reading, they reverse the process. Here's a fully worked-out example:

    import csv
    import io

    NULL = '<NULL>'  # something unlikely to ever appear as a regular value in your csv files

    class MyCsvWriter:
        """Wraps csv.writer and replaces None with the NULL marker on output."""
        def __init__(self, *args, **kwrds):
            self.csv_writer = csv.writer(*args, **kwrds)

        def __getattr__(self, name):
            # Delegate every other attribute to the wrapped writer.
            return getattr(self.csv_writer, name)

        def writerow(self, row):
            self.csv_writer.writerow([item if item is not None else NULL
                                          for item in row])

        def writerows(self, rows):
            for row in rows:
                self.writerow(row)

    class MyCsvReader:
        """Wraps csv.reader and turns the NULL marker back into None."""
        def __init__(self, *args, **kwrds):
            self.csv_reader = csv.reader(*args, **kwrds)

        def __getattr__(self, name):
            # Delegate every other attribute to the wrapped reader.
            return getattr(self.csv_reader, name)

        def __iter__(self):
            for row in self.csv_reader:
                yield [item if item != NULL else None for item in row]

    data = [['NULL/None value', None],
            ['empty string', '']]

    f = io.StringIO()
    MyCsvWriter(f).writerows(data)  # instead of csv.writer(f).writerows(data)

    f = io.StringIO(f.getvalue())
    data2 = [e for e in MyCsvReader(f)]  # instead of [e for e in csv.reader(f)]

    print("input : ", data)
    print("output : ", data2)


    Output:

    input :  [['NULL/None value', None], ['empty string', '']]
    output :  [['NULL/None value', None], ['empty string', '']]
    

    It's a tad verbose and probably slows the reading and writing of csv files a bit (the built-in reader and writer are implemented in C, while the wrappers do extra work in Python for every row), but that may make little difference since the process is likely bound by low-level I/O anyway.
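
If the verbosity is a concern, the same sentinel idea can probably be expressed with two small helper functions instead of wrapper classes. This is only a sketch under assumptions: the names `write_rows` and `read_rows` and the `'<NULL>'` marker are hypothetical, and the marker must never occur as genuine data in your files.

```python
import csv
import io

NULL = "<NULL>"  # hypothetical sentinel; must not appear as a real value


def write_rows(fileobj, rows, **fmtparams):
    """Write rows with csv.writer, replacing None with the NULL sentinel."""
    writer = csv.writer(fileobj, **fmtparams)
    writer.writerows([NULL if item is None else item for item in row]
                     for row in rows)


def read_rows(fileobj, **fmtparams):
    """Yield rows from csv.reader, turning the NULL sentinel back into None."""
    for row in csv.reader(fileobj, **fmtparams):
        yield [None if item == NULL else item for item in row]


data = [["NULL/None value", None], ["empty string", ""]]

buf = io.StringIO()
write_rows(buf, data)

buf.seek(0)
round_tripped = list(read_rows(buf))
print(round_tripped == data)  # True
```

The per-item substitution still runs in Python, so this version saves lines of code rather than time; it also gives up the attribute delegation that the wrapper classes provide.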
