Reading a csv file in Python with different line terminator

ε祈祈猫儿з 提交于 2021-02-07 03:45:26

问题


I have a file in CSV format where the delimiter is the ASCII unit separator ^_ and the line terminator is the ASCII record separator ^^ (obviously, since these are nonprinting characters, I've just used one of the standard ways of writing them here). I've written plenty of code that reads and writes CSV files, so my issue isn't with Python's csv module per se. The problem is that the csv module doesn't support reading (but it does support writing) line terminators other than a carriage return or line feed, at least as of Python 2.6 where I just tested it. The documentation says that this is because it's hard coded, which I take to mean it's done in the C code that underlies the module, since I didn't see anything in the csv.py file that I could change.

Does anyone know a way around this limitation (patch, another CSV module, etc.)? I really need to read in a file where I can't use carriage returns or new lines as the line terminator because those characters will appear in some of the fields, and I'd like to avoid writing my own custom reader code if possible, even though that would be rather simple to meet my needs.


回答1:


Why not supply a custom iterable to the csv.reader function? Here is a naive implementation which reads the entire contents of the CSV file into memory at once (which may or may not be desirable, depending on the size of the file):

def records(path):
    with open(path) as f:
        contents = f.read()
        return (record for record in contents.split('^^'))

csv.reader(records('input.csv'))

I think that should work.



来源:https://stackoverflow.com/questions/1770934/reading-a-csv-file-in-python-with-different-line-terminator

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!