Change delimiter on “for each” loops on Strings in python

小鲜肉 2020-12-03 19:39

I need to read an input text file in Python by streaming it line by line, that is, loading it one line at a time instead of reading the whole file into memory at once. But my line delimiters

2 answers
  •  心在旅途
    2020-12-03 20:09

    Python doesn't have a native construct for this. You can write a generator that reads the characters one at a time and accumulates them until you have a whole delimited item.

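    def items(infile, delim):
        """Yield the pieces of infile separated by the single character delim."""
        item = []
        c = infile.read(1)
        while c:                      # read(1) returns "" at end of file
            if c == delim:
                yield "".join(item)   # delimiter reached: emit the finished item
                item = []
            else:
                item.append(c)        # store the current character before reading on
            c = infile.read(1)        # advance on every iteration, delimiter or not
        yield "".join(item)           # whatever follows the last delimiter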
    
    with open("log.txt") as infile:
        for item in items(infile, ","):   # comma delimited
            do_something_with(item)
    

    You will get better performance if you read the file in chunks (say, 64K or so) and split these. However, the logic for this is more complicated since an item may be split across chunks, so I won't go into it here as I'm not 100% sure I'd get it right. :-)
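For what it's worth, the chunked idea could be sketched roughly as below. This is only one possible approach, not something from the answer above: the names `items_chunked` and `chunk_size` are made up for illustration, and the sketch assumes the file is opened in text mode. The trick is to carry the text after the last delimiter over into the next chunk, so an item (or a multi-character delimiter) that straddles a chunk boundary is still reassembled.

```python
import io


def items_chunked(infile, delim, chunk_size=64 * 1024):
    """Yield delimiter-separated items, reading chunk_size characters at a time.

    Hypothetical helper: the name and the default chunk size are assumptions.
    """
    if not delim:
        raise ValueError("delim must be a non-empty string")
    pending = ""                          # text after the last delimiter seen so far
    while True:
        chunk = infile.read(chunk_size)
        if not chunk:                     # read() returns "" at end of file
            break
        pending += chunk
        # Everything except the last piece is complete; the last piece may be
        # the start of an item that continues in the next chunk.
        *complete, pending = pending.split(delim)
        yield from complete
    yield pending                         # whatever follows the last delimiter


if __name__ == "__main__":
    # A tiny chunk size forces items to straddle chunk boundaries.
    for item in items_chunked(io.StringIO("alpha,beta,gamma"), ",", chunk_size=4):
        print(item)
```

Like the character-by-character version, this yields one final item even when it is empty (for example when the file ends with the delimiter). One caveat: a very long item with no delimiter in it is re-scanned on every chunk, so pathological inputs can be slow.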
