How to read tokens without reading whole line or file

别那么骄傲 2020-12-02 00:51

Is there a well-hidden way to read tokens from a file or file-like object without reading entire lines? The application I immediately have (someone else's problem

4 Answers
  •  一整个雨季
    2020-12-02 01:47

    Here is a generator that processes a file one character at a time and yields tokens when whitespace is encountered.

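    def generate_tokens(path):
        """Yield whitespace-separated tokens from the file at path."""
        with open(path, 'r') as fp:
            buf = []  # characters of the token currently being built
            while True:
                ch = fp.read(1)  # read one character at a time
                if ch == '':
                    # End of file: emit the last token if the file
                    # does not end with whitespace.
                    if buf:
                        yield ''.join(buf)
                    break
                elif ch.isspace():
                    # Whitespace ends the current token, if any.
                    if buf:
                        yield ''.join(buf)
                        buf = []
                else:
                    buf.append(ch)
    
    if __name__ == '__main__':
        for token in generate_tokens('input.txt'):
            print(token)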
    

    To be more generic, you might be able to use the re module, as described in the question linked below. Feed it input from a generator over your file so that the whole file is never read at once.

    Python equivalent of ruby's StringScanner?
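
    One way the re idea could look is sketched below. This is not the approach from the linked question, only a minimal illustration under some assumptions: tokens are runs of non-whitespace characters, and the names iter_tokens, TOKEN_RE and chunk_size are made up for this example. The file is read in fixed-size chunks, and a token that is cut off at a chunk boundary is carried over to the next chunk.

```python
import io
import re

# Assumption: a token is any maximal run of non-whitespace characters.
TOKEN_RE = re.compile(r'\S+')


def iter_tokens(fp, chunk_size=4096):
    """Yield tokens from a text file-like object, chunk_size characters at a time."""
    tail = ''  # possibly incomplete token left over from the previous chunk
    while True:
        chunk = fp.read(chunk_size)
        if not chunk:
            break  # end of input
        buf = tail + chunk
        tokens = TOKEN_RE.findall(buf)
        if tokens and not buf[-1].isspace():
            # The buffer ends mid-token, so hold the last token back
            # until more input (or end of input) arrives.
            tail = tokens.pop()
        else:
            tail = ''
        for token in tokens:
            yield token
    if tail:
        yield tail  # the input did not end with whitespace


if __name__ == '__main__':
    # A tiny chunk size is used here to exercise the boundary handling.
    sample = io.StringIO("alpha beta\n gamma\tdelta")
    for token in iter_tokens(sample, chunk_size=3):
        print(token)
```

    Memory use is bounded by the chunk size plus the longest token, so a file with no line breaks at all is handled the same way as any other. Reading in chunks also avoids the per-character call overhead of fp.read(1).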
