open() and codecs.open() in Python 2.7 behave strangely differently

南方客 2020-12-19 06:23

I have a text file whose first line contains Unicode characters and whose other lines are all ASCII. I am trying to read the first line into one variable and all the other lines into another. However,

1 answer
  • 2020-12-19 06:55

    Because you called .readline() first, the file object returned by codecs.open() has filled an internal line buffer; the subsequent call to .readlines() returns only those buffered lines.

    If you call .readlines() again, the rest of the lines are returned:

    >>> import codecs
    >>> f = codecs.open(filename, 'r', encoding='utf-8')
    >>> line = f.readline()
    >>> len(f.readlines())
    7
    >>> len(f.readlines())
    71
    

    The work-around is to not mix .readline() and .readlines():

    import codecs

    f = codecs.open(filename, 'r', encoding='utf-8')
    data_f = f.readlines()
    names_f = data_f.pop(0).split(' ')  # take the first line.
    

    This behaviour is really a bug; the Python devs are aware of it, see issue 8260.

    The other option is to use io.open() instead of codecs.open(); the io library is what Python 3 uses to implement the built-in open() function and is a lot more robust and versatile than the codecs module.
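    As a rough illustration of the io.open() route, here is a minimal sketch that mixes .readline() and .readlines() on the same file object. The helper name read_header_and_rest and the sample file contents are made up for this example and are not from the question; the code is written to run on both Python 2.7 and Python 3.

```python
import io
import os
import tempfile


def read_header_and_rest(path, encoding='utf-8'):
    """Return (first_line, remaining_lines), both decoded to unicode text.

    Hypothetical helper for illustration only.
    """
    with io.open(path, 'r', encoding=encoding) as f:
        header = f.readline()   # first line only
        rest = f.readlines()    # everything after the first line
    return header, rest


# Build a throwaway sample file: one non-ASCII header line followed by
# 100 plain ASCII lines (made-up data, just to exercise the helper).
sample_lines = [u'na\u00efve caf\u00e9\n'] + [u'row %d\n' % i for i in range(100)]
fd, sample_path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
with io.open(sample_path, 'w', encoding='utf-8') as f:
    f.writelines(sample_lines)

header, rest = read_header_and_rest(sample_path)
os.remove(sample_path)

print(len(rest))  # 100: a single readlines() call returns all remaining lines
```

    With io.open(), one call to .readlines() after .readline() should return every remaining line, so no second call is needed.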
