Python file.tell gives wrong value location

南楼画角 submitted on 2019-11-30 15:32:34

The cause is (rather obscurely) explained in the Python 2 docs for a file object's next() method:

When a file is used as an iterator, typically in a for loop (for example, for line in f: print line), the next() method is called repeatedly. This method returns the next input line, or raises StopIteration when EOF is hit. In order to make a for loop the most efficient way of looping over the lines of a file (a very common operation), the next() method uses a hidden read-ahead buffer. As a consequence of using a read-ahead buffer, combining next() with other file methods (like readline()) does not work right. However, using seek() to reposition the file to an absolute position will flush the read-ahead buffer.

The values returned by tell() reflect how far this hidden read-ahead buffer has gotten, which will typically be up to a few thousand bytes beyond the characters your program has actually retrieved.

There's no portable way around this. If you need to mix tell() with reading lines, then use the file's readline() method instead. The tradeoff is that, in return for getting usable tell() results, iterating over a large file with readline() is typically significantly slower than using for line in file_object:.
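As a rough illustration of the readline() approach, here is a minimal sketch written for Python 3, assuming the file is opened in binary mode so that offsets are plain byte counts. The temporary file and its contents are made up for the demo and are not part of the original question.

```python
import os
import tempfile

# Hypothetical sample data, written to a temporary file for the demo.
data = b"alpha\nbeta\ngamma\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    path = tmp.name

positions = []
with open(path, "rb") as fh:
    # iter(fh.readline, b"") calls readline() until it returns b"" at EOF,
    # so the file object's own iterator (next()) is never used.
    for line in iter(fh.readline, b""):
        positions.append(fh.tell())  # offset just past the line that was read

os.remove(path)
print(positions)  # [6, 11, 17]
```

Each recorded value is the offset of the byte following the line just read, which is what tell() is expected to report when only readline() is used.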

Code

Concretely, change the loop to this:

line = self.fh.readline()
while line:
    if p.search(line):
        self.porSnipStartFPtr = self.fh.tell()
        sys.stdout.write("found regPorSnip")
    line = self.fh.readline()

I'm not sure that's what you really want, though: tell() is capturing the position of the start of the next line. If you want the position of the start of the matching line, then you need to change the logic, like so:

pos = self.fh.tell()
line = self.fh.readline()
while line:
    if p.search(line):
        self.porSnipStartFPtr = pos
        sys.stdout.write("found regPorSnip")
    pos = self.fh.tell()
    line = self.fh.readline()

or do it with a "loop and a half":

while True:
    pos = self.fh.tell()
    line = self.fh.readline()
    if not line:
        break
    if p.search(line):
        self.porSnipStartFPtr = pos
        sys.stdout.write("found regPorSnip")
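The "loop and a half" above can be turned into a self-contained sketch. This version assumes Python 3 and binary mode; the function name find_line_starts, the sample file, and its contents are hypothetical and only serve to show that the recorded offsets can be passed back to seek().

```python
import os
import re
import tempfile


def find_line_starts(path, pattern):
    """Return the byte offset of the start of every line matching pattern."""
    starts = []
    with open(path, "rb") as fh:
        while True:
            pos = fh.tell()        # position before reading the line
            line = fh.readline()
            if not line:           # b"" signals end of file
                break
            if pattern.search(line):
                starts.append(pos)
    return starts


# Hypothetical sample file, created only for this demonstration.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"header\nregPorSnip one\nfiller\nregPorSnip two\n")
    path = tmp.name

starts = find_line_starts(path, re.compile(rb"regPorSnip"))

# Seeking back to a recorded offset and reading a line returns the match.
with open(path, "rb") as fh:
    fh.seek(starts[1])
    reread = fh.readline()

os.remove(path)
print(starts, reread)  # [7, 29] b'regPorSnip two\n'
```

Recording pos before each readline() is what makes the stored offset point at the start of the matching line instead of the start of the line after it.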

I guess I don't understand the issue:

>>> fh = open('test.txt')
>>> fh.tell()
0L
>>> fh.read(1)
'"'
>>> fh.tell()
1L
>>> fh.read(5)
'a" \n"'
>>> fh.tell()
7L