The Python docs on file.read() state that "An empty string is returned when EOF is encountered immediately."
The documentation further states:
You are not thinking with your snake skin on... Python is not C.
First, a review:
f.read(n) attempts to read n bytes and in no case returns more than n bytes.

If a file read method is at EOF, it returns '' (b'' for a binary-mode file in Python 3). The same EOF test applies to other "file-like" objects such as StringIO, socket.makefile, etc. A return of fewer than n bytes from f.read(n) is most assuredly NOT a dispositive test for EOF! While that code may work 99.99% of the time, the times it does not work will be very frustrating to track down. Plus, it is bad Python form. The only use for n in this case is to put an upper limit on the size of the return.
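A short read that is not EOF is easy to provoke. The following is a minimal sketch, assuming Python 3 and using an unbuffered OS pipe as a stand-in for any stream-like source; the pipe and the byte strings are illustrative and not taken from the question.

```python
import os

# Hypothetical demonstration: an OS pipe plays the role of a stream-like file.
read_fd, write_fd = os.pipe()
reader = os.fdopen(read_fd, 'rb', buffering=0)  # unbuffered: one OS read per call

os.write(write_fd, b'abc')
first = reader.read(10)     # asks for 10 bytes, but only 3 are available
os.write(write_fd, b'de')
second = reader.read(10)    # more data arrived, so the short read was not EOF
os.close(write_fd)          # the writer closes: now the stream really ends
third = reader.read(10)     # an empty result is the EOF signal
reader.close()

print(first, second, third)
```

The first read comes back short even though the stream is still open; only the empty result after the writer closes marks the end.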
What are some of the reasons the Python file-like methods return fewer than n bytes?
Reading exactly n bytes may cause a break between logical multi-byte characters (such as \r\n in text mode and, I think, a multi-byte character in Unicode) or some underlying data structure not known to you.

I would rewrite your code in this manner:
with open(filename, 'rb') as f:
    while True:
        s = f.read(max_size)
        if not s:
            break
        # process the data in s...
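The same loop can also be expressed with the two-argument form of the built-in iter(), which keeps calling a function until it returns a sentinel value. This is only a sketch, assuming Python 3 and a binary-mode file, so the sentinel is b''; the helper name read_chunks and the in-memory io.BytesIO stand-in are illustrative.

```python
import io
from functools import partial

def read_chunks(f, max_size):
    """Return an iterator over chunks of at most max_size bytes, ending at EOF."""
    # iter(callable, sentinel) stops as soon as f.read(max_size) returns b''.
    return iter(partial(f.read, max_size), b'')

f = io.BytesIO(b'abcdefghij')   # stand-in for open(filename, 'rb')
chunks = list(read_chunks(f, 4))
print(chunks)
```

The last chunk is shorter than max_size, and the loop still ends only on the empty read.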
Or, write a generator:
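def blocks(infile, bufsize=1024):
    """Yield successive blocks of at most bufsize bytes until EOF."""
    while True:
        try:
            data = infile.read(bufsize)
        except IOError as e:
            # report the error and stop iterating
            print("I/O error({0}): {1}".format(e.errno, e.strerror))
            break
        if not data:
            # an empty read is the EOF signal
            break
        yield data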
with open('somefile', 'rb') as f:
    for block in blocks(f, 2**16):
        # process a block that COULD be up to 65,536 bytes long
        pass
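If you really do need exactly n bytes, for instance for a fixed-size record, one option is to keep reading until they have all arrived while still treating only an empty read as EOF. A minimal sketch, assuming Python 3 and a blocking, binary-mode file-like object; read_exactly is a hypothetical helper, not a standard library function.

```python
import io

def read_exactly(f, n):
    """Read n bytes from f, returning fewer only if EOF is reached first."""
    parts = []
    remaining = n
    while remaining > 0:
        data = f.read(remaining)   # may legitimately return fewer bytes
        if not data:               # empty result: the only reliable EOF signal
            break
        parts.append(data)
        remaining -= len(data)
    return b''.join(parts)

f = io.BytesIO(b'0123456789')  # stand-in for a real file or stream
record = read_exactly(f, 8)    # a full 8-byte record
tail = read_exactly(f, 8)      # only 2 bytes are left before EOF
```

Here a short result from read_exactly does mean EOF, because the helper has already retried every short read on your behalf.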