Keep Track of Number of Bytes Read

Question

I would like to implement a command-line progress bar for one of my Python programs, which reads text from a file line by line.

I can implement the progress scale in one of two ways:

(number of lines / total lines) or
(number of bytes completed / bytes total)

I don't care which, but "number of lines" would seem to require me to loop through the entire document (which could be VERY large) just to get the value for "total lines".

This seems extremely inefficient. I was thinking outside the box and thought perhaps if I took the size of the file (easier to get?) and kept track of the number of bytes that have been read, it might make for a good progress bar metric.

I can use os.path.getsize(file) or os.stat(file).st_size to retrieve the size of the file, but I have not yet found a way to keep track of the number of bytes read by readline(). The files I am working with should be encoded in ASCII, or maybe even Unicode, so should I determine the encoding used and then record the number of characters read, or call sys.getsizeof() or some len() function on each line read?

I am sure there will be problems here. Any suggestions?

(P.S. - I don't think manually inputting the number of bytes to read at a time will work, because I need to work with each line individually; or else I will need to split it up afterwards by "\n"'s.)

Answer 1:

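# Open fh in binary mode, e.g. open(path, 'rb'), so that len(line) counts bytes.
# In text mode, Python 3 returns str and len(line) counts characters instead.
bytesread = 0
while True:
  line = fh.readline()
  if not line:  # empty result means end of file (works for '' and b'')
    break
  bytesread += len(line)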

Or, a little shorter:

bytesread = 0
for line in fh:  # fh opened in binary mode ('rb'), so each line is bytes
  bytesread += len(line)
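
If the file has to be opened in text mode, one possible workaround is to re-encode each line and count the resulting bytes. This is only a sketch: the function name count_bytes_text_mode is made up for illustration, and it assumes an encoding such as ASCII or UTF-8 where encoding a line on its own gives the same bytes as in the file (this would not hold for UTF-16, where every encode() call prepends a byte order mark).

```python
def count_bytes_text_mode(path, encoding='utf-8'):
    """Return the number of bytes consumed while reading path line by line.

    Hypothetical helper; the default encoding is an assumption.
    """
    bytesread = 0
    # newline='' turns off newline translation, so a '\r\n' line ending
    # is kept as two characters and still counts as two bytes.
    with open(path, 'r', encoding=encoding, newline='') as fh:
        for line in fh:
            # len(line) would count characters; encode to count bytes.
            bytesread += len(line.encode(encoding))
    return bytesread
```

Re-encoding every line costs extra work, so reading in binary mode and decoding only when needed is probably the simpler choice.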

Using os.path.getsize() (or os.stat) is an efficient way of determining the file size.
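
Putting the two pieces together, a byte-based progress bar might look roughly like the sketch below. It assumes Python 3 and a UTF-8 (or ASCII) file; the name read_with_progress, the 40-character bar width and writing to stderr are illustrative choices, not part of the answer above.

```python
import os
import sys


def read_with_progress(path, out=sys.stderr, width=40):
    """Yield decoded lines from path while drawing a byte-based progress bar.

    Hypothetical helper: the name, bar width and UTF-8 decoding are assumptions.
    """
    total = os.path.getsize(path)  # file size in bytes, no need to read the file
    bytesread = 0
    with open(path, 'rb') as fh:  # binary mode: len(line) is a byte count
        for line in fh:
            bytesread += len(line)
            # total is non-zero here, because at least one line was read
            fraction = bytesread / total
            filled = int(width * fraction)
            bar = '#' * filled + '-' * (width - filled)
            out.write('\r[%s] %3d%%' % (bar, int(fraction * 100)))
            out.flush()
            yield line.decode('utf-8')  # hand each line back as text
    out.write('\n')
```

Used as "for line in read_with_progress(filename): ...", each line is still processed individually, and the bar reaches 100% when the last line has been read.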

Source: https://stackoverflow.com/questions/14423817/keep-track-of-number-of-bytes-read

Tags

python

string

byte