I have a number of very large text files which I need to process, the largest being about 60GB.
Each line has 54 characters in seven fields and I want to remove the
Since you don't seem to be limited by CPU but rather by I/O, have you tried varying the third parameter of open?
Indeed, this third parameter specifies the buffer size to be used for file operations.
Simply writing open("filepath", "r", 16777216) will use 16 MB buffers when reading from the file, which should help.
Use the same buffer size for the output file, and measure/compare against an otherwise identical run.
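For instance, a minimal sketch (the file names and the line-by-line processing loop are placeholders for your own code):

```python
BUF_SIZE = 16 * 1024 * 1024  # 16 MB buffer, the same value as above

# Hypothetical input/output paths; the same buffer size is passed to both files.
with open("input.txt", "r", BUF_SIZE) as src, open("output.txt", "w", BUF_SIZE) as dst:
    for line in src:
        # ... transform `line` here ...
        dst.write(line)
```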
Note: This is the same kind of optimization suggested by others, but here you get it for free, without changing your code and without having to do the buffering yourself.