I'm processing huge data files (millions of lines each).
Before I start processing I'd like to get a count of the number of lines in the file, so I can then indic
wc -l in Ruby with less memory, the lazy way:
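# With no arguments, count lines on STDIN; otherwise open each named file lazily.
(ARGV.empty? ?
  [["", STDIN]] :
  ARGV.lazy.map { |file_name|
    [file_name, File.open(file_name)]
  })
  .map { |file_name, file|
    # Stream the input line by line, so only one line is in memory at a time.
    # The initial value 0 keeps empty input from producing nil.
    count = file.each_line.lazy.map { |_line| 1 }.reduce(0, :+)
    "%8d %s\n" % [count, file_name]
  }
  .each(&:display)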
as originally shown by Shugo Maeda.
Example:
$ curl -s -o wc.rb -L https://git.io/vVrQi
$ chmod u+x wc.rb
$ ./wc.rb huge_data_file.csv
43217291 huge_data_file.csv
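
For comparison, here is a minimal sketch of the same constant-memory idea written as two reusable methods. The names `count_lines` and `count_newlines` and the 1 MiB default chunk size are my own assumptions, not part of the script above. The first streams lines with `File.foreach`; the second reads fixed-size binary chunks and counts newline bytes, which is closer to what `wc -l` reports.

```ruby
# Hypothetical helper: counts lines by streaming them one at a time.
# A final line without a trailing newline still counts as a line.
def count_lines(path)
  File.foreach(path).count
end

# Hypothetical helper: counts "\n" bytes in fixed-size chunks, like wc -l.
# chunk_size is an assumed default (1 MiB); tune it for your storage.
def count_newlines(path, chunk_size = 1 << 20)
  count = 0
  File.open(path, "rb") do |file|
    # IO#read(n) returns nil at end of file, which ends the loop.
    while (chunk = file.read(chunk_size))
      count += chunk.count("\n")
    end
  end
  count
end
```

The two can differ by one when the file does not end with a newline: `count_lines` counts the trailing partial line, `count_newlines` does not. Both block forms close the file when done, which the one-liner above leaves to the garbage collector.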