Count the number of lines in a file without reading the entire file into memory?

忘掉有多难 2020-12-24 01:38

I'm processing huge data files (millions of lines each).

Before I start processing I'd like to get a count of the number of lines in the file, so I can then indic

15 answers
  •  [愿得一人]
    2020-12-24 02:16

    Summary of the posted solutions

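    require 'benchmark'
    require 'csv'

    filename = "name.csv"

    # Each report times one approach; the results below are in the same order.
    Benchmark.bm do |x|
      # Shell out to wc, which counts newline characters outside of Ruby.
      x.report { `wc -l < #{filename}`.to_i }
      # Iterate over the open file line by line.
      x.report { File.open(filename).inject(0) { |c, line| c + 1 } }
      # Stream lines with File.foreach.
      x.report { File.foreach(filename).inject(0) {|c, line| c+1} }
      # Read the whole file into one string, then count newlines.
      x.report { File.read(filename).scan(/\n/).count }
      # Parse every row as CSV and load all rows before counting.
      x.report { CSV.open(filename, "r").readlines.count }
    end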
    

    File with 807802 lines (rows appear in the same order as the reports above; times in seconds):

           user     system      total        real
       0.000000   0.000000   0.010000 (  0.030606)
       0.370000   0.050000   0.420000 (  0.412472)
       0.360000   0.010000   0.370000 (  0.374642)
       0.290000   0.020000   0.310000 (  0.315488)
       3.190000   0.060000   3.250000 (  3.245171)
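
    If shelling out to `wc` is not an option, reading the file in fixed-size chunks is one way to stay in pure Ruby while keeping memory use bounded. This is only a sketch and was not part of the benchmark above; `count_lines_chunked` and its `chunk_size` parameter are hypothetical names chosen for illustration. It counts newline characters, as `wc -l` does, so a final line without a trailing newline is not counted.

```ruby
# Hypothetical helper (not one of the benchmarked solutions).
# Counts newline characters by reading the file in fixed-size chunks,
# so only one chunk is held in memory at a time.
def count_lines_chunked(path, chunk_size = 64 * 1024)
  count = 0
  File.open(path, "rb") do |file|
    # IO#read with a positive length returns nil at end of file.
    while (chunk = file.read(chunk_size))
      count += chunk.count("\n")
    end
  end
  count
end
```

    Calling `count_lines_chunked("name.csv")` returns the count as an Integer. The 64 KiB default chunk size is an arbitrary assumption; it would be worth timing a few sizes on your own data.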
    
