Count the number of lines in a file without reading the entire file into memory?

忘掉有多难 2020-12-24 01:38

I'm processing huge data files (millions of lines each).

Before I start processing I'd like to get a count of the number of lines in the file, so I can then indic

15 answers
  •  被撕碎了的回忆
    2020-12-24 02:10

    If you are in a Unix environment, you can just let wc -l do the work.

    It does not load the whole file into memory. Because it is optimized for streaming a file and counting its words and lines, it generally performs well compared with streaming the file yourself in Ruby.

    SSCCE:

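    require 'shellwords'

    filename = 'a_file/somewhere.txt'
    # wc -l prints "<count> <filename>"; escape the name so spaces or quotes are safe
    line_count = `wc -l #{Shellwords.escape(filename)}`.split.first.to_i
    p line_count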
    

    Or, if you want the total for a collection of files passed on the command line:

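    require 'shellwords'
    # The last line of wc output is "<count> total" (or "<count> <file>" for a single file).
    wc_output = `wc -l #{Shellwords.join(ARGV)}`
    line_count = wc_output.lines.last.to_s.split.first.to_i
    p line_count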
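
Where wc is not available (outside a Unix environment, for example), the same idea can be sketched in plain Ruby by reading the file in fixed-size chunks and counting newline characters. This is a minimal sketch rather than part of the answer above: the method name `count_lines` and the 1 MiB default `chunk_size` are assumptions chosen for illustration. Like wc -l, it counts newline characters, so a final line without a trailing newline is not counted.

```ruby
# Count newline characters by reading the file in fixed-size chunks,
# so memory use stays bounded regardless of file size.
# count_lines and chunk_size are hypothetical names used for illustration.
def count_lines(path, chunk_size = 1 << 20)
  count = 0
  File.open(path, 'rb') do |file|
    buffer = String.new
    # IO#read(length, outbuf) refills the same buffer and returns nil at EOF.
    while file.read(chunk_size, buffer)
      count += buffer.count("\n")
    end
  end
  count
end
```

Calling `count_lines('a_file/somewhere.txt')` would return the count as an Integer. Reusing one buffer avoids allocating a new string per chunk, which may matter on files with millions of lines.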
    
