Fastest way to print a single line in a file

礼貌的吻别 2021-02-02 12:44

I have to fetch one specific line out of a big file (1,500,000 lines), multiple times, in a loop over multiple files. I was asking myself what the best option would be.

5 answers
  •  天涯浪人
    2021-02-02 13:23

    Drop the useless use of cat and do:

    $ sed -n '1{p;q}' file
    

    This will quit the sed script after the line has been printed.
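
The question asks for an arbitrary line rather than the first one. As a minimal sketch of the same print-then-quit idea for line N, assuming GNU sed as in the command above (print_line is a hypothetical helper name, not part of the original answer):

```shell
#!/bin/bash

# print_line N FILE: print line N of FILE, then stop reading.
# (print_line is a hypothetical helper name used only for illustration.)
print_line() {
    local n=$1 file=$2
    # The address N selects the line; {p;q} prints it and quits at once,
    # so the rest of the file is never read.
    sed -n "${n}{p;q}" "$file"
}

# Example: build a small numbered file and fetch line 3.
tmp=$(mktemp)
seq 1 10 > "$tmp"
print_line 3 "$tmp"    # prints 3
rm -f "$tmp"
```

sed still has to scan lines 1 to N-1 to find line N, so the cost grows with N, but not with the total length of the file.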


    Benchmarking script:

    #!/bin/bash
    
    TIMEFORMAT='%3R'
    n=25
    heading=('head -1 file' 'sed -n 1p file' "sed -n '1{p;q}' file" 'read line < file && echo $line')
    
    # files up to a hundred million lines (if you're on a slow machine, decrease!)
    for (( j=1; j<=100000000; j=j*10 ))
    do
        echo "Lines in file: $j"
        # create file containing j lines
        seq 1 $j > file
        # initial read of file
        cat file > /dev/null
    
        for comm in {0..3}
        do
            avg=0
            echo
            echo ${heading[$comm]}    
            for (( i=1; i<=$n; i++ ))
            do
                case $comm in
                    0)
                        t=$( { time head -1 file > /dev/null; } 2>&1);;
                    1)
                        t=$( { time sed -n 1p file > /dev/null; } 2>&1);;
                    2)
                        t=$( { time sed -n '1{p;q}' file > /dev/null; } 2>&1);;
                    3)
                        t=$( { time { read line < file && echo $line > /dev/null; }; } 2>&1);;
                esac
                avg=$avg+$t
            done
            echo "scale=3;($avg)/$n" | bc
        done
    done
    

    Just save as benchmark.sh and run bash benchmark.sh.

    Results:

    head -1 file
    .001
    
    sed -n 1p file
    .048
    
    sed -n '1{p;q}' file
    .002
    
    read line < file && echo $line
    0
    

    Results are from a file with 1,000,000 lines.

    So the times for sed -n 1p grow linearly with the length of the file, while the timings for the other variations stay constant (and negligible), as they all quit after reading the first line.
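
One way to see the early exit without timing anything is to feed the commands an endless stream. This is only an illustrative sketch, not part of the original benchmark; it assumes the standard yes utility, which prints "y" forever and so stands in for an arbitrarily long file:

```shell
#!/bin/bash

# `yes` never stops on its own, so each command below returns only
# because it stops reading after the first line.
yes | sed -n '1{p;q}'    # prints y
yes | head -n 1          # prints y

# By contrast, `yes | sed -n 1p` would print y and then keep reading
# forever, which is the same behaviour that makes it slow on big files.
```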

    Note: timings are different from original post due to being on a faster Linux box.
