How to get the biggest number in a file?

盖世英雄少女心 2020-12-04 00:41

I want to get the maximum number in a file, where numbers are integers that can occur in any place of the file.

I thought about doing the following:

         


        
4 Answers
  •  北海茫月
    2020-12-04 01:22

    I suspect this will be fastest:

    $ tr ' ' '\n' < file | sort -rn | head -1
    42342234
    

    Timing from the third run:

    $ time tr ' ' '\n' < file | sort -rn | head -1
    42342234
    real    0m0.078s
    user    0m0.000s
    sys     0m0.076s
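
For readers who want to see what that pipeline does stage by stage, here is a minimal sketch on a tiny sample. The function name `max_token` and the temporary sample file are assumptions for illustration and are not part of the answer. `tr` puts each space-separated token on its own line, `sort -rn` orders the lines numerically in descending order, and `head -n 1` (the POSIX spelling of `head -1`) keeps the first one.

```shell
# max_token FILE: largest space-separated number in FILE (hypothetical helper name).
max_token() {
    # tr:   one token per line
    # sort: numeric, descending; non-numeric tokens compare as 0
    # head: keep only the top line
    tr ' ' '\n' < "$1" | sort -rn | head -n 1
}

# Tiny sample input, created only for this illustration.
sample=$(mktemp)
printf '%s\n' 'hello 123 how are you 42342234 and bla' \
              'and 3624 is another number' > "$sample"
max_token "$sample"    # prints 42342234
rm -f "$sample"
```

Because `sort -n` treats non-numeric tokens as 0, this only gives a meaningful answer when the file contains at least one positive integer.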
    

    Btw, DON'T WRITE SHELL LOOPS to manipulate text, even when you're only creating sample input files. Here awk builds the file in a single process:

    $ time awk -v s="$(cat a)" 'BEGIN{for (i=1;i<=50000;i++) print s}' > myfile
    
    real    0m0.109s
    user    0m0.031s
    sys     0m0.061s
    
    $ wc -l myfile
    150000 myfile
    

    Compare that to the shell loop suggested in the question:

    $ time for i in {1..50000}; do cat a >> myfile2 ; done
    
    real    26m38.771s
    user    1m44.765s
    sys     17m9.837s
    
    $ wc -l myfile2
    150000 myfile2
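
The gap comes from the loop starting a new `cat` process and reopening the output file on every iteration, while awk starts once and writes everything itself. As a sketch of the same idea in reusable form, the helper below repeats a file a given number of times. The name `repeat_file` and its argument order are assumptions for illustration. It lets awk read the file directly instead of passing the contents through `-v`, since `-v` assignments have backslash escape sequences interpreted, which could alter input that contains backslashes.

```shell
# repeat_file SRC COUNT: write COUNT copies of SRC to stdout (hypothetical helper).
repeat_file() {
    awk -v n="$2" '
        { lines[NR] = $0 }              # buffer every input line once
        END {
            for (i = 1; i <= n; i++)    # emit the buffered lines n times
                for (j = 1; j <= NR; j++)
                    print lines[j]
        }
    ' "$1"
}

# Usage, assuming the sample file is named a as in the answer:
#   repeat_file a 50000 > myfile
```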
    

    To handle input more robustly when digits also appear in strings that are not integers (decimals, ranges, prices, times, dates), we need something like this:

    $ cat b
    hello 123 how are you i am fine 42342234 and blab bla bla
    and 3624 is another number
    but this is not enough for -23 234245
    73 starts a line
    avoid these: 3.14 or 4-5 or $15 or 2:30 or 05/12/2015
    
    $ grep -o -E '(^| )[-]?[0-9]+( |$)' b | sort -rn
     42342234
     3624
     123
    73
     -23
    
    $ time awk -v s="$(cat b)" 'BEGIN{for (i=1;i<=50000;i++) print s}' > myfileB
    real    0m0.109s
    user    0m0.000s
    sys     0m0.076s
    
    $ wc -l myfileB
    250000 myfileB
    
    $ time grep -o -E '(^| )-?[0-9]+( |$)' myfileB | sort -rn | head -1 | tr -d ' '
    42342234
    real    0m2.480s
    user    0m2.509s
    sys     0m0.108s
    

    Note that this input file has more lines than the original, and with this input the robust grep solution above (2.480s) is actually faster than the one I posted at the start of this answer (4.836s):

    $ time tr ' ' '\n' < myfileB | sort -rn | head -1
    42342234
    real    0m4.836s
    user    0m4.445s
    sys     0m0.277s
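
One limitation of the grep pattern is worth a sketch. Each match consumes the space that follows it, so when two integers sit next to each other the second one is skipped, which is why `234245` is absent from the `grep -o` listing for file `b` above. The awk variant below is an alternative technique, not the answer's method, and has not been timed against it. The name `max_int` is hypothetical. It splits each line on whitespace, keeps only fields that are entirely an optionally signed integer, and tracks the running maximum without sorting.

```shell
# max_int [FILE...]: largest whitespace-delimited integer in the input
# (hypothetical helper; reads stdin when no file is given).
max_int() {
    awk '
        {
            for (i = 1; i <= NF; i++)
                # accept only fields that are entirely an optionally signed integer
                if ($i ~ /^-?[0-9]+$/ && (!seen || $i + 0 > max)) {
                    max  = $i + 0    # numeric value used for comparisons
                    best = $i        # original text, printed unchanged
                    seen = 1
                }
        }
        END { if (seen) print best }
    ' "$@"
}

printf '%s\n' 'not enough for -23 234245' 'avoid 3.14 or 4-5' | max_int   # prints 234245
```

Comparisons use awk's floating-point numbers, so very large integers (beyond 2^53) may not compare exactly.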
    
