awk: find minimum and maximum in column

慢半拍i 2020-12-09 17:00

I'm using awk to deal with a simple .dat file, which contains several lines of data and each line has 4 columns separated by a single space. I want to fin

4 Answers
  • 2020-12-09 17:42

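    Late, but here is a shorter command that seeds Min and Max from the first line instead of assuming initial values. It prints with %s rather than %d so that non-integer values are not truncated:

      awk 'NR==1 {Min=Max=$1} NR>=2 {if ($1<Min) Min=$1; if ($1>Max) Max=$1} END {printf "The Min is %s, Max is %s\n", Min, Max}' FileName.dat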
    
  • 2020-12-09 17:51

    Your problem was simply that in your script you had:

    if ($1<a) a=$1 fi
    

    That final fi is not part of awk syntax, so awk treats it as an (unset) variable. This makes a=$1 fi a string concatenation of $1 and the empty value of fi, which tells awk that a contains a string rather than a number. As a result, $1<a is evaluated as a string comparison instead of a numeric one.
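
    A minimal sketch of that effect, assuming a POSIX awk. The seed value 1000 and the two input lines (4 and 10) are illustrative assumptions, not taken from the original script:

```shell
# With the stray "fi": after the first line, a holds the STRING "4",
# so the second line is compared as text ("10" < "4" is true) and a ends up as 10.
with_fi=$(printf '4\n10\n' | awk 'BEGIN { a = 1000 } { if ($1 < a) a = $1 fi } END { print a }')

# Without it: a stays numeric-looking, 10 < 4 is false, and a ends up as 4.
without_fi=$(printf '4\n10\n' | awk 'BEGIN { a = 1000 } { if ($1 < a) a = $1 } END { print a }')

echo "with fi:    $with_fi"
echo "without fi: $without_fi"
```

    The only difference between the two commands is the trailing fi, yet the first reports 10 as the minimum.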

    More importantly, as a general rule, never start with a guessed value for max/min; use the first value read as the seed. Here's the correct way to write the script:

    $ cat tst.awk
    BEGIN { min = max = "NaN" }
    {
        min = (NR==1 || $1<min ? $1 : min)
        max = (NR==1 || $1>max ? $1 : max)
    }
    END { print min, max }
    
    $ awk -f tst.awk file
    4 12
    
    $ awk -f tst.awk /dev/null
    NaN NaN
    
    $ a=( $( awk -f tst.awk file ) )
    $ echo "${a[0]}"
    4
    $ echo "${a[1]}"
    12
    

    If you don't like NaN pick whatever you'd prefer to print when the input file is empty.

  • 2020-12-09 17:52

    Awk guesses the type.

    The string "10" is less than the string "4" because the character "1" sorts before "4". Force a conversion to a number by adding zero:

    min=`awk 'BEGIN{a=1000}{if ($1<0+a) a=$1} END{print a}' mydata.dat`
    max=`awk 'BEGIN{a=   0}{if ($1>0+a) a=$1} END{print a}' mydata.dat`
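
    The same add-zero trick can also be applied in a single pass that needs no guessed starting values. This is only a sketch: the three sample lines and the variable name v are assumptions made for illustration:

```shell
# Hypothetical sample input: four space-separated columns per line.
result=$(printf '10 a b c\n4 d e f\n12 g h i\n' | awk '
    { v = $1 + 0 }             # adding zero forces a numeric value
    NR == 1 { min = max = v }  # seed both from the first line
    v < min { min = v }
    v > max { max = v }
    END { if (NR) print min, max }
')

echo "$result"
```

    Because v is always a number, every comparison here is numeric regardless of how awk would have classified $1. On empty input nothing is printed.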
    
  • 2020-12-09 18:01

    A non-awk answer:

    cut -d" " -f1 file |
    sort -n |
    tee >(echo "min=$(head -1)") \
      > >(echo "max=$(tail -1)")
    

    That tee command is perhaps a bit too clever. tee duplicates its stdin stream to the files named as arguments, and it also streams the same data to stdout. I'm using process substitutions to filter the streams.

    The same effect can be achieved (with less flourish) by extracting the first and last lines of the sorted stream:

    cut -d" " -f1 file | sort -n | sed -n '1s/^/min=/p; $s/^/max=/p'
    

    or

    cut -d" " -f1 file | sort -n | { 
        read -r line
        echo "min=$line"
        max=$line    # covers input that has only one line
        while read -r line; do max=$line; done
        echo "max=$max"
    }
    