calculating average using awk from multiple files

再見小時候 2020-12-10 15:57

I have 500 files named fort.1, fort.2 ... fort.500. Each file contains 800 rows of data, as below:

1 0.485
2 0.028
3 0.100

4 Answers
  • 2020-12-10 16:03

    awk without any assumption on the 1st column:

    awk '{a[FNR]+=$2;b[FNR]++;}END{for(i=1;i<=FNR;i++)print i,a[i]/b[i];}' fort.*
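    A quick way to convince yourself this works, using two tiny hypothetical files t1/t2 (not from the question) whose row-wise averages are easy to check by hand:

```shell
# Two 3-row sample files (hypothetical names t1/t2, standing in for fort.*)
printf '1 0.4\n2 0.2\n3 0.6\n' > t1
printf '1 0.6\n2 0.4\n3 0.2\n' > t2
# Sum column 2 by row position (FNR), count contributions, then average
awk '{a[FNR]+=$2;b[FNR]++;}END{for(i=1;i<=FNR;i++)print i,a[i]/b[i];}' t1 t2
# prints:
# 1 0.5
# 2 0.3
# 3 0.4
```

    Note that b[i] counts how many files actually contributed a value for row i, so the division is always by the number of values seen, not by a hard-coded file count.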
    
  • 2020-12-10 16:04

    Assuming the first column is an ID:

    cat fort.* | awk '{sum[$1] += $2; counts[$1]++;} END {for (i in sum) print i, sum[i]/counts[i];}' 
    
  • 2020-12-10 16:13

    My understanding: each file is a set of measurements at a particular location. You want to aggregate the measurements across all locations, averaging the values in the same row of each file into a new file.

    Assuming the first column can be treated as an ID for the row (and there are 800 measurements in a file):

    cat fort.* | awk '
    BEGIN { 
        for (i = 1; i <= 800; i++)
            total[i] = 0
    }
    
    { total[$1] += $2 } 
    
    END {
        for (i = 1; i <= 800; i++)
            print i, total[i]/500
    }
    '
    

    First, we initialize an array to store the sum for each row across all files.

    Then, we loop through the concatenated files. We use the first column as a key for the row, and we sum into the array.

    Finally, we loop over the array and print the average value by row across all files.
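    The 800 rows and 500 files above are hard-coded. As a sketch (demonstrated on two hypothetical sample files s1/s2 rather than the real fort.*), awk can derive both counts itself: ARGC-1 is the number of file arguments, and FNR inside END is the row count of the last file read, assuming every file has the same number of rows:

```shell
# Tiny stand-ins for fort.* (hypothetical names s1/s2)
printf '1 1\n2 2\n' > s1
printf '1 3\n2 4\n' > s2
# ARGC-1 = number of file arguments; FNR in END = rows in the last file read
awk '{ total[$1] += $2 }
     END { for (i = 1; i <= FNR; i++) print i, total[i] / (ARGC - 1) }' s1 s2
# prints:
# 1 2
# 2 3
```

    With the real data this would be run as `awk '...' fort.*`, dividing each row's sum by the 500 files instead of hard-coding the constant.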

  • 2020-12-10 16:19

    Here's a quick way using paste and awk:

    paste fort.* | awk '{ s = 0; for (i = 2; i <= NF; i += 2) s += $i; print $1, s / (NF / 2) }' > output.file
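    To see why the loop steps by 2: paste joins the files side by side, so each output row looks like `ID val ID val ...`. The values sit in the even-numbered columns, and there are NF/2 of them, one per file. A tiny demo with two hypothetical files p1/p2:

```shell
# Stand-ins for fort.* (hypothetical names p1/p2)
printf '1 0.2\n2 0.4\n' > p1
printf '1 0.6\n2 0.8\n' > p2
paste p1 p2   # each row: ID val ID val -> values in the even columns
# Sum the even columns per row and divide by the number of files (NF/2)
paste p1 p2 | awk '{ s = 0; for (i = 2; i <= NF; i += 2) s += $i; print $1, s / (NF / 2) }'
# prints:
# 1 0.4
# 2 0.6
```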
    

    Like some of the other answers, here's another way, but this one uses sort to get numerically sorted output:

    awk '{ sum[$1]+=$2; cnt[$1]++ } END { for (i in sum) print i, sum[i]/cnt[i] | "sort -n" }' fort.*
    