I have 500 files named fort.1, fort.2, ..., fort.500. Each file contains 800 lines of data like this:
1 0.485
2 0.028
3 0.100

How can I average the second column, row by row, across all 500 files, writing the result to a new file?
awk, without making any assumption about the 1st column:
awk '{a[FNR]+=$2;b[FNR]++;}END{for(i=1;i<=FNR;i++)print i,a[i]/b[i];}' fort.*
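The same logic expanded with comments, if the one-liner is hard to read (FNR is the line number within the current file, so it restarts at 1 for each fort.* file):

awk '
    {
        a[FNR] += $2    # accumulate the 2nd column, keyed by line number within the file
        b[FNR]++        # count how many files contributed a value at this line number
    }
    END {
        for (i = 1; i <= FNR; i++)
            print i, a[i] / b[i]
    }
' fort.*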
Assuming the first column is an ID:
cat fort.* | awk '{sum[$1] += $2; counts[$1]++;} END {for (i in sum) print i, sum[i]/counts[i];}'
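The cat isn't strictly needed; awk can read the files directly:

awk '{sum[$1] += $2; counts[$1]++} END {for (i in sum) print i, sum[i]/counts[i]}' fort.*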
My understanding: each file is a set of measurements at a particular location. You want to aggregate the measurements across all locations, averaging the values on the same row of each file into a new file.
Assuming the first column can be treated as a row ID, each file contains 800 measurements, and there are 500 files:
cat fort.* | awk '
BEGIN {
    for (i = 1; i <= 800; i++)
        total[i] = 0
}
{ total[$1] += $2 }
END {
    for (i = 1; i <= 800; i++)
        print i, total[i]/500
}
'
First, we initialize an array to store the sum for each row across all files.
Then, we loop through the concatenated files, using the first column as the key for the row and summing the second column into the array.
Finally, we loop over the array and print the average value for each row across all files.
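If you'd rather not hardcode 800 and 500, here's a sketch of the same idea that picks both numbers up at run time (assuming every file has the same number of rows and only file names follow the script on the command line, so ARGC - 1 is the number of files and FNR at the END is the row count of the last file):

awk '
    { total[$1] += $2 }              # sum the 2nd column per row ID
    END {
        nfiles = ARGC - 1            # number of fort.* files given to awk
        for (i = 1; i <= FNR; i++)   # FNR is now the line count of the last file
            print i, total[i] / nfiles
    }
' fort.*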
Here's a quick way using paste and awk:
paste fort.* | awk '{ for (i = 2; i <= NF; i += 2) array[$1] += $i; print $1, array[$1]/(NF/2) }' > output.file
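To see why the loop steps by 2: paste joins line n of every file into one long record, so with just two files (and made-up values for fort.2) the pasted lines look like

1 0.485	1 0.492
2 0.028	2 0.031

The data values land in fields 2, 4, 6, ..., while $1 keeps the row ID from the first file, and NF/2 is the number of files.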
Like some of the other answers, here's another way, but this one pipes to sort to get numerically sorted output:
awk '{ sum[$1]+=$2; cnt[$1]++ } END { for (i in sum) print i, sum[i]/cnt[i] | "sort -n" }' fort.*
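If you're using GNU awk, a gawk-only alternative to the external sort is to set the array traversal order before the loop (a sketch):

gawk '{ sum[$1] += $2; cnt[$1]++ }
      END {
          PROCINFO["sorted_in"] = "@ind_num_asc"   # iterate indices in ascending numeric order
          for (i in sum) print i, sum[i]/cnt[i]
      }' fort.*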