I have the data below in a file named atp.csv:
Date_Time,M_ID,N_ID,Status,Desc,AMount,Type
2015-01-05 00:00:00 076,1941321748,BD9010423590206,200,Transaction Successfu
AWK has associative arrays.
% cat atp.csv | awk -F, 'NR>1 {n[$4]+=1;s[$4]+=$6;} END {for (k in n) { print k "," n[k] "," s[k]; }}' | sort
200,3,4500
351,1,5000
In the above:
The first line (record) is skipped with NR>1. The key is field 4 (Status). n[k] is the number of occurrences of key k (so we add 1 for each record), and s[k] is the running sum of the values in field 6 (so we add $6). Finally, after all records are processed (END), you can iterate over the associative arrays by key (for (k in n) { ... }) and print the values in arrays n and s associated with each key. The order of a for (k in n) loop is unspecified, which is why the output is piped through sort.
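
For reference, here is the same one-liner spread over several lines with comments, as a minimal sketch. The rows in the sample variable are made up for illustration (they are not the asker's real data) and only follow the column layout of the header shown in the question. The names sample and result are likewise just for this example.

```shell
# Hypothetical sample rows with the same column layout as atp.csv.
sample='Date_Time,M_ID,N_ID,Status,Desc,AMount,Type
2015-01-05 00:00:00 076,1001,A1,200,OK,1000,X
2015-01-05 00:01:00 112,1002,A2,200,OK,1500,X
2015-01-05 00:02:00 240,1003,A3,351,Failed,5000,Y
2015-01-05 00:03:00 301,1004,A4,200,OK,2000,X'

result=$(printf '%s\n' "$sample" | awk -F, '
  NR > 1 {            # skip the header record
    n[$4] += 1        # count of records per Status (field 4)
    s[$4] += $6       # running sum of AMount (field 6) per Status
  }
  END {
    for (k in n)      # iteration order is unspecified, so sort afterwards
      print k "," n[k] "," s[k]
  }
' | sort)

printf '%s\n' "$result"
```

With these sample rows the script prints one line per distinct Status value, in the form status,count,sum. To run it against the real file, drop the printf and pass atp.csv as the last argument to awk.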