How to group by column values into row and column headers and then sum the values

Anonymous (unverified), submitted on 2019-12-03 02:42:02

Question:

Below are my input and output .txt files.

I want to group the data by StatusDate and Method, and then sum the values for each StatusDate and Method combination.

Input.txt

No,Date,MethodStatus,Key,StatusDate,Hit,CallType,Method,LastMethodType
112,12/15/16,Suceess,Geo,12/15/16,1,Static,GET,12/15/16
113,12/18/16,Suceess,Geo,12/18/16,1,Static,GET,12/18/16
114,12/19/16,AUTHORIZED,Geo,12/19/16,1,Static,GET,12/19/16
115,12/19/16,AUTHORIZED,Geo,12/19/16,1,Static,GET,12/19/16
116,12/19/16,Suceess,Geo,12/19/16,1,Static,PUT,12/19/16
117,12/19/16,Suceess,Geo,12/19/16,1,Static,PUT,12/19/16
118,12/19/16,Waiting,Geo,12/19/16,1,Static,GET,12/19/16
119,12/19/16,AUTHORIZED,Geo,12/19/16,1,Static,GET,12/19/16
120,12/17/16,Suceess,Geo,12/17/16,1,Static,GET,12/17/16
121,12/17/16,Suceess,Geo,12/17/16,1,Static,GET,12/17/16
130,12/16/16,Suceess,Geo,12/16/16,1,Static,GET,12/16/16

Out.txt

StatusDate,12/15/16,12/16/16,12/17/16,12/17/16,12/18/16,12/19/16,12/19/16,12/19/16,12/19/16,12/19/16,12/19/16,Grand Total
GET,1,1,1,1,1,1,1,1,1,,,9
PUT,,,,,,,,,,1,1,2
Grand Total,1,1,1,1,1,1,1,1,1,1,1,11

I'm currently using awk to split the data first, with awk -F, '{if($8=="GET") print }', and then calculating the sums in a separate step. Because the file is huge, this is slow.

Is it possible to do everything in one step, so that the file is read only once?

Answer 1:

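awk -F ',' '
    NR == 1     { next }          # skip the header row
    $8 == "GET" { Get[$2]++ }     # count GET rows per date
    $8 == "PUT" { Put[$2]++ }     # count PUT rows per date
                { Total[$2]++ }   # count every row per date
    END {
      # note: the order of "for (d in Total)" is unspecified in awk
      printf("StatusDate")
      for (d in Total) printf(",%s", d)
      printf("\nGET")
      for (d in Total) printf(",%d", Get[d])
      printf("\nPUT")
      for (d in Total) printf(",%d", Put[d])
      printf("\nGrand Total")
      for (d in Total) printf(",%d", Total[d])
      printf("\n")
    }
' Input.txt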
  • store the counts in 3 arrays as each line is read
  • use several loops to display the result at the end
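
A possible variation on the approach above, offered as a minimal sketch rather than a definitive solution. Two things are assumed here that the answer does not state: the grouping key is taken from column 5 (StatusDate, as the question asks) instead of column 2, and the Hit column (6) is summed instead of counting rows. Because awk does not define the order of a for-in loop, the sketch also records dates in the order they first appear. The function name pivot_by_status_date is hypothetical.

```shell
#!/bin/sh
# pivot_by_status_date: hypothetical helper name, not from the original post.
# Reads the CSV on stdin and prints one row per method, one column per date.
pivot_by_status_date() {
  awk -F, '
    NR == 1 { next }                          # skip the header row
    {
      date = $5; method = $8                  # StatusDate and Method columns
      if (!(date in seen)) {                  # remember dates in first-seen order
        seen[date] = 1
        order[++n] = date
      }
      if (method == "GET")      get[date] += $6   # sum the Hit column (assumption)
      else if (method == "PUT") put[date] += $6
      total[date] += $6
    }
    END {
      printf "StatusDate"
      for (i = 1; i <= n; i++) printf ",%s", order[i]
      printf "\nGET"
      for (i = 1; i <= n; i++) printf ",%d", get[order[i]]
      printf "\nPUT"
      for (i = 1; i <= n; i++) printf ",%d", put[order[i]]
      printf "\nGrand Total"
      for (i = 1; i <= n; i++) printf ",%d", total[order[i]]
      printf "\n"
    }
  '
}

# Usage: pivot_by_status_date < Input.txt
```

The file is still read only once; the extra order array only costs one entry per distinct date.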


Answer 2:

You can use associative arrays to keep track of the aggregations.

BEGIN {
    FS = ","        # the input is comma-separated
}

NR == 1 {
    next            # skip the header row
}

$8 == "GET" {
    Get[$2]++
    g++             # running total of GET rows
    Total[$2]++
    next
}

$8 == "PUT" {
    Put[$2]++
    p++             # running total of PUT rows
    Total[$2]++
    next
}

END {
    printf( "StatusDate")
    for (d in Total) printf( ",%s", d)
    printf( ",Grand Total\nGET")
    for (d in Total) printf( ",%d", Get[d])
    printf( ",%d\nPUT", g)
    for (d in Total) printf( ",%d", Put[d])
    printf( ",%d\nGrand Total", p)
    for (d in Total) printf( ",%d", Total[d])
    printf( ",%d\n", g+p)
}


Answer 3:

$ awk -F, 'NR==1 {printf "%s",$8$2; t="Total"; v[t]; next}
                 {d[$2]; v[$8]; a[$2,$8]++; a[$2,t]++}
           END   {n=asorti(d,ds);
                  for(i=1;i<=n;i++) printf "%s", FS ds[i];
                  print "";
                  for(k in v)
                    {printf "%s", k;
                     for(i=1;i<=n;i++) printf "%s", FS a[ds[i],k];
                     print ""}}' file

MethodDate,12/15/16,12/16/16,12/17/16,12/18/16,12/19/16
PUT,,,,,2
GET,1,1,2,1,4
Total,1,1,2,1,6
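
Note that asorti is a GNU awk extension, and by default it sorts the indices as strings, so MM/DD/YY dates stop being in chronological order once the data spans more than one year. As a rough sketch of a portable alternative (not the answerer's method), the version below builds a YYMMDD key for each date and orders the keys with a small insertion sort. It assumes that StatusDate (column 5) is always zero-padded MM/DD/YY and that the Hit column (6) is what should be summed; the function name pivot_sorted is hypothetical.

```shell
#!/bin/sh
# pivot_sorted: hypothetical helper name, not from the original post.
# Reads the CSV on stdin; works without the gawk-only asorti function.
pivot_sorted() {
  awk -F, '
    NR == 1 { next }                          # skip the header row
    {
      split($5, p, "/")                       # StatusDate, assumed MM/DD/YY
      key = p[3] p[1] p[2]                    # YYMMDD orders chronologically
      if (!(key in label)) { label[key] = $5; keys[++n] = key }
      if (!($8 in seenm))  { seenm[$8] = 1; methods[++m] = $8 }
      cnt[$8, key] += $6                      # sum Hit per method and date
      col[key] += $6                          # per-date column total
    }
    END {
      # insertion sort of the date keys, ascending
      for (i = 2; i <= n; i++) {
        k = keys[i]
        for (j = i - 1; j >= 1 && keys[j] > k; j--) keys[j + 1] = keys[j]
        keys[j + 1] = k
      }
      printf "StatusDate"
      for (i = 1; i <= n; i++) printf ",%s", label[keys[i]]
      print ",Grand Total"
      for (r = 1; r <= m; r++) {              # one row per method, first-seen order
        meth = methods[r]; row = 0
        printf "%s", meth
        for (i = 1; i <= n; i++) {
          v = cnt[meth, keys[i]]
          printf ",%s", v                     # empty cell when nothing matched
          row += v
        }
        printf ",%d\n", row
        grand += row
      }
      printf "Grand Total"
      for (i = 1; i <= n; i++) printf ",%d", col[keys[i]]
      printf ",%d\n", grand
    }
  '
}

# Usage: pivot_sorted < Input.txt
```

The methods are not hard-coded, so any value that appears in column 8 gets its own row. The insertion sort is quadratic in the number of distinct dates, which is usually small compared with the number of input lines.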

