Using awk, ignore case when summarizing lines based on the same pattern

最后都变了 - submitted on 2019-12-13 05:52:20

Question


Using awk, I would like to ignore case when summing lines that share the same pattern.

I have the following line (big thanks to Andrey, https://stackoverflow.com/users/3476320/andrey):

awk '{n=$1;$1="";a[$0]+=n}END{for(i in a){print a[i], i}}' testing.txt

The file contents:

1 Used cars
12 Drivers
1 used cars
1 used  cars
14 drivers
2 Used Cars

The actual output is:

2  Used Cars
14  drivers
12  Drivers
2  used cars
1  Used cars

What I need to have:

26 drivers/Drivers (doesn't matter)
5 used cars/Used Cars (doesn't matter)

Thank you!


Answer 1:


Maybe the easiest way:

awk  '{$0=tolower($0);n=$1;$1="";a[$0]+=n}END{for(i in a){print a[i], i}}' file
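
For anyone wondering why this works: tolower($0) makes the array key case-insensitive, and assigning to $1 makes awk rebuild $0 with single spaces, so "used  cars" and "used cars" end up as the same key. Below is a minimal sketch of the same idea that also strips the leading space left by the emptied $1 and sorts the result, since the order of for (k in a) is unspecified. The function name sum_ci is just a placeholder I made up, and the input is the sample data from the question.

```shell
#!/bin/sh
# sum_ci: hypothetical helper name; reads "COUNT LABEL..." lines on stdin
# and prints one case-insensitive total per label.
sum_ci() {
    awk '{
        n = $1                 # save the count
        $1 = ""                # drop it; awk rebuilds $0 with single spaces
        key = tolower($0)      # case-insensitive key
        sub(/^ +/, "", key)    # strip the leading space left by the empty $1
        a[key] += n
    }
    END { for (k in a) print a[k], k }' | sort -k2
}

# Sample data from the question (note the double space in "used  cars").
sum_ci <<'EOF'
1 Used cars
12 Drivers
1 used cars
1 used  cars
14 drivers
2 Used Cars
EOF
# prints:
# 26 drivers
# 5 used cars
```

With a file, the call would be: sum_ci < testing.txt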



Answer 2:


From the AWK manual:

One way to perform a case-insensitive match at a particular point in the program is to convert the data to a single case, using the tolower() or toupper() built-in string functions (which we haven’t discussed yet; see String Functions). For example:

tolower($1) ~ /foo/ { … }

Another method, specific to gawk, is to set the variable IGNORECASE to a nonzero value (see Built-in Variables). When IGNORECASE is not zero, all regexp and string operations ignore case.

Also note: in awk, $1 is the first column, $2 the second, and so on; $0 is the whole line. (You don't want to index the array with the whole line.)

This is what works on my machine:

awk '{a[tolower($2) " " tolower($3)]+=$1;}END{for(i in a){print a[i], i}}' testing.txt

Output:

5 used cars
26 drivers
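
The one-liner above assumes every label has at most two words ($2 and $3). If labels can be longer, a possible generalization is to join fields 2..NF into the key. The sketch below does that and also remembers the first spelling seen for each label, since the question says the output case doesn't matter. The function name sum_ci_any, the array names, and the longer "Old used CARS" lines in the demo are my own assumptions, not part of the original question.

```shell
#!/bin/sh
# sum_ci_any: hypothetical helper name; reads "COUNT LABEL..." lines on stdin.
sum_ci_any() {
    awk 'NF < 2 { next }       # skip blank or label-less lines
    {
        label = $2
        for (i = 3; i <= NF; i++)
            label = label " " $i              # join fields 2..NF with single spaces
        key = tolower(label)                  # case-insensitive key
        if (!(key in first))
            first[key] = label                # keep the first spelling seen
        total[key] += $1
    }
    END { for (k in total) print total[k], first[k] }' | sort -k1,1nr
}

# Demo input: question data plus two made-up three-word labels.
sum_ci_any <<'EOF'
1 Used cars
12 Drivers
1 used  cars
14 drivers
2 Old used CARS
1 old Used cars
EOF
# prints:
# 26 Drivers
# 3 Old used CARS
# 2 Used cars
```

As far as I know, gawk's IGNORECASE does not apply to array subscripts, so lowercasing the key is still needed even when that variable is set.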


Source: https://stackoverflow.com/questions/31562283/using-awk-ignore-casesensitve-pattern-when-summarize-lines-based-on-the-same-pa
