Question
Using awk, I would like to ignore case when summing lines that share the same text.
I have the following one-liner (big thanks to Andrey, https://stackoverflow.com/users/3476320/andrey):
awk '{n=$1;$1="";a[$0]+=n}END{for(i in a){print a[i], i}}' testing.txt
The file contents:
1 Used cars
12 Drivers
1 used cars
1 used cars
14 drivers
2 Used Cars
The actual output is:
2 Used Cars
14 drivers
12 Drivers
2 used cars
1 Used cars
What I need to have:
26 drivers/Drivers (doesn't matter)
5 used cars/Used Cars (doesn't matter)
Thank you!
Answer 1:
Maybe the easiest way is to lowercase the whole line before using it as the array key:
awk '{$0=tolower($0);n=$1;$1="";a[$0]+=n}END{for(i in a){print a[i], i}}' file
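One detail of the one-liner above: after `$1=""`, awk rebuilds `$0` with the output separator still in front, so every key starts with a space. Here is a minimal sketch of the same idea that drops that leading space with `substr()` and pipes through `sort` to get a stable order (the order of `for (k in sum)` is otherwise unspecified). The temporary file and the names `sum`, `key`, and `result` are assumptions for illustration, not part of the original answer.

```shell
# Sample data from the question, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
1 Used cars
12 Drivers
1 used cars
1 used cars
14 drivers
2 Used Cars
EOF

# Save the count, blank field 1, then skip the leading separator that
# the rebuilt record keeps, and lowercase the rest to use as the key.
result=$(awk '{
    n = $1
    $1 = ""
    key = tolower(substr($0, 2))
    sum[key] += n
}
END {
    for (k in sum) print sum[k], k
}' "$tmp" | sort -k1,1nr)

echo "$result"
rm -f "$tmp"
```

With the question's data this should print `26 drivers` followed by `5 used cars`.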
Answer 2:
From the AWK manual:
One way to perform a case-insensitive match at a particular point in the program is to convert the data to a single case, using the tolower() or toupper() built-in string functions (which we haven’t discussed yet; see String Functions). For example:
tolower($1) ~ /foo/ { … }
Another method, specific to gawk, is to set the variable IGNORECASE to a nonzero value (see Built-in Variables). When IGNORECASE is not zero, all regexp and string operations ignore case.
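To make the first method concrete, here is a small sketch that uses `tolower()` in a pattern to add up the counts of every "used cars" line regardless of case. It sticks to `tolower()` rather than `IGNORECASE` for two reasons: it works in any POSIX awk, and the gawk manual also notes that `IGNORECASE` has no effect on array subscripting, so on its own it would not merge `Used cars` and `used cars` into one key. The temporary file and the variable names `total` and `result` are assumptions for illustration.

```shell
# Sample data from the question, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
1 Used cars
12 Drivers
1 used cars
1 used cars
14 drivers
2 Used Cars
EOF

# Lowercase the line only for the comparison; the record itself is unchanged.
result=$(awk 'tolower($0) ~ /used cars/ { total += $1 }
END { print total }' "$tmp")

echo "$result"
rm -f "$tmp"
```

For the sample file the four matching lines add up to 5.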
Also note: in awk, $1 is the first column, $2 the second, and so on; $0 is the whole line. (You don't want to index the array with the whole line, since that includes the count.)
This is what works on my machine:
awk '{a[tolower($2) " " tolower($3)]+=$1;}END{for(i in a){print a[i], i}}' testing.txt
Output:
5 used cars
26 drivers
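The command above hardcodes `$2` and `$3`, so it assumes labels of at most two words (and a one-word label such as `drivers` gets a trailing space in its key). A possible generalization, sketched here under the assumption that everything after the first field is the label, builds the key by looping over fields 2 through `NF`. The names `key`, `sum`, and `result` and the temporary file are illustrative only.

```shell
# Sample data from the question, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
1 Used cars
12 Drivers
1 used cars
1 used cars
14 drivers
2 Used Cars
EOF

# Join the lowercased fields 2..NF with single spaces to form the key,
# then add the count from field 1 to that key's running total.
result=$(awk '{
    key = ""
    for (i = 2; i <= NF; i++)
        key = key (i > 2 ? " " : "") tolower($i)
    sum[key] += $1
}
END {
    for (k in sum) print sum[k], k
}' "$tmp" | sort -k1,1nr)

echo "$result"
rm -f "$tmp"
```

This gives the same totals as above (26 for drivers, 5 for used cars) and also copes with labels of three or more words.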
Source: https://stackoverflow.com/questions/31562283/using-awk-ignore-casesensitve-pattern-when-summarize-lines-based-on-the-same-pa