Question
I am creating a script to pre-analyze access logs from my website. So far I have been using awk to extract the data I need.
I now need awk to find the top URLs, but only for a specific error code (404 in this case).
Simplified log structure as follows:
'Request Method, URI, Error Code'
GET, /foo, 404
GET, /foo, 200
GET, /foo, 404
GET, /foo, 404
GET, /bar, 200
GET, /bar, 404
GET, /foobar, 404
GET, /foobar, 404
My desired output would be (listing the top URLs that return a 404 error code):
3 /foo
2 /foobar
1 /bar
Answer 1:
With awk and sort. The field separator must be set to ", ", otherwise the URI field keeps a trailing comma:
awk -F', ' '$3==404{a[$2]++} END{for(url in a) print a[url], url}' log.txt | sort -rn
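As a minimal sketch, here is the command run against the sample data from the question (the filename log.txt is an assumption):

```shell
# Recreate the sample log from the question
cat > log.txt <<'EOF'
GET, /foo, 404
GET, /foo, 200
GET, /foo, 404
GET, /foo, 404
GET, /bar, 200
GET, /bar, 404
GET, /foobar, 404
GET, /foobar, 404
EOF

# Count 404s per URI, then rank by count, highest first
awk -F', ' '$3==404{a[$2]++} END{for(url in a) print a[url], url}' log.txt | sort -rn
# Expected output:
# 3 /foo
# 2 /foobar
# 1 /bar
```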
Answer 2:
Here is one solution where the output of the array screening is sorted directly by awk (PROCINFO["sorted_in"] requires GNU awk). Setting FS to ", " keeps leading spaces out of the fields, comparing $3 == 404 avoids matching URIs that merely contain "404", and "@val_num_desc" sorts the counts numerically in descending order (the string variant "@val_str_desc" would misorder counts of 10 and above).
awk 'BEGIN{ FS = ", "; PROCINFO["sorted_in"] = "@val_num_desc" } $3 == 404 { a[$2]++ } END{ for (i in a) print a[i], i }' yourfile
Output:
3 /foo
2 /foobar
1 /bar
Answer 3:
The original awk | grep | awk pipeline only printed the matching URIs without counting or ranking them (and grep -i "404" would also match URIs containing the string "404"). Filtering on the third field and finishing with sort and uniq -c produces the desired tally:
awk -F', ' '$3 == 404 { print $2 }' l.txt | sort | uniq -c | sort -rn
Source: https://stackoverflow.com/questions/47213752/awk-find-the-top-url-based-on-error-code