extract data from log file in specified range of time with awk getline bash

て烟熏妆下的殇ゞ 提交于 2019-12-06 02:10:37

Basing from your input here, you could use a script like this:

#!/bin/bash

LOGFILE=/path/to/logfile

X=$(( 60 * 60 )) ## 1 Hour

function get_ts {
    DATE="${1%%\]*}"; DATE="${DATE##*\[}"; DATE=${DATE/:/ }; DATE=${DATE//\// }
    TS=$(date -d "$DATE" '+%s')
}

get_ts "$(tail -n 1 "$LOGFILE")"
LAST=$TS

while read -r LINE; do
    get_ts "$LINE"
    (( (LAST - TS) <= X )) && echo "$LINE"
done < "$LOGFILE"

Save it to a file and change the value for LOGFILE, then run with bash script.sh.

Example output:

157.55.34.99 - - [06/Sep/2013:09:13:10 +0300] "GET /index.php HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
85.163.134.149 - - [06/Sep/2013:09:50:23 +0300] "GET /wap/wapicons/mnrwap.jpg HTTP/1.1" 200 1217 "http://mydomain.com/main.php" "Mozilla/5.0 (Linux; U; Android 4.1.2; en-gb; GT-I9082 Build/JZO54K) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30"

The problem is probably just that you're not quoting your shell variables. Look:

$ foo='ab cd'

$ awk -v bar="$foo" 'BEGIN{print bar}'
ab cd

$ awk -v bar=$foo 'BEGIN{print bar}'
awk: fatal: cannot open file `BEGIN{print bar}' for reading (No such file or directory)

Yes, I know that's a different error message - what happens when you leave shell variables unquoted can by any number of things depending on the value of the variable, contents of your directory, etc., some of them VERY bad like removing every file in your filesystem.

So, quote your variables:

-v last="$last" -v x="$x"

then see if you still have the problem.

By the way, here's how to really solve your original problem using GNU awk with the input file http://pastebin.com/BXmS4zLn:

$ cat tst.awk
BEGIN {
    ARGV[ARGC++] = ARGV[ARGC-1]

    mths = "JanFebMarAprMayJunJulAugSepOctNovDec"

    if (days)  { hours = days * 24  }
    if (hours) { mins  = hours * 60 }
    if (mins)  { secs  = mins * 60  }
    deltaSecs = secs
}

NR==FNR {
    nr2secs[NR] = mktime($6" "(match(mths,$5)+2)/3" "$4" "gensub(/:/," ","g",$7))
    next
}

nr2secs[FNR] >= (nr2secs[NR-FNR] - deltaSecs)

$ awk -v hours=1 -f tst.awk file
157.55.34.99 - -  06 Sep 2013 09:13:10 +0300  "GET /index.php HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
85.163.134.149 - -  06 Sep 2013 09:50:23 +0300  "GET /wap/wapicons/mnrwap.jpg HTTP/1.1" 200 1217 "http://mydomain.com/main.php" "Mozilla/5.0 (Linux; U; Android 4.1.2; en-gb; GT-I9082 Build/JZO54K) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30"
83.113.48.218 - -  06 Sep 2013 10:13:07 +0300  "GET /english/nicons/word.gif HTTP/1.1" 200 803 "http://mydomain.com/french/details.php?eid=127928&cid=18&fromval=1&frid=18" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

$ gawk -v mins=60 -f tst.awk file
157.55.34.99 - -  06 Sep 2013 09:13:10 +0300  "GET /index.php HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
85.163.134.149 - -  06 Sep 2013 09:50:23 +0300  "GET /wap/wapicons/mnrwap.jpg HTTP/1.1" 200 1217 "http://mydomain.com/main.php" "Mozilla/5.0 (Linux; U; Android 4.1.2; en-gb; GT-I9082 Build/JZO54K) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30"
83.113.48.218 - -  06 Sep 2013 10:13:07 +0300  "GET /english/nicons/word.gif HTTP/1.1" 200 803 "http://mydomain.com/french/details.php?eid=127928&cid=18&fromval=1&frid=18" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

$ gawk -v mins=20 -f tst.awk file
83.113.48.218 - -  06 Sep 2013 10:13:07 +0300  "GET /english/nicons/word.gif HTTP/1.1" 200 803 "http://mydomain.com/french/details.php?eid=127928&cid=18&fromval=1&frid=18" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"

You can specify days= or hours= or mins= or secs= variables and it'll do the right thing.

If you only need a script to get the last 1 mins worth of log lines as your question states (now?), and want to see a one-liner to do it:

$ gawk 'NR==FNR {nr2secs[++nr] = mktime($6" "(match("JanFebMarAprMayJunJulAugSepOctNovDec",$5)+2)/3" "$4" "gensub(/:/," ","g",$7)); next} nr2secs[FNR] >= (nr2secs[nr] - 60)' file file
83.113.48.218 - -  06 Sep 2013 10:13:07 +0300  "GET /english/nicons/word.gif HTTP/1.1" 200 803 "http://mydomain.com/french/details.php?eid=127928&cid=18&fromval=1&frid=18" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0)"
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!