Using grep to get the next WORD after a match in each line

佛祖请我去吃肉 2020-11-30 07:09

I want to get the "GET" queries from my server logs.

For example, this is the server log:

1.0.0.127.in-addr.arpa - - [10/Jun/2012          


        
6 Answers
  • 2020-11-30 07:21

    It's often easier to use a pipeline rather than a single complex regular expression. This works on the data you provided:

    grep -F GET /tmp/foo |
        grep -Eo 'GET (.*) HTTP' |
        sed -E 's/^GET \/(.+) HTTP/\1/'
    

    This pipeline returns the following results:

    hello
    ss
    

    There are certainly other ways to get the job done, but this patently works on the provided corpus.
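
To see what each stage of that pipeline contributes, here is a minimal sketch that runs it against three made-up log lines. The sample lines, the sample_log helper and the stage1 to stage3 function names are assumptions for illustration only (the log in the question is truncated). It uses grep -F, grep -E and sed -E, which behave like fgrep, egrep and sed -r here.

```shell
# Hypothetical sample data: three invented access-log lines (not from the question).
sample_log() {
  printf '%s\n' \
    '1.0.0.127.in-addr.arpa - - [10/Jun/2012:10:00:00 +0000] "GET /hello HTTP/1.1" 200 512' \
    '1.0.0.127.in-addr.arpa - - [10/Jun/2012:10:00:05 +0000] "POST /login HTTP/1.1" 302 0' \
    '1.0.0.127.in-addr.arpa - - [10/Jun/2012:10:00:09 +0000] "GET /ss HTTP/1.1" 404 0'
}

# Stage 1: keep only lines that contain the fixed string GET.
stage1() { sample_log | grep -F GET; }

# Stage 2: print just the matched part of each line, e.g. GET /hello HTTP.
stage2() { stage1 | grep -Eo 'GET (.*) HTTP'; }

# Stage 3: strip "GET /" and " HTTP", leaving only the captured word.
stage3() { stage2 | sed -E 's/^GET \/(.+) HTTP/\1/'; }

stage3
```

Running each stage function on its own is a convenient way to debug a pipeline like this when the final output is not what you expect.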

  • 2020-11-30 07:23

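    Since the log file has a known structure, one option is to use cut to pull out the 7th field. Note that cut splits on tabs by default, so for a space-separated log like this one you need to set the delimiter with -d ' '. The result still includes the leading slash.

    grep GET log.txt | cut -d ' ' -f 7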
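
A quick way to check which delimiter a log needs is to run cut on a single line. The sketch below uses an invented log line (an assumption, since the log in the question is truncated). It shows that with the default tab delimiter cut passes a space-separated line through unchanged, while -d ' ' makes field 7 the request path. The variable names are illustrative only.

```shell
# Hypothetical log line, invented for illustration (space-separated fields).
line='1.0.0.127.in-addr.arpa - - [10/Jun/2012:10:00:00 +0000] "GET /hello HTTP/1.1" 200 512'

# Default delimiter is TAB; the line contains none, so cut prints it unchanged.
tab_result=$(printf '%s\n' "$line" | cut -f 7)

# With a space delimiter, field 7 is the request path.
path=$(printf '%s\n' "$line" | cut -d ' ' -f 7)

# Drop the leading slash with shell parameter expansion.
word=${path#/}

printf '%s\n' "$path" "$word"
```

Counting fields by position only works while every line has the same layout, so this approach suits fixed-format access logs better than free-form ones.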
    
  • 2020-11-30 07:29

    I was trying to do this and came across this link: https://www.unix.com/shell-programming-and-scripting/153101-print-next-word-after-found-pattern.html

    Summary: use grep to find matching lines, then use awk to find the pattern and print the next field:

    grep pattern logfile | \
      awk '{for(i=1; i<=NF; i++) if($i~/pattern/) print $(i+1)}'
    

    If you want to know the unique occurrences:

    grep pattern logfile | \
      awk '{for(i=1; i<=NF; i++) if($i~/pattern/) print $(i+1)}' | \
      sort | \
      uniq -c
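
As a rough illustration of the awk approach, the sketch below wraps the loop in a function and runs it on invented log lines. The sample data and the names sample_log and next_word are assumptions, not part of the linked thread. It leaves out the grep pre-filter because awk only prints on lines where some field matches, and it stops the loop one field early so that a match in the last field does not print an empty line.

```shell
# Hypothetical sample data: invented access-log lines with one repeated request.
sample_log() {
  printf '%s\n' \
    '1.0.0.127.in-addr.arpa - - [10/Jun/2012:10:00:00 +0000] "GET /hello HTTP/1.1" 200 512' \
    '1.0.0.127.in-addr.arpa - - [10/Jun/2012:10:00:05 +0000] "POST /login HTTP/1.1" 302 0' \
    '1.0.0.127.in-addr.arpa - - [10/Jun/2012:10:00:09 +0000] "GET /ss HTTP/1.1" 404 0' \
    '1.0.0.127.in-addr.arpa - - [10/Jun/2012:10:00:12 +0000] "GET /hello HTTP/1.1" 200 512'
}

# Print the field that follows any field matching the regex passed as $1.
# The loop stops at NF-1, so a match in the last field prints nothing.
next_word() {
  awk -v pat="$1" '{ for (i = 1; i < NF; i++) if ($i ~ pat) print $(i + 1) }'
}

# One result per matching field.
sample_log | next_word 'GET'

# Unique results with their counts.
sample_log | next_word 'GET' | sort | uniq -c
```

The pattern is matched against each whitespace-separated field, so here it matches the field "GET (with its leading quote) and prints the path that follows it.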
    
  • 2020-11-30 07:31

    Assuming you have GNU grep, you can use a Perl-style regex (the -P option) with a positive lookbehind:

    grep -oP '(?<=GET\s/)\w+' file
    

    If you don't have GNU grep, then I'd advise just using sed:

    sed -n '/^.*GET[[:space:]]\{1,\}\/\([-_[:alnum:]]\{1,\}\).*$/s//\1/p' file
    

    If you happen to have GNU sed, that can be greatly simplified:

    sed -n '/^.*GET\s\+\/\(\w\+\).*$/s//\1/p' file
    

    The bottom line here is, you certainly don't need pipes to accomplish this. grep or sed alone will suffice.
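
If a script has to run on machines with and without GNU grep, one possible arrangement is to probe for -P support and fall back to the portable sed form. This is only a sketch: the extract_get_word and sample_log names and the sample lines are invented for illustration, and the sed bracket expression is narrowed to [_[:alnum:]] so that both branches accept the same characters as \w.

```shell
# Hypothetical sample data: three invented access-log lines (not from the question).
sample_log() {
  printf '%s\n' \
    '1.0.0.127.in-addr.arpa - - [10/Jun/2012:10:00:00 +0000] "GET /hello HTTP/1.1" 200 512' \
    '1.0.0.127.in-addr.arpa - - [10/Jun/2012:10:00:05 +0000] "POST /login HTTP/1.1" 302 0' \
    '1.0.0.127.in-addr.arpa - - [10/Jun/2012:10:00:09 +0000] "GET /ss HTTP/1.1" 404 0'
}

# Hypothetical helper: read log lines on stdin, print the word after "GET /".
extract_get_word() {
  # Probe whether this grep accepts -P; the probe reads its own pipe, not our stdin.
  if printf 'x\n' | grep -qP 'x' 2>/dev/null; then
    # Lookbehind: match word characters only when preceded by "GET", whitespace, "/".
    grep -oP '(?<=GET\s/)\w+'
  else
    # POSIX sed fallback: print only lines where the substitution succeeded.
    sed -n 's/^.*GET[[:space:]]\{1,\}\/\([_[:alnum:]]\{1,\}\).*$/\1/p'
  fi
}

sample_log | extract_get_word
```

One difference to keep in mind: grep -o prints every match on a line, while the sed fallback prints at most one result per line.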

  • 2020-11-30 07:38
    gawk '{match($7,/\/(\w+)/,a);} length(a[1]){print a[1]}' log.txt
    hello
    ss
    

    If you have gawk, the command above uses the match function (in its three-argument form, a gawk extension) to apply the regex to the 7th field and store the captured group in the array a.

  • 2020-11-30 07:40

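    Use a pipe if you use grep:

    grep -o '/he.*' log.txt | grep -o '[^/].*'
    grep -o '/ss' log.txt | grep -o '[^/].*'
    

    [^/] is a negated bracket expression: it matches any single character other than /, so the second grep starts its match at the first non-slash character and drops the leading slash. The patterns are quoted so the shell does not try to expand them as globs. Note that .* runs to the end of the line, so the first command prints everything after the slash, not just the word.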
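
The effect of the second grep is easy to check in isolation. The sketch below feeds it a few made-up strings (assumptions for illustration, as is the from_first_non_slash name). Since [^/] matches any single character other than a slash, each match begins at the first non-slash character and .* then takes the rest of the line.

```shell
# Hypothetical helper: print each line starting from its first non-slash character.
from_first_non_slash() { grep -o '[^/].*'; }

# Leading slashes are dropped; a slash later in the line is kept.
printf '%s\n' '/hello' '/ss' 'a/b' | from_first_non_slash
# prints: hello, ss, a/b (one per line)
```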
