Explain awk command

谎友^ 2020-11-28 06:29

Today I was searching online for a command to print the next two lines after a pattern, and I came across an awk command which I'm unable to understand.

$ /usr/x         


        
5 Answers
  • 2020-11-28 06:47

    Explanation

    awk expressions have the following form:

    condition action; NEXT_EXPRESSION
    

    If condition is true, the action(s) will be executed. Note also that if condition is true but the action has been omitted, awk will execute print (the default action).
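
    As a minimal sketch of the default action (the sample input and the shell variable names are made up for illustration), a condition without an action and the same condition with an explicit print should select the same line:

    ```shell
    # Three lines of sample input (hypothetical data for illustration).
    input=$(printf 'alpha\nbeta\ngamma\n')

    # Condition with no action: awk falls back to the default action, print.
    implicit=$(printf '%s\n' "$input" | awk 'NR == 2')

    # The same condition with the action spelled out.
    explicit=$(printf '%s\n' "$input" | awk 'NR == 2 { print }')

    echo "$implicit"   # beta
    echo "$explicit"   # beta
    ```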

    You have two expressions in your code that will get executed on every line of input:

    _&&_--          ;
    /PATTERN/{_=2}
    

    The two are separated by a ;. Since the default action print is used when the action is omitted, this is the same as:

    _&&_--    {print};
    /PATTERN/ {_=2}
    

    In your example _ is a variable name. awk initializes it automatically: a variable that has not been assigned yet evaluates to 0 the first time it is used.

    On the first line of input the first condition is therefore (0) && (0). That evaluates to false, so awk does not print.

    If the pattern is found, _ is set to 2. That makes the first condition (2) && (2) on the next line and (1) && (1) on the line after that, because _ is decremented after the condition has been evaluated. Both evaluate to true, so awk prints those lines.
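
    The evaluation described above can be traced by hand in a BEGIN block. This is only a sketch: it uses n instead of _, and the names start, result and trace are invented for the illustration.

    ```shell
    # Evaluate the condition for each starting counter value and report
    # the result together with the value of the counter afterwards.
    trace=$(awk 'BEGIN {
        for (start = 2; start >= 0; start--) {
            n = start
            result = (n && n--)   # n-- yields the old value, then decrements
            print "start=" start, "condition=" result, "n_after=" n
        }
    }')
    echo "$trace"
    # start=2 condition=1 n_after=1
    # start=1 condition=1 n_after=0
    # start=0 condition=0 n_after=0
    ```

    With a starting value of 0 the right-hand side is never evaluated, so the counter does not go negative.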

    However, nice puzzle ;)

  • 2020-11-28 06:51

    Wonderfully obscure. Will update when time allows.

    _ is being used as a variable name. && is the logical AND operator: the expression is true only when both operands are true. Once the value of _ has been reduced to zero, the first operand is false, the whole && is false, and no output is generated.

    print -- "
    xxxxx
    yyyy
    PATTERN
    zzz
    aa
    bbb
    ccc
    ffffd" | awk '_&&_--;/PATTERN/{_=2}'
    

    output

    zzz
    aa
    

    debug version

    print -- "
    xxxxx
    yyyy
    PATTERN
    zzz
    aa
    bbb
    ccc
    ffffd" | awk '_&&_--;{print "_="_;print _&&_};/PATTERN/{_=2;print "_="_ }'
    

    output

    _=
    0
    _=
    0
    _=
    0
    _=
    0
    _=2
    zzz
    _=1
    1
    aa
    _=0
    0
    _=0
    0
    _=0
    0
    _=0
    0
    
  • 2020-11-28 07:06

    Simply put, the command prints a number of lines after a given regular expression match, not including the matched line.

    The number of lines is specified in the block {_=2}: the variable _ is set to 2 if the line matches PATTERN. Every line read after a matching line causes _ to be decremented until it hits zero. You can read _&&_-- as "if _ is greater than zero, print the line and subtract one from _". It's quite simple when you replace the variable _ with a more sensible name like n.

    A simple demo should make it clear (print the 2 lines that follow any line matching foo):

    $ cat file
    foo
    1
    2
    3
    foo
    a
    b
    c
    
    $ awk 'n && n--;/foo/{n=2}' file
    1
    2
    a
    b
    

    So n is only true after it has been set to 2 by a line matching foo. On each following line awk prints the current line and decrements n. Because awk uses short-circuit evaluation, n is only decremented when n is true (n>0), so the only possible values for n in this case are 2, 1 or 0.
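
    Since the starting value of n decides how many lines get printed, it can be passed in from the command line instead of being hard-coded. A sketch, assuming a variable named count (a name chosen here, not part of the original command) set with awk's -v option:

    ```shell
    # Same sample data as the demo file.
    sample=$(printf 'foo\n1\n2\n3\nfoo\na\nb\nc\n')

    # -v assigns count before any input is read; each match resets n to it.
    one_line=$(printf '%s\n' "$sample" | awk -v count=1 'n && n--; /foo/ { n = count }')
    three_lines=$(printf '%s\n' "$sample" | awk -v count=3 'n && n--; /foo/ { n = count }')

    echo "$one_line"      # 1 and a, one per line
    echo "$three_lines"   # 1 2 3 a b c, one per line
    ```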

    Awk has the following structure: condition{block}. When a condition evaluates to true, the block is executed for the current record. If you don't provide a block, awk uses the default block {print $0}, so n && n--; is a condition without a block that only evaluates to true for n lines after the regular expression match. The semicolon just separates the condition n && n-- from the condition /foo/ and makes it explicit that the first condition has no block.
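
    To make the structure more visible, the one-liner can be written out with every block spelled out. This long form is only a sketch of an equivalent program, assuming n never holds anything but the non-negative integers mentioned above:

    ```shell
    # Same sample data as the demo file.
    sample=$(printf 'foo\n1\n2\n3\nfoo\na\nb\nc\n')

    # Terse form: condition without a block, so the default {print $0} applies.
    terse=$(printf '%s\n' "$sample" | awk 'n && n--; /foo/ { n = 2 }')

    # Long form: the print and the decrement are written out explicitly.
    verbose=$(printf '%s\n' "$sample" | awk '
        n > 0 { print $0; n-- }   # still inside the window: print, then count down
        /foo/ { n = 2 }           # a match opens (or reopens) a two-line window
    ')

    echo "$terse"
    echo "$verbose"
    ```

    Both variables should hold the same four lines: 1, 2, a and b.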

    To print the two lines following the match including the match you would do:

    $ awk '/foo/{n=3} n && n--' file
    foo
    1
    2
    foo
    a
    b
    

    Extra extra: the fact that the full path /usr/xpg4/bin/awk is used tells me this code is intended for a Solaris machine, as the /usr/bin/awk there is totally broken and should be avoided at all costs.

  • 2020-11-28 07:08

    See https://stackoverflow.com/a/17914105/1745001 for the answer that was duplicated here.

  • 2020-11-28 07:09

    _ is being used as a variable name here (valid but obviously confusing). If you rewrite it as:

    awk 'x && x--; /PATTERN/ { x=2 }' input
    

    then it's a little easier to parse. Whenever /PATTERN/ is matched, the variable gets set to 2 (and that line is not output) - that's the second half. The first part fires when x is not zero, and decrements x as well as printing the current line (the default action, since that clause does not specify an action).

    The end result is to print the two lines immediately following any match of the pattern. If one of those lines also matches the pattern, it is still printed by the first clause, and the second clause then resets x to 2.
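
    What happens when one of the following lines itself matches the pattern can be checked with a small sketch. The input here is made up for the purpose, and overlap is just a shell variable name chosen for the illustration:

    ```shell
    # Hypothetical input in which the line right after a match also matches.
    overlap=$(printf 'PATTERN\nPATTERN\na\nb\nc\n' | awk 'x && x--; /PATTERN/ { x = 2 }')

    echo "$overlap"
    # PATTERN
    # a
    # b
    ```

    The second PATTERN line is printed because the first clause runs before the second one, and it then restarts the count, so a and b follow while c does not.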
