awk extract multiple groups from each line

盖世英雄少女心

How do I perform action on all matching groups when the pattern matches multiple times in a line?

To illustrate, I want to search for /Hello! (\\d+)/ an

4 answers

一向

2020-12-06 12:54
This is gawk syntax (the three-argument form of match() is a GNU extension). It also works when no fixed text can serve as a record separator, as long as the pattern does not need to match across linefeeds:
```
 {
     pattern = "([a-g]+|[h-z]+)"
     while (match($0, pattern, arr))
     {
         val = arr[1]
         print val
         sub(pattern, "")
     }
 }
```

醉话见心

2020-12-06 13:02

This syntax is simple, and it works in every awk (nawk, mawk, gawk, etc.).

```
{
    while (match($0, /Hello! [0-9]+/)) {
        pattern = substr($0, RSTART, RLENGTH);
        sub(/Hello! /, "", pattern);
        print pattern;
        $0 = substr($0, RSTART + RLENGTH);
    }
}
```
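
A variant of the loop above that leaves $0 untouched can be handy when later rules still need the original record and fields. The following is only a sketch under assumptions: the function name `extract_hello_numbers` and the sample input are made up for illustration, and the fixed offset 7 is the length of the literal prefix "Hello! ". It uses only match(), RSTART, RLENGTH and substr(), so it should not need gawk.

```shell
# Hypothetical helper: print every number that follows "Hello! " on each input line.
extract_hello_numbers() {
    awk '{
        rest = $0                          # work on a copy so $0 and the fields stay intact
        while (match(rest, /Hello! [0-9]+/)) {
            # skip the 7-character "Hello! " prefix and keep only the digits
            print substr(rest, RSTART + 7, RLENGTH - 7)
            # continue searching right after the current match
            rest = substr(rest, RSTART + RLENGTH)
        }
    }'
}

# Assumed sample input; prints 12, 345 and 6 on separate lines.
printf 'Hello! 12 and Hello! 345\nno match here\nHello! 6\n' | extract_hello_numbers
```

Because the pattern requires at least one digit, every match has a non-zero length, so the loop always advances and terminates.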

一生所求

2020-12-06 13:02
gawk's match() reports only the first match in a string, so it cannot capture the same group several times in one call unless you know exactly how many times the pattern repeats.

Given that, you have to iterate over the matches in each line manually. For your example input, it would be:
```
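{
  from = 1                                   # 1-based offset in $0 where the next search starts
  pos = match( $0, /Hello! ([0-9]+)/, val )
  while( 0 < pos )
  {
    print val[1]                             # the captured digits
    # pos is relative to the searched substring, so subtract 1 when advancing;
    # otherwise one character after each later match is skipped
    from += pos + val[0, "length"] - 1
    pos = match( substr( $0, from ), /Hello! ([0-9]+)/, val )
  }
}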
```
If the pattern needs to match across a linefeed, you have to change the input record separator, RS.
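
As one possible illustration of changing RS, the sketch below uses paragraph mode (`RS = ""`), in which each record is a block of text delimited by blank lines, so a linefeed can sit between "Hello!" and the number. This choice of RS is an assumption, not something the answer specifies. The function name `extract_across_newlines`, the sample input, and the relaxed pattern that allows spaces or newlines after "Hello!" are likewise hypothetical.

```shell
# Hypothetical helper: match "Hello!" followed by a number even when a newline separates them.
extract_across_newlines() {
    awk 'BEGIN { RS = "" }                 # paragraph mode: records are separated by blank lines
    {
        rest = $0
        while (match(rest, /Hello![ \n]+[0-9]+/)) {
            hit = substr(rest, RSTART, RLENGTH)
            sub(/^Hello![ \n]+/, "", hit)  # strip the prefix, keep the digits
            print hit
            rest = substr(rest, RSTART + RLENGTH)
        }
    }'
}

# Assumed sample input; prints 42 and 7 on separate lines.
printf 'Hello!\n42 then Hello! 7\n' | extract_across_newlines
```

A match can still never span a blank line here, because blank lines end the record.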

[愿得一人]

2020-12-06 13:09

With GNU awk, which accepts a multi-character RS:

```
awk 'BEGIN{ RS="Hello! ";}
{
    gsub(/[^0-9].*/,"",$1)
    if ($1 != ""){ 
        print $1 
    }
}' file
```
