How do strongly typed regexp constants work in GNU Awk?

Posted by 心已入冬 on 2021-02-07 17:29:22

Question


Strongly typed regexp constants are a handy feature of GNU Awk. They are documented in the GNU Awk User's Guide -> 6.1.2.2 Strongly Typed Regexp Constants, where you can find some interesting examples.

From reading that section, and from comments on an answer, I put together an example that shows how they work:

$ cat file
he;llo
ho
are
you;
$ gawk -v patt='@/;/' '$0 ~ patt' file  # it prints those lines containing ";"
he;llo
you;

In this case, we pass the pattern ";" as @/;/, so it prints all the lines that contain ";".

Now I want to go one step further and set this pattern dynamically, for example by placing it on the first line of the input file:

$ cat file
;
line2: hello;how
line3: are|you?
line4: i am;fine

However, I cannot manage to set the pattern to the string contained in $0. I tried it in a couple of ways:

gawk 'NR==1 {f=@""$0; next} $0 ~ f' file
gawk 'NR==1 {f=@$0; next} $0 ~ f' file

But they all return a syntax error:

gawk: cmd. line:1: NR==1 {f=@$0; next} $0 ~ f
gawk: cmd. line:1:           ^ syntax error

Here I want ";" to be set as the pattern to match against. I would expect the regexp to be applied from the 2nd line onwards, and thus to match lines 2 and 4, as if we had run gawk 'NR==1 {f=@/;/; next} $0 ~ f'. However, I cannot set the strongly typed regexp constant dynamically.

Is there a way to do so?


Answer 1:


Regarding "I cannot set the strongly typed regexp constant dynamically": you could replace "strongly typed regexp" with any other term in that statement and it would still be true. By definition you cannot set constants dynamically; "constant" and "dynamic" are mutually exclusive.

Strongly typed regexp constants are mainly useful for passing a literal regexp to a user-defined function, which you cannot do with a regular regexp constant (awk in the examples below is GNU Awk, since typeof() and @/.../ are gawk extensions):

$ awk 'function foo(x){print x, typeof(x)} BEGIN{foo(/bar/)}'
awk: cmd. line:1: warning: regexp constant for parameter #1 yields boolean value
0 number

$ awk 'function foo(x){print x, typeof(x)} BEGIN{foo("bar")}'
bar string

$ awk 'function foo(x){print x, typeof(x)} BEGIN{foo(@/bar/)}'
bar regexp

They also mean you don't need the extra layer of escapes that dynamic regexps require, because awk doesn't have to convert a string to a regexp before using it as one:

$ echo 'a.b a#b' | awk 'BEGIN{old="a\.b"; new="_"} {gsub(old,new)} 1'
awk: cmd. line:1: warning: escape sequence `\.' treated as plain `.'
_ _

$ echo 'a.b a#b' | awk 'BEGIN{old="a\\.b"; new="_"} {gsub(old,new)} 1'
_ a#b

$ echo 'a.b a#b' | awk 'BEGIN{old=@/a\.b/; new="_"} {gsub(old,new)} 1'
_ a#b
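
The two points above (a regexp passed to a function, with no extra escapes) can be combined in one small sketch. It assumes gawk 4.2 or later, and count_matches is a made-up helper name used only for illustration:

```shell
# Assumes gawk 4.2+ (strongly typed regexp constants and typeof() are gawk extensions).
# count_matches is a hypothetical helper, not part of gawk.
out=$(gawk '
function count_matches(re, s) {
    # gsub() replaces each match with itself ("&") and returns the number of matches.
    # s is passed by value, so the original string is untouched.
    return gsub(re, "&", s)
}
BEGIN {
    # The regexp keeps its type inside the function; \. needs only one backslash.
    print typeof(@/a\.b/), count_matches(@/a\.b/, "a.b a#b a.b")
}')
printf '%s\n' "$out"
```

This should print "regexp 2": the argument keeps its regexp type, and only the two literal "a.b" occurrences are counted.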

What you are trying to do with the example in your question is set the regexp dynamically, and that requires a dynamic regexp (i.e. one specified as a string) instead of a regexp constant:

$ awk 'NR==1{f=$0; next} $0 ~ f' file
line2: hello;how
line4: i am;fine
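
One caveat with the dynamic approach: the first line is interpreted as a regexp, so a first line containing metacharacters such as "|" or "." would not be matched literally. If what you actually want is a literal substring match, a minimal sketch using the standard index() function might look like this (the input lines are adapted from the question, with "|" as the hypothetical search text):

```shell
# Hypothetical input: the first line holds the literal text to search for.
# index() does a plain substring search, so "|" is not treated as regexp alternation.
out=$(printf '%s\n' '|' 'line2: hello;how' 'line3: are|you?' 'line4: i am;fine' |
      awk 'NR==1 {f=$0; next} index($0, f)')
printf '%s\n' "$out"
```

Only "line3: are|you?" is printed, since it is the only remaining line that contains a literal "|". This works in any POSIX awk, not just gawk.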


Source: https://stackoverflow.com/questions/65617751/how-do-strongly-typed-regexp-constants-work-in-gnu-awk
