posix-ere

Greedy behaviour of grep

回眸只為那壹抹淺笑 submitted on 2020-06-27 10:27:11
Question: I thought that in regular expressions, "greediness" applies to quantifiers rather than to the match as a whole. However, I observe that grep -E --color=auto 'a+(ab)?' <(printf "aab") returns aab rather than aa b (that is, the whole of aab is matched, not just aa). The same applies to sed. On the other hand, in pcregrep and other tools, it really is the quantifier that is greedy. Is this behaviour specific to grep? N.B. I checked both grep (BSD grep) 2.5.1-FreeBSD and grep (GNU grep) 3.1. Answer 1: In the description of the term matched, POSIX states
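
The difference the question describes can be sketched in Python, assuming it comes from POSIX's leftmost-longest matching rule (the answer excerpt is cut off before its quotation). Python's re module is a backtracking engine, so it stands in for pcregrep here. leftmost_longest is a hypothetical helper that emulates POSIX-style matching by brute force; it is not how grep is implemented.

```python
import re

def leftmost_longest(pattern, text):
    """Emulate POSIX-style matching: earliest start, then longest overall match."""
    compiled = re.compile(pattern)
    for start in range(len(text) + 1):
        # Try the longest candidate first, so the first hit is the longest one.
        for end in range(len(text), start - 1, -1):
            if compiled.fullmatch(text, start, end):
                return text[start:end]
    return None

# Backtracking engine: a+ greedily takes "aa", then (ab)? matches nothing.
backtracking = re.search(r"a+(ab)?", "aab").group(0)

# Leftmost-longest: the overall match is maximised, so a+ gives one "a" back.
posix_like = leftmost_longest(r"a+(ab)?", "aab")

print(backtracking, posix_like)
```

The brute-force helper is quadratic in the input length and is only meant to make the two semantics comparable on small strings.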

How come my regex isn't working as expected in Bash? Greedy instead of Lazy

社会主义新天地 submitted on 2020-02-06 07:55:48
Question: Why isn't my regex pattern lazy? It should capture the first number, not the second. Here is a working Bash script:

    #!/bin/bash
    text='here is some example text I want to match word1 and this number 3.01 GiB here is some extra text and another number 1.89 GiB'
    regex='(word1|word2).*?number[[:blank:]]([0-9.]+) GiB'
    if [[ "$text" =~ $regex ]]; then
        echo 'FULL MATCH: '"${BASH_REMATCH[0]}"
        echo 'NUMBER CAPTURE: '"${BASH_REMATCH[2]}"
    fi

Here is the output:

    FULL MATCH: word1 and this
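
POSIX ERE, which Bash's =~ uses, defines no lazy quantifiers, so one common workaround is to replace .*? with a negated bracket expression. The sketch below uses Python only to compare the three variants. [ \t] stands in for [[:blank:]], which Python's re does not support, and the rewrite assumes that no digit appears between the keyword and the wanted number.

```python
import re

text = ("here is some example text I want to match word1 and this number 3.01 GiB "
        "here is some extra text and another number 1.89 GiB")

# Greedy .* runs to the last "number", as the question observes in Bash.
greedy = re.search(r"(word1|word2).*number[ \t]([0-9.]+) GiB", text)

# Lazy .*? stops at the first "number"; this syntax is not part of POSIX ERE.
lazy = re.search(r"(word1|word2).*?number[ \t]([0-9.]+) GiB", text)

# Lazy-free rewrite: [^0-9]* cannot cross the digits of the first number.
no_lazy = re.search(r"(word1|word2)[^0-9]*number[ \t]([0-9.]+) GiB", text)

print(greedy.group(2), lazy.group(2), no_lazy.group(2))
```

Because the rewritten pattern avoids Perl-only syntax, replacing [ \t] with [[:blank:]] should make it usable as the regex variable in the script above.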

Odd Behavior with Greedy Modifiers Inside Capture Groups

筅森魡賤 submitted on 2020-02-03 04:23:32
Question: Consider the following commands:

    text <- "abcdEEEEfg"
    sub("c.+?E", "###", text)            # [1] "ab###EEEfg"  <<< OKAY
    sub("c(.+?)E", "###", text)          # [1] "ab###EEfg"   <<< WEIRD
    sub("c(.+?)E", "###", text, perl=T)  # [1] "ab###EEEfg"  <<< OKAY

The first does exactly what I expect, matching up to just the first E. The second should essentially be identical to the first, since all I'm doing is adding a capturing group (though I'm not using it), yet for some reason it captures an extra E. That said, it

Why is the following regex not working in C using regcomp

我的未来我决定 submitted on 2020-01-30 11:18:32
Question: I have the following regex to match the last pair of braces in a string:

    .+(?={)(.+)(?=})

The example string is:

    abc{abc=bcd}{gef=hij}

I want the contents of the last braces (gef=hij) in the captured group. This works in a regex tester available on the web, http://regexpal.com/, but it doesn't when I use regcomp to compile the same regex. Any ideas?

    int reti = regcomp(&regex, ".+(?={)(.+)(?=})", REG_EXTENDED);
    if (reti) {
        fprintf(stderr, "Could not compile regex\n");
        exit(1);
    }

Answer 1: Anyway,
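
POSIX ERE has no lookahead syntax such as (?=...), so the pattern has to be expressed without it. The following is a minimal sketch of one lookaround-free alternative, tested in Python rather than C. It assumes the braces are never nested, and last_brace_contents is a hypothetical helper name. The pattern uses only syntax that ERE also defines, so it should carry over to regcomp with REG_EXTENDED, though that is not verified here.

```python
import re

# A brace pair followed only by brace-free text up to the end of the string.
LAST_BRACES = re.compile(r"\{([^{}]*)\}[^{}]*$")

def last_brace_contents(s):
    """Return the text inside the last {...} pair, or None if there is none."""
    m = LAST_BRACES.search(s)
    return m.group(1) if m else None

print(last_brace_contents("abc{abc=bcd}{gef=hij}"))
```

Anchoring with $ does the job the lookaheads were meant to do: the first brace pair is rejected because another brace follows it.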

POSIX character equivalents in Java regular expressions

耗尽温柔 submitted on 2020-01-13 10:33:27
Question: I would like to use a regular expression like this in Java: [[=a=][=e=][=i=]]. But Java doesn't support the POSIX classes [=a=], [=e=], etc. How can I do this? More precisely, is there a way to not use US-ASCII? Answer 1: Java does support POSIX character classes. The syntax is just different, for instance: \p{Lower} \p{Upper} \p{ASCII} \p{Alpha} \p{Digit} \p{Alnum} \p{Punct} \p{Graph} \p{Print} \p{Blank} \p{Cntrl} \p{XDigit} \p{Space} Answer 2: Quoting from http://download.oracle.com/javase/1.6.0
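
The \p{...} classes listed in Answer 1 are character classes, whereas [=a=] is a POSIX equivalence class (a letter together with its accented variants). One way to approximate that, sketched here in Python, is to decompose the input with Unicode normalization, drop the combining marks, and then match a plain bracket expression. This is an approximation under stated assumptions: real equivalence classes are defined by the locale, and strip_accents and matches_a_e_i are hypothetical helper names.

```python
import re
import unicodedata

def strip_accents(s):
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def matches_a_e_i(ch):
    """Rough stand-in for [[=a=][=e=][=i=]] applied to a single character."""
    return re.fullmatch(r"[aei]", strip_accents(ch)) is not None

for ch in ["a", "é", "ï", "o"]:
    print(ch, matches_a_e_i(ch))
```

Java offers a comparable normalization step through java.text.Normalizer, so the same idea should be expressible there.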

Converting PCRE to POSIX regular expression

拜拜、爱过 submitted on 2020-01-07 05:56:21
Question: I am working on a MySQL database and noticed that it doesn't natively support PCRE (it requires a plugin). I wish to use these three patterns for some data validation (they are actually the values given to the pattern attribute):

    ^[A-z\. ]{3,36}
    ^[a-z\d\.]{3,24}$
    ^(?=^.{4,}$)(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?!.*\s).*$

How do I do this? I looked on the web but couldn't find any concrete examples or answers. Also, there seem to be no utilities that could do this automatically. I am aware that sometimes,
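
A pattern built from lookaheads, like the third one, can usually be split into one simple check per lookahead, with the checks combined by logical AND. The sketch below uses Python to compare the original pattern with such a split. The names REQUIRED, FORBIDDEN and valid are hypothetical, \d is written as [0-9], and an explicit whitespace list stands in for \s. Each sub-pattern uses only syntax that POSIX ERE also has.

```python
import re

# The original lookahead-based pattern, for comparison only.
PCRE_STYLE = re.compile(r"^(?=^.{4,}$)(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?!.*\s).*$")

# One lookaround-free check per positive lookahead.
REQUIRED = [r"^.{4,}$", r"[0-9]", r"[a-z]", r"[A-Z]"]
# One check per negative lookahead; the value must NOT match these.
FORBIDDEN = [r"[ \t\r\n\f\v]"]

def valid(value):
    """True if every required pattern matches and no forbidden one does."""
    return (all(re.search(p, value) for p in REQUIRED)
            and not any(re.search(p, value) for p in FORBIDDEN))

for sample in ["Abc1", "abc1", "Ab 1x", "A1b", "Passw0rd"]:
    print(sample, valid(sample), bool(PCRE_STYLE.search(sample)))
```

In MySQL the same split would presumably become several REGEXP conditions joined with AND and NOT, but that translation is an assumption and is not tested here.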