regex-greedy | 易学教程

why is this regular expression returning only one match?

Question: Here is my input: xxx999xxx888xxx777xxx666yyy xxx222xxx333xxx444xxx555yyy This is the expression: xxx.*xxx(?<matchString>(.(?!xxx.*xxx))*?)xxx.*yyy It returns 444. I'd like it to return both 444 and 777, but I can't get anywhere with this. I use the (?!...) negative lookahead so that it matches only the innermost occurrence on the left side (which works well when I am searching for only one result, which is most of the time). However, I suspect that this is related to why it skips the first result
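The leading greedy xxx.* in the question's expression can run across the space into the second record, which is a likely reason only one match comes back. As a rough sketch in Python (the question's named-group syntax suggests .NET, so the syntax differs slightly), one alternative is to forbid every segment from stepping over another xxx. The names text, segment and pattern are illustrative and not from the original post.

```python
import re

# Both records from the question, joined by a space as in the excerpt.
text = "xxx999xxx888xxx777xxx666yyy xxx222xxx333xxx444xxx555yyy"

# (?:(?!xxx).)* consumes one character at a time but refuses to start
# on another "xxx", so a segment can never swallow a delimiter.
segment = r"(?:(?!xxx).)*"

# Capture the segment that is followed by exactly one more segment and "yyy".
pattern = re.compile(rf"xxx({segment})xxx{segment}yyy")

matches = pattern.findall(text)
print(matches)  # ['777', '444']
```

Because no part of the pattern can cross an xxx delimiter, each record yields its own match instead of one match spanning both.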

Return only one group with OR condition in Regex

Question: I have to write a regex to fetch an email address from a sentence, and I want it returned in Group 1 only. Regex: \[mailto:(.+)\]|<(.+@.+\..+)> Input strings: "Hello my Email Address is <foo@hotmail.com>" returns foo@hotmail.com as Group 1. "Hello my Email Address is [mailto: foo@hotmail.com]" returns foo@hotmail.com as Group 2. I want the address returned in Group 1 whichever form matches. Is there any way to do this? Answer 1: You may use the regular expression: (?=\S+@)([^<\s]+@.*(?=[>\]]))
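To see the answer's expression at work, here is a small Python check against the two input strings from the question. The variable names are illustrative; the regex itself is copied from the answer unchanged.

```python
import re

# Expression from Answer 1: a single capturing group for both input forms.
EMAIL = re.compile(r"(?=\S+@)([^<\s]+@.*(?=[>\]]))")

samples = [
    "Hello my Email Address is <foo@hotmail.com>",
    "Hello my Email Address is [mailto: foo@hotmail.com]",
]

# group(1) holds the address regardless of which bracket style was used.
addresses = [EMAIL.search(s).group(1) for s in samples]
print(addresses)  # ['foo@hotmail.com', 'foo@hotmail.com']
```

Note that this relies on the closing > or ] being present; a bare address without a closing bracket would not match.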

Dreamweaver Regex Find and Replace with Regular Expression

Question: I have many instances throughout my site where I have inadvertently included the following code: if (isset(htmlspecialchars($_GET['u']))) I need to do a site-wide find/replace to turn that code into this: if (isset($_GET['u'])) I am trying to use the pattern below to find it with regular expressions, but it only matches if I leave out htmlspecialchars and the parentheses. Find: htmlspecialchars(\$_GET['([^']*)']) Replace: $_GET['$1'] Any ideas? Thanks! Answer 1: () and [] need to be escaped. isset
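The effect of escaping the parentheses and brackets can be checked outside Dreamweaver. The sketch below uses Python's re module as a stand-in, so the replacement uses \1 where Dreamweaver's dialog uses $1; the variable names are illustrative.

```python
import re

source = "if (isset(htmlspecialchars($_GET['u'])))"

# Literal ( ) [ ] and $ are escaped; only ([^']*) remains a capturing group.
pattern = re.compile(r"htmlspecialchars\(\$_GET\['([^']*)'\]\)")

# In Python's replacement syntax "$" is literal and \1 is the captured key.
fixed = pattern.sub(r"$_GET['\1']", source)
print(fixed)  # if (isset($_GET['u']))
```

Unescaped, the outer parentheses would form a capturing group and ['...'] would be read as a character class, which is why the original Find pattern never matched the literal code.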

RegEx for capturing a repeating pattern

Question: I have the following regex, taken from "regex capturing with repeating pattern": ([0-9]{1,2}h)[ ]*([0-9]{1,2}min):[ ]*(.*(?:\n(?![0-9]{1,2}h).*)*) It takes the following string: 1h 30min: Title - Description Line 1 1h 30min: Title - Description Line 1 - Description Line 2 - Description Line 3 And produces this as a result: Match 1: "1h 30min: Title - Description Line 1" Group 1: "1h" Group 2: "30min" Group 3: "Title - Description Line 1" Match 2: "1h 30min: Title - Description Line 1 - Description Line 2
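The excerpt has lost its line breaks, so the following Python sketch assumes that each "- Description" item sat on its own line in the original input; that layout is a guess. Under that assumption, the regex from the question groups each time entry with its continuation lines like this:

```python
import re

# Assumed layout: the excerpt is flattened, so these line breaks are a guess.
text = (
    "1h 30min: Title\n"
    "- Description Line 1\n"
    "1h 30min: Title\n"
    "- Description Line 1\n"
    "- Description Line 2\n"
    "- Description Line 3"
)

# Regex from the question: group 3 keeps absorbing following lines until
# one starts with a new "<digits>h" time stamp.
ENTRY = re.compile(
    r"([0-9]{1,2}h)[ ]*([0-9]{1,2}min):[ ]*(.*(?:\n(?![0-9]{1,2}h).*)*)"
)

entries = [m.groups() for m in ENTRY.finditer(text)]
for hours, minutes, body in entries:
    print(hours, minutes, body.split("\n"))
```

The negative lookahead (?![0-9]{1,2}h) is what ends one entry and lets the next match begin.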

R: gsub and capture

Question: I am trying to extract the contents between square brackets from a string: eq <- "(5) h[m] + nadh[m] + q10[m] --> (4) h[c] + nad[m] + q10h2[m]" I can filter them out: gsub("\\[.+?\\]","" ,eq) ## removes the square brackets and everything inside them [1] "(5) h + nadh + q10 --> (4) h + nad + q10h2" But how can I capture what's inside the brackets? I tried the following: gsub("\\[(.+)?\\])", "\\1", eq) grep("\\[(.+)?\\]", eq, value=TRUE) but both return the whole string: [1] "(5) h[m] + nadh[m] +
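The question is about R, but the underlying idea (put the lazy quantifier inside the capturing group and collect every match rather than substituting) can be sketched in Python. This is an illustration of the technique, not the R answer; in the question's attempts, (.+)? makes a greedy group optional rather than making the quantifier lazy.

```python
import re

eq = "(5) h[m] + nadh[m] + q10[m] --> (4) h[c] + nad[m] + q10h2[m]"

# (.+?) is lazy, so each match stops at the first closing bracket.
# findall returns the captured group for every match in the string.
inside = re.findall(r"\[(.+?)\]", eq)
print(inside)  # ['m', 'm', 'm', 'c', 'm', 'm']
```

A substitution function returns the full string with replacements applied, which is why an extraction function is the better fit for collecting the bracket contents.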

Regex not being greedy enough

Question: I've got the following regex that was working perfectly until a new situation arose: ^.*[?&]U(?:RL)?=(?<URL>.*)$ It's used against URLs to grab everything after U= or URL= and return it in the URL group. So, for the following: http://localhost?a=b&u=http://otherhost?foo=bar URL = http://otherhost?foo=bar Unfortunately an odd case came up: http://localhost?a=b&u=http://otherhost?foo=bar&url=http://someotherhost Ideally, I want URL to be "http://otherhost?foo=bar&url=http:/
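The leading greedy .* backtracks from the end of the string, so it settles on the last u= or url= parameter. Assuming the goal is to capture everything after the first such parameter, one possible fix is to make that leading quantifier lazy. The sketch below uses Python, where the named group is written (?P<URL>...), and assumes case-insensitive matching since the sample URLs use lowercase u and url.

```python
import re

url = "http://localhost?a=b&u=http://otherhost?foo=bar&url=http://someotherhost"

# Original shape: greedy ".*" finds the LAST "u=" / "url=" parameter.
greedy = re.compile(r"^.*[?&]U(?:RL)?=(?P<URL>.*)$", re.IGNORECASE)

# Lazy ".*?" stops at the FIRST one, leaving the rest for the URL group.
lazy = re.compile(r"^.*?[?&]U(?:RL)?=(?P<URL>.*)$", re.IGNORECASE)

greedy_url = greedy.match(url).group("URL")
lazy_url = lazy.match(url).group("URL")
print(greedy_url)  # http://someotherhost
print(lazy_url)    # http://otherhost?foo=bar&url=http://someotherhost
```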

Ignoring an optional suffix with a greedy regex

Question: I'm performing regex matching in .NET against strings that look like this: 1;#Lists/General Discussion/Waffles Win 2;#Lists/General Discussion/Waffles Win/2_.000 3;#Lists/General Discussion/Waffles Win/3_.000 I need to match the URL portion without the numbers at the end, so that I get this: Lists/General Discussion/Waffles Win This is the regex I'm trying: (?:\d+;#)(?<url>.+)(?:/\d+_.\d+)* The problem is that the last group is being included as part of the middle group's match. I've also
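In the question's pattern the greedy .+ consumes the suffix, and the trailing group, being optional, happily matches nothing. One common way around this, sketched here in Python rather than .NET, is to make the middle group lazy and anchor the end of the string so the optional suffix must be tried. This assumes each sample is matched as a separate string and that the dot in the suffix is meant literally; the helper name strip_suffix is hypothetical.

```python
import re

samples = [
    "1;#Lists/General Discussion/Waffles Win",
    "2;#Lists/General Discussion/Waffles Win/2_.000",
    "3;#Lists/General Discussion/Waffles Win/3_.000",
]

# Lazy ".+?" grows only as far as needed; "$" forces the optional
# "/<digits>_.<digits>" suffix to be matched outside the url group.
PATTERN = re.compile(r"^\d+;#(?P<url>.+?)(?:/\d+_\.\d+)?$")

def strip_suffix(value: str) -> str:
    """Return the url part of one list entry (hypothetical helper)."""
    return PATTERN.match(value).group("url")

urls = [strip_suffix(s) for s in samples]
print(urls)
```

Without the $ anchor, the lazy group would stop after a single character, so the two changes only work together.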

Regular expression greedy match not working as expected

Question: I have a very basic regular expression, and I can't figure out why it isn't working, so the question has two parts: why does my current version not work, and what is the correct expression? The rules are simple: the input must have a minimum of 3 characters; if the first character is %, it must have a minimum of 4 characters. So the following cases should work out as follows: AB - fail ABC - pass ABCDEFG - pass % - fail %AB - fail %ABC - pass %ABCDEFG - pass %%AB - pass The expression I am using
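The asker's expression is cut off in this excerpt, so the sketch below does not reproduce it. It shows one possible expression that satisfies the listed rules, checked in Python against every case from the question; RULE and is_valid are illustrative names.

```python
import re

# Either "%" followed by at least 3 more characters,
# or a non-"%" character followed by at least 2 more.
RULE = re.compile(r"(?:%.{3,}|[^%].{2,})")

def is_valid(value: str) -> bool:
    """True when the whole string satisfies the length rules."""
    return RULE.fullmatch(value) is not None

cases = {
    "AB": False, "ABC": True, "ABCDEFG": True,
    "%": False, "%AB": False, "%ABC": True,
    "%ABCDEFG": True, "%%AB": True,
}
results = {value: is_valid(value) for value in cases}
print(results == cases)  # True
```

Splitting the two rules into explicit alternatives avoids depending on how a greedy quantifier backtracks around an optional leading %.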

Regex: Is Lazy Worse?

Question: I have always written regexes like this: <A HREF="([^"]*)" TARGET="_blank">([^<]*)</A> but I just learned about lazy quantifiers and that I can write it like this: <A HREF="(.*?)" TARGET="_blank">(.*?)</A> Is there any disadvantage to using the second approach? The regex is definitely more compact (even SO parses it better). Edit: There are two best answers here, which point out two important differences between the expressions. ysth's answer points to a weakness in the non-greedy/lazy one, in
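One commonly cited difference between the two forms is that a lazy .*? may still run past the closing quote when the rest of the pattern fails to match nearby, whereas [^"]* cannot. The Python sketch below illustrates this with a made-up input in which the first link has no TARGET attribute; it may or may not be the exact weakness the truncated answer refers to.

```python
import re

# Hypothetical input: the first link lacks TARGET="_blank".
html = '<A HREF="a.html">one</A> <A HREF="b.html" TARGET="_blank">two</A>'

negated = re.compile(r'<A HREF="([^"]*)" TARGET="_blank">([^<]*)</A>')
lazy = re.compile(r'<A HREF="(.*?)" TARGET="_blank">(.*?)</A>')

negated_groups = negated.search(html).groups()
lazy_groups = lazy.search(html).groups()

print(negated_groups)  # ('b.html', 'two')
print(lazy_groups)     # ('a.html">one</A> <A HREF="b.html', 'two')
```

The lazy version starts at the first link and keeps extending group 1 until TARGET finally appears, so it captures markup from two different tags.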

Is it better to use a non-greedy qualifier or a lookahead?

Question: I have a possibly large block of text to search for instances of [[...]], where the ... can be anything, including other brackets (though they cannot be nested; the first instance of ]] after [[ ends the match). I can think of two ways to match this text: Using a non-greedy quantifier: /\[\[.+?\]\]/ Using a lookahead: /\[\[(?:(?!\]\]).)+\]\]/ Is one choice inherently better than the other from a performance standpoint (I'd say the first is probably more readable)? I recall reading that it's
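Both expressions can be tried side by side before worrying about speed. The Python sketch below runs them over a small made-up sample and only checks that they agree there; it makes no claim about which is faster, and the two are not guaranteed to agree on every edge case (for example input such as [[]]]]).

```python
import re

# Made-up sample containing a stray single "]" inside the second instance.
text = "a [[b]] c [[d ] e]] f"

lazy = re.compile(r"\[\[.+?\]\]")
lookahead = re.compile(r"\[\[(?:(?!\]\]).)+\]\]")

lazy_matches = lazy.findall(text)
lookahead_matches = lookahead.findall(text)

print(lazy_matches)       # ['[[b]]', '[[d ] e]]']
print(lookahead_matches)  # ['[[b]]', '[[d ] e]]']
```

To compare performance on real data, the standard-library timeit module can time each compiled pattern's findall call against a representative block of text.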