regex-greedy

Regular expression in regards to question mark “lazy” mode

好久不见. posted on 2019-12-04 05:48:08
Question: I understand the ? mark here means "lazy". My question essentially is [0-9]{2}? vs [0-9]{2}. Are they the same? If so, why are we writing the former expression? Isn't lazy mode more expensive performance-wise? If not, can you tell the difference? Answer 1: There is no difference between [0-9]{2} and [0-9]{2}? . The difference between greedy matching and lazy matching (the addition of a ? ) has to do with backtracking. Regular expression engines are built to match text (from left to right).
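One way to check the claim in the answer is to run both patterns over the same input. The sketch below assumes Python's `re` module (the question does not name an engine) and a made-up sample string. It also adds a ranged quantifier, where laziness does change the result.

```python
import re

sample = "12345678"  # hypothetical input, not taken from the question

# A fixed count leaves the engine no choice, so the lazy marker changes nothing.
fixed_greedy = re.findall(r"[0-9]{2}", sample)
fixed_lazy = re.findall(r"[0-9]{2}?", sample)

# With a range the engine can choose how many digits to take,
# so the greedy and lazy forms diverge.
range_greedy = re.findall(r"[0-9]{2,4}", sample)
range_lazy = re.findall(r"[0-9]{2,4}?", sample)

print(fixed_greedy, fixed_lazy)
print(range_greedy, range_lazy)
```

The lazy marker only matters when the quantifier allows more than one repetition count.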

Greediness behaving differently in JavaScript?

佐手、 posted on 2019-12-03 01:22:26
There was this question which made me realise that greediness of quantifiers is not always the same in certain regex engines. Taking the regex from that question and modifying it a bit: !\[(.*?)*\] (I know that * is redundant here, but I found what's following to be quite an interesting behaviour). And if we try to match against: ![][][] I expected to get the first capture group to be empty, because (.*?) is lazy and will stop at the first ] it comes across. This is indeed what happens in: PCRE Python but not Javascript where it matches the whole ][][ . ( jsfiddle ) I looked around with some
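For reference, here is a minimal sketch of the expectation described above, assuming Python's `re` module. It deliberately drops the redundant outer `*` and uses a single lazy group, so it does not reproduce the engine difference itself. It only shows what "stop at the first ]" looks like next to the greedy result.

```python
import re

subject = "![][][]"

# Single lazy group: the group expands only as far as needed to reach a "]".
lazy = re.match(r"!\[(.*?)\]", subject)

# Single greedy group: the group grabs as much as it can, then gives back
# just enough for the final "]" to match.
greedy = re.match(r"!\[(.*)\]", subject)

print(repr(lazy.group(0)), repr(lazy.group(1)))
print(repr(greedy.group(0)), repr(greedy.group(1)))
```

The greedy capture here is the same text the question reports seeing from JavaScript for the nested pattern.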

Matching text between delimiters: greedy or lazy regular expression?

给你一囗甜甜゛ posted on 2019-12-02 19:15:13
For the common problem of matching text between delimiters (e.g. < and > ), there are two common patterns: using the greedy * or + quantifier in the form START [^END]* END , e.g. <[^>]*> , or using the lazy *? or +? quantifier in the form START .*? END , e.g. <.*?> . Is there a particular reason to favour one over the other? Some advantages: [^>]* : More expressive. Captures newlines regardless of /s flag. Considered quicker, because the engine doesn't have to backtrack to find a successful match (with [^>] the engine doesn't make choices - we give it only one way to match the pattern against
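The newline point can be checked directly. This is a small sketch assuming Python's `re` module, where `re.DOTALL` plays the role of the /s flag; the sample markup is made up for illustration.

```python
import re

# Hypothetical input: the first tag contains a line break.
html = '<a\nhref="x">link</a> <b>'

# Negated character class: [^>] matches newlines, so no flag is needed.
negated = re.findall(r"<[^>]*>", html)

# Lazy dot without DOTALL: "." stops at the newline, so the first tag is missed.
lazy = re.findall(r"<.*?>", html)

# Lazy dot with DOTALL: "." now matches newlines as well.
lazy_dotall = re.findall(r"<.*?>", html, re.DOTALL)

print(negated)
print(lazy)
print(lazy_dotall)
```

The negated class gives the same result with or without the flag, whereas the lazy dot depends on it.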

Regex - Greedyness - matching HTML tags, content and attributes

倖福魔咒の posted on 2019-12-02 14:45:00
Question: I am trying to match specific span-tags from an HTML source. The lang-attribute and the inner HTML of the tag are used as parameters for a function which returns a new string. I want to replace the old tags, attributes and content with the result of the called function. The subject would be something like this: <p>Some codesnippet:</p> <span lang="fsharp">// PE001 let p001 = [0..999] |> List.filter (fun n -> n % 3 = 0 || n % 5 = 0) |> List.sum </span> <p>Another code snippet:</p> <span lang="C#"

Converting a Regex Expression that works in Chrome to work in Firefox [duplicate]

和自甴很熟 posted on 2019-12-02 12:56:58
This question already has an answer here: Javascript: negative lookbehind equivalent? 13 answers I have this regex that works in Chrome but does not work in Firefox. SyntaxError: invalid regexp group It has something to do with lookbehinds, which Firefox does not support. I need this to work in Firefox. Can someone help me convert this so it works in Firefox and filters out the tags as well? return new RegExp(`(?!<|>|/|&amp|_)(?<!</?[^>]*|&[^;]*)(${term})`, 'gi'); }; searchTermsInArray.forEach(term => { if (term.length) { const regexp = this.regexpFormula(term); newQuestion

Java Regex - Ilegal Repetition character

本小妞迷上赌 posted on 2019-12-02 11:24:13
My regex is (?:--|#|\/\*|{) When I compile this using Pattern.compile() in Java, I get * Illegal Repetitive Character *. I tested this regex: (a|\/\*|b) When I compiled it, it showed no error. Why does this occur? Answer (Gábor Bakos): It is because of { . It is used to specify how many times something should be repeated. For instance, x{2,4} will match x repeated 2 ( xx ), 3 ( xxx ) or 4 ( xxxx ) times. If you want the regex to match a literal { , it needs to be escaped: (?:--|#|\/\*|\{) Source: https://stackoverflow.com/questions/22146173/java-regex-ilegal-repetition-character
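The two points in the answer (counted repetition and escaping the brace) can be sketched as follows. This uses Python's `re` module rather than Java, so it does not reproduce the Java exception; it only shows how `x{2,4}` and the escaped pattern behave. The sample input is made up.

```python
import re

# x{2,4} accepts exactly 2, 3 or 4 repetitions of "x".
counted = re.compile(r"x{2,4}")
accepted = [bool(counted.fullmatch("x" * n)) for n in range(1, 6)]

# With the brace escaped, the pattern matches a literal "{".
escaped = re.compile(r"(?:--|#|/\*|\{)")
found = [m.group(0) for m in escaped.finditer("a -- b # c /* d { e")]

print(accepted)
print(found)
```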

Regular expression in regards to question mark “lazy” mode

大城市里の小女人 posted on 2019-12-02 09:34:00
I understand the ? mark here means "lazy". My question essentially is [0-9]{2}? vs [0-9]{2}. Are they the same? If so, why are we writing the former expression? Isn't lazy mode more expensive performance-wise? If not, can you tell the difference? There is no difference between [0-9]{2} and [0-9]{2}? . The difference between greedy matching and lazy matching (the addition of a ? ) has to do with backtracking. Regular expression engines are built to match text (from left to right). Therefore it is logical that when you ask an expression to match a range of character(s), it matches as many as

Regex - Greedyness - matching HTML tags, content and attributes

别来无恙 posted on 2019-12-02 08:58:40
I am trying to match specific span-tags from an HTML source. The lang-attribute and the inner HTML of the tag are used as parameters for a function which returns a new string. I want to replace the old tags, attributes and content with the result of the called function. The subject would be something like this: <p>Some codesnippet:</p> <span lang="fsharp">// PE001 let p001 = [0..999] |> List.filter (fun n -> n % 3 = 0 || n % 5 = 0) |> List.sum </span> <p>Another code snippet:</p> <span lang="C#">//C# testclass class MyClass { } </span> In order to extract the value of the lang attribute and the
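A minimal sketch of the task as described, assuming Python's `re` module (the question's own language is not shown in the excerpt). The `render` function is a hypothetical stand-in for the asker's function, and the pattern assumes spans are not nested and the lang attribute is double-quoted.

```python
import re

# Lazy .*? with DOTALL stops at the first </span>; a negated class such as
# [^<]* or [^>]* would not work here because the code contains "|>" and "->".
SPAN = re.compile(r'<span\s+lang="([^"]*)">(.*?)</span>', re.DOTALL)


def render(lang, code):
    """Hypothetical replacement function; the real one is not shown in the question."""
    return f'<pre class="{lang}">{code.strip()}</pre>'


def replace_spans(html):
    # Pass the lang attribute and the inner content to render() for each span.
    return SPAN.sub(lambda m: render(m.group(1), m.group(2)), html)


subject = (
    '<p>Some codesnippet:</p> <span lang="fsharp">// PE001 let p001 = [0..999] '
    '|> List.filter (fun n -> n % 3 = 0 || n % 5 = 0) |> List.sum </span> '
    '<p>Another code snippet:</p> <span lang="C#">//C# testclass class MyClass { } </span>'
)
result = replace_spans(subject)
print(result)
```

For anything beyond simple, non-nested markup, an HTML parser is likely to be more robust than a regular expression.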

Shortest match in regex from end

六月ゝ 毕业季﹏ posted on 2019-12-01 21:39:36
Given an input string fooxxxxxxfooxxxboo I am trying to write a regex that matches fooxxxboo i.e. starting from the second foo till the last boo. I tried the following foo.*?boo matches the complete string fooxxxxxxfooxxxboo foo.*boo also matches the complete string fooxxxxxxfooxxxboo I read this Greedy vs. Reluctant vs. Possessive Quantifiers and I understand their difference, but I am trying to match the shortest string from the end which matches the regex i.e. something like the regex to be evaluated from back. Is there any way I can match only the last portion? Use negative lookahead
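One way to read the negative-lookahead suggestion is sketched below, assuming Python's `re` module. The lookahead refuses to step over another foo, so the match has to begin at the last foo before boo. This is only one possible form of the idea; the full answer is cut off in the excerpt.

```python
import re

text = "fooxxxxxxfooxxxboo"

# Lazy quantifier alone: the match still starts at the first "foo".
lazy = re.search(r"foo.*?boo", text).group(0)

# Each consumed character is checked with (?!foo) first, so the span
# between "foo" and "boo" can never contain another "foo".
tempered = re.search(r"foo(?:(?!foo).)*boo", text).group(0)

print(lazy)
print(tempered)
```

Laziness controls where a match ends, not where it starts, which is why the lazy pattern alone cannot solve this.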

Need regexp to find substring between two tokens

落花浮王杯 posted on 2019-12-01 18:28:30
I suspect this has already been answered somewhere, but I can't find it, so... I need to extract a string from between two tokens in a larger string, in which the second token will probably appear again meaning... (pseudo code...) myString = "A=abc;B=def_3%^123+-;C=123;" ; myB = getInnerString(myString, "B=", ";" ) ; method getInnerString(inStr, startToken, endToken){ return inStr.replace( EXPRESSION, "$1"); } so, when I run this using expression " .+B=(.+);.+ " I get "def_3%^123+-;C=123;" presumably because it just looks for the LAST instance of ';' in the string, rather than stopping at the
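The pseudocode above can be sketched in Python as follows. The name `get_inner_string` mirrors the asker's `getInnerString`; the implementation is an assumption about the intent, using a lazy group so the match stops at the first end token after the start token.

```python
import re


def get_inner_string(text, start_token, end_token):
    """Return the text between start_token and the next end_token, or None."""
    # re.escape keeps tokens such as "+" or "." from being read as regex syntax.
    pattern = re.escape(start_token) + r"(.*?)" + re.escape(end_token)
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1) if match else None


my_string = "A=abc;B=def_3%^123+-;C=123;"
my_b = get_inner_string(my_string, "B=", ";")

# For comparison: a greedy group runs on to the last ";" in the string.
greedy = re.search(r"B=(.+);", my_string).group(1)

print(my_b)
print(greedy)
```

For a single-character end token, a negated class such as `B=([^;]*);` is an equivalent alternative to the lazy group.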