Question
I need a JavaScript regex for matching
"{any group of chars}" <-- where that last " is not preceded by a \
Examples:
... foo "bar" ... => "bar"
... foo"bar\"" ... => "bar\""
... foo "bar" ... goo"o"ooogle "t\"e\"st"[] => ["bar", "o", "t\"e\"st"]
The actual strings will be longer and may contain multiple matches that could also include white space or regex special chars.
I started by trying to break down the syntax, but not being strong with regex myself, I got stuck pretty fast. I did get as far as matching everything except the case where the match contains \" (I think) ...
https://regex101.com/r/sj4HXw/1
UPDATE:
More about my situation ...
This regex is to be used to "syntax highlight" strings in code blocks embedded in my blog posts, so a real-world example might look something like this ...
<pre id="test" class="code" data-code="csharp">
if (ConfigurationManager.AppSettings["LogSql"] == "true")
</pre>
And I am using the following JavaScript to achieve the highlight ...
var result = $("#test").text().replace(/"[^"\\]*(?:\\[\s\S][^"\\]*)*"/g, "<span class=\"string\">$1</span>");
$("#test").html(result);
For some reason, even when the answers suggested so far are used in this context, I'm getting odd results.
This works, but it inserts the literal text $1 instead of the actual match for some reason.
Answer 1:
Simple scenario (as in OP)
The most efficient regex (that is written in accordance with the unroll-the-loop principle) you may use here is
"[^"\\]*(?:\\[\s\S][^"\\]*)*"
See the regex demo
Details:
" - match the first "
[^"\\]* - 0+ chars other than " and \
(?:\\[\s\S][^"\\]*)* - zero or more occurrences of:
    \\[\s\S] - any char ([\s\S]) with a \ in front
    [^"\\]* - 0+ chars other than " and \
" - a closing ".
Usage:
// MATCHING
var rx = /"[^"\\]*(?:\\[\s\S][^"\\]*)*"/g;
var s = ' ... foo "bar" ... goo"o"ooogle "t\\"e\\"st"[]';
var res = s.match(rx);
console.log(res);
// REPLACING
console.log(s.replace(rx, '<span>$&</span>'));
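In the highlighting scenario from the question, the pattern has no capturing group, so $1 in the replacement string is left as literal text, while $& stands for the whole match. A minimal sketch, assuming a hypothetical helper named wrapStrings (not part of the original code) that receives the plain text of the code block:

```javascript
// Hypothetical helper: wraps every double-quoted string literal in a span.
var stringRx = /"[^"\\]*(?:\\[\s\S][^"\\]*)*"/g;

function wrapStrings(code) {
  // $& inserts the whole match; the pattern has no groups, so $1 would stay literal.
  return code.replace(stringRx, '<span class="string">$&</span>');
}

var line = 'if (ConfigurationManager.AppSettings["LogSql"] == "true")';
console.log(wrapStrings(line));

// With no capturing group, "$1" is copied into the output unchanged:
console.log(line.replace(stringRx, '[$1]'));
```

With jQuery, as in the question, the returned string could then be passed to $("#test").html(...). Text containing < or & would need HTML-escaping first; that step is left out of this sketch.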
More advanced scenario
If there is an escaped " before a valid match, or there are backslashes before a ", the approach above won't work. You will need to match those backslashes and capture the substring you need.
/(?:^|[^\\])(?:\\{2})*("[^"\\]*(?:\\[\s\S][^"\\]*)*")/g
 ^^^^^^^^^^^^^^^^^^^^^^                             ^
See another regex demo.
Usage:
// MATCHING
var rx = /(?:^|[^\\])(?:\\{2})*("[^"\\]*(?:\\[\s\S][^"\\]*)*")/g;
var s = ' ... \\"foo "bar" ... goo"o"ooogle "t\\"e\\"st"[]';
var m, res = [];
while (m = rx.exec(s)) {
  res.push(m[1]);
}
console.log(res);
// REPLACING
console.log(s.replace(/((?:^|[^\\])(?:\\{2})*)("[^"\\]*(?:\\[\s\S][^"\\]*)*")/g, '$1<span>$2</span>'));
The main pattern is wrapped with capturing parentheses, and this is added at the start:
(?:^|[^\\]) - either start of string or any char but \
(?:\\{2})* - 0+ occurrences of a double backslash.
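To see what the added prefix changes, both patterns can be run on an input that has an escaped quote before the real string. A small sketch; the variable names and the sample input are illustrative only:

```javascript
var simpleRx = /"[^"\\]*(?:\\[\s\S][^"\\]*)*"/g;
var prefixedRx = /(?:^|[^\\])(?:\\{2})*("[^"\\]*(?:\\[\s\S][^"\\]*)*")/g;

// Actual characters: \"foo "bar"
var input = '\\"foo "bar"';

// The simple pattern starts at the escaped quote and pairs it with the next quote.
var simpleResult = input.match(simpleRx);
console.log(simpleResult);

// The prefixed pattern skips the escaped quote; the string itself is in group 1.
var found = [], hit;
while ((hit = prefixedRx.exec(input)) !== null) {
  found.push(hit[1]);
}
console.log(found);
```

The first call reports "foo " (the wrong span), while the loop collects only "bar".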
Answer 2:
Prioritize the escaped characters first:
"(\\.|[^"])*"
https://regex101.com/r/sj4HXw/2
Answer 3:
This should do it:
"(\\[\s\S]|[^"\\])*"
It's a mixture of the other answers from Wiktor and Taufik.
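One difference between the last two patterns shows up on an unterminated string that ends in an escaped quote. In "(\\.|[^"])*" the class [^"] can also consume a backslash, so backtracking may end the match at the escaped quote, whereas [^"\\] rules that out. A minimal sketch with illustrative variable names and sample input:

```javascript
var looseRx = /"(\\.|[^"])*"/;          // pattern with [^"]
var strictRx = /"(\\[\s\S]|[^"\\])*"/;  // pattern with [^"\\]

// Actual characters: say "abc\" end   (the string literal is never closed)
var sample = 'say "abc\\" end';

var looseMatch = looseRx.exec(sample);
var strictMatch = strictRx.exec(sample);

// The loose pattern backtracks and treats the escaped quote as the closing one.
console.log(looseMatch && looseMatch[0]);
// The strict pattern finds no complete string in this input.
console.log(strictMatch);
```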
Source: https://stackoverflow.com/questions/43831523/matching-quote-wrapped-strings-in-javascript-with-regex