Question
Is there a regular expression that matches any line comment while ignoring comment markers inside strings?
I need everything on a line from // onward (with the // included).
For example:
//Comment (match!)
bla bla bla bla //Comment (match!)
this string "foo // foo" (don't match because it's inside "")
Answer 1:
The following regular expression matches string literals and regular-expression literals in the input:
var strings = /("((.|\\\n)*?([^\\"]|\\\\)|)"|'((.|\\\n)*?([^\\']|\\\\)|)'|\/[^*](.*([^\\\/]|\\\\))\/|\/\*\/)/g;
You can remove strings from the input and then match comments using another regular expression:
var comments = /((\/\/)(.*)|(\/\*)((.|\n)*)(\*\/))/g;
input.replace(strings, "").match(comments);
var strings = /("((.|\\\n)*?([^\\"]|\\\\)|)"|'((.|\\\n)*?([^\\']|\\\\)|)'|\/[^*](.*([^\\\/]|\\\\))\/|\/\*\/)/g,
comments = /((\/\/)(.*)|(\/\*)((.|\n)*)(\*\/))/g;
function update() {
var arr = input.value.replace(strings, "").match(comments);
output.value = arr ? arr.join("\n") : "";
}
input.onkeydown = input.onkeyup = input.onchange = update;
update();
textarea {
width: 90%;
height: 5em;
}
<p>Input:</p>
<textarea id="input">
//Comment (match!)
bla bla bla bla //Comment (match!)
this string "foo // foo"
</textarea>
<p>Output:</p>
<textarea id="output">
</textarea>
Answer 2:
This regex aims to handle the general case (see regex101 example), although it does not account for escaped quotes inside strings:
(("[^"]*){2})*(\/\/.*)
You want whatever is matched by the third capture group. Alternatively, you could make the first two groups non-capturing.
It works by skipping an even number of quotes, each followed by other text, before reaching the double slashes.
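To make the even-number-of-quotes idea concrete, here is a minimal JavaScript sketch. It is a variation rather than the exact pattern above: it is anchored at the start of the line and allows unquoted text between quoted pairs. It assumes each line is tested separately and that strings use only double quotes with no escaped quotes inside. The names `evenQuotes` and `findLineComment` are made up for illustration.

```javascript
// Anchored variation of the "even number of quotes" idea (an assumption,
// not the answer's exact pattern). The prefix is any number of
// `text "quoted"` pairs, then unquoted text, then the comment itself.
const evenQuotes = /^(?:[^"]*"[^"]*")*?[^"]*?(\/\/.*)$/;

// Hypothetical helper: returns the comment on a single line, or null.
function findLineComment(line) {
  const m = evenQuotes.exec(line);
  return m ? m[1] : null;
}

const testLines = [
  '//Comment (match!)',
  'bla bla bla bla //Comment (match!)',
  'this string "foo // foo"',
  'var s = "a // b"; // real comment',
];

// One entry per line: the comment text, or null when the line has none.
const found = testLines.map(findLineComment);
console.log(found);
```

Because the prefix must contain complete pairs of quotes, a // that sits after an odd number of quotes is treated as part of a string and is not reported.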
Answer 3:
^[^"]*(//.*)
This will not catch all cases, but it should at least work for your examples.
Update: A ^ was missing at the beginning.
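A small JavaScript sketch of how this pattern might be applied. It assumes the input is tested one line at a time, since `[^"]` would otherwise also match newlines. The variable names and the fourth sample line are made up to show a case the pattern misses.

```javascript
// Pattern from this answer: text containing no double quote, then the comment.
const noQuoteBefore = /^[^"]*(\/\/.*)/;

const sampleLines = [
  '//Comment (match!)',
  'bla bla bla bla //Comment (match!)',
  'this string "foo // foo"',
  'var s = "x"; // a comment after a string',
];

// Test each line on its own; keep group 1, or null when nothing matched.
const results = sampleLines.map((line) => {
  const m = noQuoteBefore.exec(line);
  return m ? m[1] : null;
});
console.log(results);
```

The last line yields null even though it ends in a real comment, because any double quote before the // blocks the match. That is the "will not catch all cases" limitation.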
Answer 4:
Here's another solution that should catch every single-line comment (see it work on regex101):
(\/\/.*)|"(?:\\"|.)*?"
All the comments will be captured in the first match group.
It will work in any regex flavor that has lazy quantifiers, which is almost all of them. The technique I used is to match quoted strings specifically so they get "removed" from the text available to match what we want: comments. This technique is explained in detail on RexEgg.com as The Greatest Regex Trick Ever.
Breakdown:
- (\/\/.*) matches comments and captures them in group 1
- "(?:\\"|.)*?" matches quoted strings, avoiding any escaped quotes inside
  - The inner non-capturing group (?:\\"|.) matches an escaped quote OR the next character, passing right over escaped quotes rather than treating them as a "real" quote
  - The whole alternation has the *? lazy quantifier, so it stops at the next "real" quote rather than running on into another quoted string
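Here is a short JavaScript sketch of how the pattern above could be used. Quoted strings are matched and discarded, and only matches that fill group 1 are kept. The helper name `extractComments` and the fourth sample line are made up for illustration. The sketch assumes an environment with `String.prototype.matchAll`.

```javascript
// Pattern from this answer: group 1 is a comment; the second alternative
// consumes double-quoted strings so their contents cannot match as comments.
const commentOrString = /(\/\/.*)|"(?:\\"|.)*?"/g;

// Hypothetical helper: collects every single-line comment in `source`.
function extractComments(source) {
  const comments = [];
  for (const m of source.matchAll(commentOrString)) {
    // m[1] is undefined when the match was a quoted string; skip those.
    if (m[1] !== undefined) comments.push(m[1]);
  }
  return comments;
}

const source = [
  '//Comment (match!)',
  'bla bla bla bla //Comment (match!)',
  'this string "foo // foo"',
  'say("an \\"escaped\\" quote // still a string"); // trailing',
].join('\n');

const extracted = extractComments(source);
console.log(extracted);
```

Checking group 1 rather than the whole match is what keeps quoted strings out of the result. Single-quoted strings and block comments are not handled by this pattern.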
Source: https://stackoverflow.com/questions/27534037/regex-to-source-code-comments