Question
When parsing user input, the user can often add several optional flags, which should be accepted in any order. How can this be parsed with a regex so that each flag ends up in its own capture group if it is present?
For example:
There is a required token a, followed by 3 optional tokens which can come in any order: b, c, and d.
Some acceptable inputs would be:
a
a b
a c
a b c
a c b
a b c d
a d b c
a c d b
The capture groups should always look like this:
0 => (anything, this is ignored)
1 => a
2 => b or null
3 => c or null
4 => d or null
There are several parts to this problem that have already been answered:
- Using the (...)? form to make a capture group optional
- Using lookaheads (?=.*b)(?=.*c)(?=.*d) to allow things to be in any order
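As a quick illustration of the two building blocks listed above, here is a minimal JavaScript sketch. The pattern `(a)(?: (b))?`, the variable names, and the sample inputs are assumptions chosen for illustration and are not part of the original question.

```javascript
// Building block 1: an optional capture group.
// A group that does not participate in the match is reported as undefined.
const optional = /(a)(?: (b))?/;
const m1 = optional.exec("a");    // group 2 did not participate
const m2 = optional.exec("a b");  // group 2 captured "b"

// Building block 2: lookaheads accept the tokens in any order,
// but every lookahead must succeed, so all three tokens become required.
const anyOrder = /a(?=.*b)(?=.*c)(?=.*d)/;
const allPresent = anyOrder.test("a d b c"); // b, c and d all appear after a
const oneMissing = anyOrder.test("a b c");   // d is absent, so no match

console.log(m1[2], m2[2], allPresent, oneMissing);
```

Neither block solves the problem alone: the first fixes the order of the tokens, and the second makes every token mandatory and captures nothing.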
But the combination of these strategies doesn't work: (a)(?=.*(b)?)(?=.*(c)?)(?=.*(d)?)
Regex101 Test
What regex would allow optional tokens to be found in any order?
(The answer can use any flavor of regex)
Answer 1:
A regex that works in many flavors is:
(a)(?=(?:.*(b))?)(?=(?:.*(c))?)(?=(?:.*(d))?)
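This form is modular: extending it simply requires appending another (?=(?:.*(xxx))?) to the pattern. It works because the optional quantifier wraps the whole .*(xxx) sequence, so the capture group itself is mandatory inside it. That forces the .* to do its backtracking until the token is found, and it would also keep a .*? from quitting immediately (which it otherwise does when the next token is optional and can therefore be matched immediately as an empty match).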
Regex101 Tested (works here in PCRE, JavaScript, and Python)
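Because each optional token contributes one identical lookahead, the pattern can be assembled programmatically. The following is a minimal sketch under that assumption; `escapeRegExp` and `buildFlagRegex` are hypothetical helper names, not part of the original answer.

```javascript
// Hypothetical helper: escape regex metacharacters so tokens match literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build (required)(?=(?:.*(flag))?)... for a list of flags.
function buildFlagRegex(required, flags) {
  const lookaheads = flags
    .map(function (flag) {
      return "(?=(?:.*(" + escapeRegExp(flag) + "))?)";
    })
    .join("");
  return new RegExp("(" + escapeRegExp(required) + ")" + lookaheads);
}

const flagRegex = buildFlagRegex("a", ["b", "c", "d"]);

// Drop group 0 and normalise unmatched groups from undefined to null.
const groups = flagRegex
  .exec("a d b")
  .slice(1)
  .map(function (g) {
    return g === undefined ? null : g;
  });

console.log(flagRegex.source); // (a)(?=(?:.*(b))?)(?=(?:.*(c))?)(?=(?:.*(d))?)
console.log(groups);           // [ 'a', 'b', null, 'd' ]
```

Like the hand-written pattern, this sketch matches tokens as plain substrings. Real flags would probably need word boundaries or separators around them, which is left out here.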
JavaScript Example: JSFiddle
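// Run this after the HTML below is on the page, so both elements exist.
var cmd = document.getElementById("cmd"),
    pre = document.getElementById("output"),
    reg = /(a)(?=(?:.*(b))?)(?=(?:.*(c))?)(?=(?:.*(d))?)/;

// Re-run the regex on every keystroke and list the capture groups.
cmd.onkeyup = function() {
  var m = reg.exec(cmd.value) || [],   // [] when the required token is missing
      output = "Match\n";
  // Skip group 0 (the whole match); unmatched optional groups are undefined.
  for (var i = 1; i < m.length; i++)
    output += "[" + i + "] => " + (m[i] || "null") + "\n";
  pre.textContent = m.length ? output : "No Match";
};

Enter command: <input id="cmd" type="text" />
<pre id="output">No Match</pre>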
The combination of the two strategies in the question doesn't work because the form .*(x)? is too greedy (the .* runs to the end of the input, and the optional capture group is then skipped). On the other hand, .*?(x)? is too lazy (it stops at the first index because it notices that the next item is optional).
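The difference between the three forms can be checked side by side. This is a small sketch with a single optional token; the variable names and the input "a c b" are assumptions chosen for illustration.

```javascript
const greedy = /(a)(?=.*(b)?)/;       // form from the question
const lazy = /(a)(?=.*?(b)?)/;        // lazy variant of the same idea
const working = /(a)(?=(?:.*(b))?)/;  // form from the answer

const input = "a c b";

// Greedy: .* consumes " c b", then (b)? matches nothing at the end of input.
const greedyGroup = greedy.exec(input)[2];

// Lazy: .*? consumes nothing, then (b)? matches nothing right after "a".
const lazyGroup = lazy.exec(input)[2];

// Working: .* must backtrack until (b) matches before the group may be skipped.
const workingGroup = working.exec(input)[2];

console.log(greedyGroup, lazyGroup, workingGroup); // undefined undefined b
```

All three patterns match the input. Only the last one fills the capture group.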
Source: https://stackoverflow.com/questions/37449492/matching-optional-capture-groups-in-any-order