Question
I'm attempting to formulate a regex in Java to capture multiple strings in a space-delimited list. Here is the string I am trying to capture from ...
String output = "regulations { qux def } standards none rules { abc-123 456-defghi wxyz_678 } security { enabled }";
I want to use a regex to match each word in the space-delimited list between the braces immediately following rules. In other words, I would like the regex to match abc-123, 456-defghi, and wxyz_678. The substrings in this list can contain any characters except whitespace, and there can be any number of them; I've used the three above only to illustrate. The following doesn't work, because it needs to be modified to match multiple times ...
String regex = "rules\\s\\{\\s([^\\s]*)\\s\\}";
final Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(output);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
How could I modify the above regex to account for multiple possible matches and get the following output?
abc-123
456-defghi
wxyz_678
Answer 1:
Here is a 1-step approach: use 1 regex to "match them all".
The regex:
(?:\brules\s+\{|(?!^)\G)\s+([\w-]+)
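The first alternative, \brules\s+\{, matches the whole word rules followed by 1 or more whitespace symbols and an opening brace. The second alternative, (?!^)\G, matches at the position where the previous successful match ended (the (?!^) lookahead stops \G from matching at the start of the string). After either alternative, \s+([\w-]+) matches 1 or more whitespace symbols and captures a sequence of 1 or more word characters (letters, digits, underscores) or hyphens into Group 1. The word rules is a kind of boundary for us here: matching can only start there, and each further match must continue right where the previous one ended.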
Java code:
String output = "regulations { qux def } standards none rules { abc-123 456-defghi wxyz_678 } security { enabled }";
String regex = "(?:\\brules\\s+\\{|(?!^)\\G)\\s+([\\w-]+)";
final Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(output);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
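The [\w-]+ class covers the three sample tokens, but the question says a token may contain any character except whitespace. A minimal sketch of one way to widen it, assuming braces never appear inside a token: swap the class for [^\s{}]+ and keep the \G structure unchanged. Excluding the braces matters, because a plain \S+ would also accept the closing } and let the chain of \G matches run on into the following blocks. The class name RulesTokens, the extract helper, and the sample tokens are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RulesTokens {
    // Same \G-anchored structure as the regex above; only the token class differs.
    // [^\s{}]+ accepts any run of characters except whitespace and braces.
    private static final Pattern TOKEN =
            Pattern.compile("(?:\\brules\\s+\\{|(?!^)\\G)\\s+([^\\s{}]+)");

    // Hypothetical helper: collects every Group 1 value into a list.
    static List<String> extract(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            tokens.add(matcher.group(1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String output = "regulations { qux def } standards none "
                + "rules { abc.1 x/y:z wxyz_678 } security { enabled }";
        System.out.println(extract(output)); // prints [abc.1, x/y:z, wxyz_678]
    }
}
```

The chain stops at the closing brace because } is not a valid token character, so nothing from security { enabled } is picked up.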
Here is a 2-step approach: 1) get the substring between rules { and }, 2) split on whitespace.
String output = "regulations { qux def } standards none rules { abc-123 456-defghi wxyz_678 } security { enabled }";
String subst = output.replaceFirst("(?s)^.*\\brules\\s*[{]\\s*([^{}]+)[}].*$", "$1");
String[] res = subst.split("\\s+");
System.out.println(Arrays.toString(res));
The regex is much simpler: it matches everything up to and including rules {, then captures what is inside the {...}, and then matches } and the rest of the string. Replacing the whole match with $1 (a reference to Group 1) leaves only the captured value in the subst variable. Then just split.
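One thing to watch with replaceFirst: when the pattern does not match, it returns the input unchanged, so the split would then run on the whole string. A hedged sketch of the same two steps using Matcher.find() instead, which makes the "no rules block" case explicit. The class name RulesBlock and the extract helper are made up for illustration. Note also that find() picks the first rules block, whereas the greedy ^.* above ends up on the last one.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RulesBlock {
    // Step 1 pattern: capture everything between "rules {" and the next "}".
    private static final Pattern BLOCK =
            Pattern.compile("\\brules\\s*\\{\\s*([^{}]*)\\}");

    // Hypothetical helper: returns the tokens of the first rules block, or an empty list.
    static List<String> extract(String input) {
        Matcher matcher = BLOCK.matcher(input);
        if (!matcher.find()) {
            return Collections.emptyList(); // no rules block present
        }
        String body = matcher.group(1).trim();
        if (body.isEmpty()) {
            return Collections.emptyList(); // "rules { }" has no tokens
        }
        // Step 2: split the captured body on runs of whitespace.
        return Arrays.asList(body.split("\\s+"));
    }

    public static void main(String[] args) {
        String output = "regulations { qux def } standards none "
                + "rules { abc-123 456-defghi wxyz_678 } security { enabled }";
        System.out.println(extract(output)); // prints [abc-123, 456-defghi, wxyz_678]
    }
}
```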
Source: https://stackoverflow.com/questions/34069272/matching-on-substrings-in-delimited-list-using-regex