I'm trying to write a regular expression that looks ahead to make sure there is either a whitespace character OR an opening parenthesis after the words I'm searching for.
Also, I want it to look back and make sure the match is preceded by either a non-word character (\W) or nothing at all (i.e. it is at the beginning of the statement).
So far I have:
"(\\W?)(" + words.toString() + ")(\\s | \\()"
However, this also matches the characters at either end. I want the pattern to match ONLY the word itself, not the characters around it.
I'm using the Java regex flavor.
As you tagged your question yourself, you need lookarounds:
String regex = "(?<=\\W|^)(" + Pattern.quote(words.toString()) + ")(?=\\s|[(])";
(?<=X) means "preceded by X"
(?<!X) means "not preceded by X"
(?=X) means "followed by X"
(?!X) means "not followed by X"
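To make the four lookaround forms concrete, here is a minimal sketch that prints where each pattern matches. The class name `LookaroundDemo`, the helper `starts`, and the sample input are made up for illustration; the word `foo` stands in for whatever `words` contains.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaroundDemo {
    // Hypothetical helper: start offsets of every match of regex in input.
    static List<Integer> starts(String regex, String input) {
        List<Integer> offsets = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            offsets.add(m.start());
        }
        return offsets;
    }

    public static void main(String[] args) {
        // "foo" occurs at offsets 0, 8 (inside "xfoo") and 15.
        String input = "foo(1) xfoo(2) foo bar";

        System.out.println(starts("(?<=x)foo", input));   // preceded by "x": [8]
        System.out.println(starts("(?<!\\w)foo", input)); // not preceded by a word char: [0, 15]
        System.out.println(starts("foo(?=\\()", input));  // followed by "(": [0, 8]
        System.out.println(starts("foo(?!\\()", input));  // not followed by "(": [15]

        // Combined form: non-word char or start before, whitespace or "(" after.
        System.out.println(starts("(?<=\\W|^)foo(?=[\\s(])", input)); // [0, 15]
    }
}
```

Because lookarounds are zero-width, every match here is exactly three characters long: the surrounding characters are checked but never consumed.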
What about the word itself: will it always start with a word character (i.e., one that matches \w)? If so, you can use a word boundary for the leading condition.
"\\b" + theWord + "(?=[\\s(])"
Otherwise, you can use a negative lookbehind:
"(?<!\\w)" + theWord + "(?=[\\s(])"
I'm assuming the word is either quoted like so:
String theWord = Pattern.quote(words.toString());
...or doesn't need to be.
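The difference between the two leading conditions only shows up when the word begins with a non-word character. The sketch below compares them; the class name `LeadingConditionDemo`, the helper `found`, and the sample words `print` and `$total` are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

public class LeadingConditionDemo {
    // Hypothetical helper: builds prefix + quoted word + lookahead and
    // reports whether the pattern is found anywhere in input.
    static boolean found(String prefix, String word, String input) {
        String regex = prefix + Pattern.quote(word) + "(?=[\\s(])";
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Word starts with a word character: \b behaves as intended.
        System.out.println(found("\\b", "print", "print(x)"));   // true
        System.out.println(found("\\b", "print", "sprint(x)"));  // false

        // Word starts with "$": there is no boundary between " " and "$",
        // so \b rejects a legitimate occurrence...
        System.out.println(found("\\b", "$total", "x = $total (1)")); // false
        // ...and accepts one glued to a preceding word character.
        System.out.println(found("\\b", "$total", "a$total (1)"));    // true

        // The negative lookbehind gives the intended answer in both cases.
        System.out.println(found("(?<!\\w)", "$total", "x = $total (1)")); // true
        System.out.println(found("(?<!\\w)", "$total", "a$total (1)"));    // false
    }
}
```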
If you don't want a group to be captured by the match, you can use the non-capturing construct (?:X).
So, in your case:
"(?:\\W?)(" + words.toString() + ")(?:\\s|\\()"
You will then have only two groups: group(0) for the whole match (which still includes the surrounding characters) and group(1) for the word you are looking for.
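As a rough illustration of the non-capturing approach, the sketch below reads both groups from a match. The class name `NonCapturingDemo`, the helper `groups`, and the sample input are hypothetical, and the word is passed through `Pattern.quote` on the assumption that it should be matched literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NonCapturingDemo {
    // Hypothetical helper: returns {whole match, group 1}, or null if no match.
    static String[] groups(String word, String input) {
        String regex = "(?:\\W?)(" + Pattern.quote(word) + ")(?:\\s|\\()";
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(), m.group(1) };
    }

    public static void main(String[] args) {
        String[] g = groups("foo", "call foo(1)");
        System.out.println("[" + g[0] + "]"); // whole match: [ foo(]
        System.out.println("[" + g[1] + "]"); // captured word: [foo]
    }
}
```

Unlike a lookaround, a non-capturing group still consumes the characters it matches, so the word has to be read from `group(1)` rather than from the whole match.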
Source: https://stackoverflow.com/questions/16840452/regex-lookahead-and-lookbehinds-followed-by-this-or-that