I've spent some time on this, but still have no solution. I need a regular expression that can match words with signs in them (like c++) in a string.
I've used /\
The plus sign has a special meaning, so you have to escape it with \. The same rule applies to these characters: \, *, +, ?, |, {, [, (, ), ^, $, ., #, and white space (the last two only matter when the pattern is in free-spacing/extended mode).
UPDATE: the problem was with the \b sequence.
As the others said, your problem isn't the + sign, which you've escaped correctly, but the \b. It is a zero-length assertion that matches a word boundary, that is, a position between a word char (\w) and a non-word char (\W).
There is also another mistake in your regex: you are trying to match an uppercase C against a lowercase c++. To fix this, either change your regex to /\bc\+\+/ or use the i modifier to match case-insensitively: /\bc\+\+/i
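A minimal C# sketch of the case-insensitive variant, assuming .NET's `Regex` class, where `RegexOptions.IgnoreCase` plays the role of the `i` modifier. The class and method names are hypothetical and only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaseDemo
{
    // Hypothetical helper: true if `input` contains "c++" in any letter case,
    // starting at a word boundary. There is no trailing \b on purpose,
    // because '+' is not a word char.
    public static bool MentionsCpp(string input) =>
        Regex.IsMatch(input, @"\bc\+\+", RegexOptions.IgnoreCase);

    public static void Main()
    {
        Console.WriteLine(MentionsCpp("I write C++ daily")); // True (uppercase C)
        Console.WriteLine(MentionsCpp("i write c++ daily")); // True (lowercase c)
        Console.WriteLine(MentionsCpp("abc++"));             // False (no boundary before c)
    }
}
```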
+ is a special character, so you need to escape it:
\bC\+\+(?!\w)
Note that we can't use \b after the final + because + is not a word character.
If you want to match c++ between non-word chars (chars other than letters, digits and underscores), you may use
\bc\+\+\B
Here, \b is a word boundary and \B matches every position that is not a word boundary.
C# syntax:
var pattern = @"\bc\+\+\B";
You must remember that \b and \B are context dependent: \b matches between the start/end of the string and an adjoining word char, or between a word char and a non-word char, while \B matches between the start/end of the string and an adjoining *non-*word char, or between two word chars or two non-word chars.
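The context dependence can be checked with a short C# sketch. The sample strings, and the `.NET` word used to show the leading side, are illustrative assumptions rather than part of the original question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundaryDemo
{
    public static void Main()
    {
        // Trailing side: after the final '+', \B succeeds only when the next
        // char is also a non-word char (or the string ends there).
        Console.WriteLine(Regex.IsMatch("c++ rocks", @"\bc\+\+\B")); // True
        Console.WriteLine(Regex.IsMatch("c++x", @"\bc\+\+\B"));      // False

        // Leading side: a word that starts with a non-word char ('.') has no
        // word boundary between the preceding space and the dot.
        Console.WriteLine(Regex.IsMatch("I use .NET daily", @"\b\.NET\b")); // False
        Console.WriteLine(Regex.IsMatch("I use .NET daily", @"\B\.NET\b")); // True
    }
}
```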
If you build the pattern dynamically, it is hard to rely on the word boundary \b pattern, because the word may start or end with a non-word char.
Use the (?<!\w) and (?!\w) lookarounds instead; they will always match a word that is not immediately preceded or followed by a word char:
var pattern = $@"(?<!\w){Regex.Escape(word)}(?!\w)";
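As a rough sketch, the lookaround approach can be wrapped in a small helper. `ContainsWholeWord` and the sample inputs are hypothetical names chosen for illustration; `Regex.Escape` and `Regex.IsMatch` are the standard .NET calls.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordMatcher
{
    // Hypothetical helper: true if `word` occurs in `input` without a word char
    // (letter, digit, underscore) immediately before or after it.
    public static bool ContainsWholeWord(string input, string word)
    {
        // Regex.Escape turns "c++" into "c\+\+", so the word is matched literally.
        var pattern = $@"(?<!\w){Regex.Escape(word)}(?!\w)";
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        Console.WriteLine(ContainsWholeWord("I like c++ a lot", "c++")); // True
        Console.WriteLine(ContainsWholeWord("abc++ is odd", "c++"));     // False ('b' precedes)
        Console.WriteLine(ContainsWholeWord("c++11 is newer", "c++"));   // False ('1' follows)
        Console.WriteLine(ContainsWholeWord("I use .NET now", ".NET"));  // True
    }
}
```

Unlike \b, these lookarounds do not care whether the word itself starts or ends with a word char, which is why they suit dynamically built patterns.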
If the word boundaries you want to match are whitespace boundaries (i.e. the match is expected only between whitespaces), use
var pattern = $@"(?<!\S){Regex.Escape(word)}(?!\S)";
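A sketch of the whitespace-boundary variant, again with a hypothetical helper name and made-up sample inputs. The difference from the \w-based lookarounds shows up when punctuation touches the word.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceMatcher
{
    // Hypothetical helper: true if `word` occurs in `input` delimited only by
    // whitespace or by the start/end of the string.
    public static bool ContainsToken(string input, string word)
    {
        var pattern = $@"(?<!\S){Regex.Escape(word)}(?!\S)";
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        Console.WriteLine(ContainsToken("I like c++ a lot", "c++"));   // True
        Console.WriteLine(ContainsToken("c++", "c++"));                // True (string start/end)
        Console.WriteLine(ContainsToken("I like c++, really", "c++")); // False (comma follows)
        Console.WriteLine(ContainsToken("(c++)", "c++"));              // False (parentheses touch it)
    }
}
```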
The problem isn't with the plus character, which you've escaped correctly, but with the \b sequence. It indicates a word boundary, which is a point between a word character (a letter, digit or underscore) and something else. Plus isn't a word character, so for \b to match, there would need to be a word character directly after the last plus sign.
\bC\+\+\b matches "Test C++Test" but not "Test C++ Test", for example. Try something like \bC\+\+\s if you expect there to be whitespace after the last plus sign.
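The behaviour described above can be reproduced with a short C# sketch. The last two lines are an addition for comparison: they use an extra sample string (an assumption, not from the question) to show that \s needs an actual whitespace char and so does not match at the end of the string, whereas a (?!\w) lookahead does.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrailingBoundaryDemo
{
    public static void Main()
    {
        // \b after the last '+' needs a word char right after it.
        Console.WriteLine(Regex.IsMatch("Test C++Test", @"\bC\+\+\b"));  // True
        Console.WriteLine(Regex.IsMatch("Test C++ Test", @"\bC\+\+\b")); // False

        // \s requires one whitespace char after the last '+'.
        Console.WriteLine(Regex.IsMatch("Test C++ Test", @"\bC\+\+\s")); // True

        // At the end of the string there is no whitespace to consume.
        Console.WriteLine(Regex.IsMatch("I love C++", @"\bC\+\+\s"));     // False
        Console.WriteLine(Regex.IsMatch("I love C++", @"\bC\+\+(?!\w)")); // True
    }
}
```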