Question
I was reading the question "C#, Regex.Match whole words".
It says that to match a whole word you should use "\bpattern\b". That works for whole words without any special characters, since \b is defined in terms of word characters only.
I need an expression that also matches words containing special characters. My code is as follows:
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        string str = Regex.Escape("Hi temp% dkfsfdf hi");
        string pattern = Regex.Escape("temp%");
        var matches = Regex.Matches(str, "\\b" + pattern + "\\b", RegexOptions.IgnoreCase);
        int count = matches.Count; // expected 1, but this yields 0
    }
}
But it fails because of the %. Is there a workaround for this? The pattern may contain other special characters such as a space, '(', ')', etc.
Answer 1:
If you have non-word characters then you cannot use \b. You can use the following:
@"(?<=^|\s)" + pattern + @"(?=\s|$)"
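As a minimal sketch of how that pattern might be used, the snippet below wraps it in a small helper. The names WholeWordMatcher and CountWholeWord are hypothetical, and "whole word" is assumed here to mean "delimited by whitespace or by the start/end of the string".

```csharp
using System;
using System.Text.RegularExpressions;

public static class WholeWordMatcher
{
    // Hypothetical helper: counts occurrences of `word` that are delimited
    // by whitespace or by the start/end of the input string.
    public static int CountWholeWord(string input, string word)
    {
        // Escape only the search term, never the input text.
        string pattern = @"(?<=^|\s)" + Regex.Escape(word) + @"(?=\s|$)";
        return Regex.Matches(input, pattern, RegexOptions.IgnoreCase).Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountWholeWord("Hi temp% dkfsfdf hi", "temp%"));  // 1
        Console.WriteLine(CountWholeWord("Hi temp%x dkfsfdf hi", "temp%")); // 0
        Console.WriteLine(CountWholeWord("Hi temp% dkfsfdf hi", "hi"));     // 2
        Console.WriteLine(CountWholeWord("call f(x) now", "f(x)"));         // 1
    }
}
```

Because the delimiters are checked with lookarounds, the surrounding whitespace is not consumed, so adjacent occurrences separated by a single space are each counted.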
Edit: As Tim mentioned in the comments, your regex fails precisely because \b does not match at the position between % and the whitespace next to it: both are non-word characters. \b matches only at a boundary between a word character and a non-word character (the start and end of the string count as non-word for this purpose).
See more on word boundaries here.
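The boundary behaviour described above can be checked with a short sketch like the following. The class name WordBoundaryDemo and the helper Count are hypothetical; the inputs are taken from the question, plus one made-up variant ("temp%x") to show when the trailing \b does match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // Hypothetical helper: number of matches of `pattern` in `input`.
    public static int Count(string input, string pattern) =>
        Regex.Matches(input, pattern, RegexOptions.IgnoreCase).Count;

    public static void Main()
    {
        string input = "Hi temp% dkfsfdf hi";

        // '%' and ' ' are both non-word characters, so the trailing \b fails.
        Console.WriteLine(Count(input, @"\btemp%\b"));          // 0

        // Dropping the trailing \b lets the match succeed.
        Console.WriteLine(Count(input, @"\btemp%"));            // 1

        // With a word character after '%', the trailing \b matches.
        Console.WriteLine(Count("Hi temp%x hi", @"\btemp%\b")); // 1
    }
}
```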
Explanation
@"
(?<=       # Assert that the regex below can be matched, with the match ending at this position (positive lookbehind)
           # Match either the regular expression below (attempting the next alternative only if this one fails)
    ^      # Assert position at the beginning of the string
  |        # Or match regular expression number 2 below (the entire group fails if this one fails to match)
    \s     # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
)
temp%      # Match the characters “temp%” literally
(?=        # Assert that the regex below can be matched, starting at this position (positive lookahead)
           # Match either the regular expression below (attempting the next alternative only if this one fails)
    \s     # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
  |        # Or match regular expression number 2 below (the entire group fails if this one fails to match)
    $      # Assert position at the end of the string (or before the line break at the end of the string, if any)
)
"
Answer 2:
If the pattern can contain characters that are special to Regex, run it through Regex.Escape first.
You did that, but do not escape the string you are searching through. That is not needed, and it alters the input: Regex.Escape puts a backslash in front of every space, for example.
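To illustrate the point, here is a small sketch of what Regex.Escape does to an input string versus a pattern. The class name EscapeDemo and the "a.c" example are made up for illustration; the input string is the one from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    public static void Main()
    {
        string input = "Hi temp% dkfsfdf hi";

        // Regex.Escape is meant for patterns; applied to the input it
        // changes the text being searched (whitespace gets a backslash).
        Console.WriteLine(Regex.Escape(input));            // Hi\ temp%\ dkfsfdf\ hi

        // Escaping the pattern makes metacharacters match literally.
        string pattern = Regex.Escape("a.c");
        Console.WriteLine(pattern);                        // a\.c
        Console.WriteLine(Regex.IsMatch("abc", "a.c"));    // True  ('.' matches any character)
        Console.WriteLine(Regex.IsMatch("abc", pattern));  // False ('.' must be a literal dot)
        Console.WriteLine(Regex.IsMatch("a.c", pattern));  // True
    }
}
```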
Answer 3:
output = Regex.Replace(output, "(?<!\w)-\w+", "")
output = Regex.Replace(output, " -"".*?""", "")
Source: https://stackoverflow.com/questions/8256700/regex-expression-to-match-whole-word-with-special-characters-not-working