Can I use variables in pattern in Regex (C#)

*爱你&永不变心* posted on 2021-02-05 06:32:04

Question


I have some HTML text in which I need to replace certain words with links to them. For example, the text contains the word "PHP", and I want to replace it with <a href="glossary.html#php">PHP</a>. There are many words that need to be replaced this way.

My code:

public struct GlossaryReplace
{
    public string word; // here the words, e.g. PHP
    public string link; // here the links to replace, e.g. glossary.html#php
}
public static GlossaryReplace[] Replaces = null;    

IHTMLDocument2 html_doc = webBrowser1.Document.DomDocument as IHTMLDocument2;
string html_content = html_doc.body.outerHTML;

for (int i = 0; i < Replaces.Length; i++)
{
    String substitution = "<a class=\"glossary\" href=\"" + Replaces[i].link + "\">" + Replaces[i].word + "</a>";
    html_content = Regex.Replace(html_content, @"\b" + Replaces[i].word + "\b", substitution);
}
html_doc.body.innerHTML = html_content;

The trouble is that this does not work :( However, this code,

html_content = Regex.Replace(html_content, @"\bPHP\b", "some replacement");

works fine! I can't find my mistake!


Answer 1:


You forgot an @ here:

@"\b" + Replaces[i].word + "\b"

Should be:

@"\b" + Replaces[i].word + @"\b"

I'd also recommend that you use an HTML parser if you are modifying HTML. HTML Agility Pack is a useful library for this purpose.




Answer 2:


The @ prefix applies only to the string literal that immediately follows it, so when you concatenate strings you may need to put it on each literal.

Change this:

html_content = Regex.Replace(html_content, @"\b" + Replaces[i].word + "\b", substitution);

to:

html_content = Regex.Replace(html_content, @"\b" + Replaces[i].word + @"\b", substitution);

In a regular expression, \b means a word boundary, but in a regular string literal it means a backspace character (ASCII 8). You get a compiler error if you use an escape code that doesn't exist in a string (e.g. \s), but not in this case, because \b is a valid escape both in strings and in regular expressions.
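To make the difference concrete, here is a minimal sketch that compares the two literals and the two resulting patterns. The class name VerbatimDemo, the helper names, and the sample sentence are made up for illustration; only the patterns themselves come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VerbatimDemo
{
    // Pattern as written in the question: the trailing "\b" is a regular
    // string literal, so it becomes a single backspace character (ASCII 8).
    public static string BrokenPattern(string word) => @"\b" + word + "\b";

    // Pattern with both literals verbatim: the regex engine sees \b twice.
    public static string FixedPattern(string word) => @"\b" + word + @"\b";

    public static void Main()
    {
        Console.WriteLine("\b".Length);   // 1 (one backspace character)
        Console.WriteLine((int)"\b"[0]);  // 8
        Console.WriteLine(@"\b".Length);  // 2 (a backslash and the letter b)

        string text = "I like PHP a lot."; // hypothetical sample text

        // The broken pattern demands a literal backspace after "PHP".
        Console.WriteLine(Regex.IsMatch(text, BrokenPattern("PHP"))); // False
        Console.WriteLine(Regex.IsMatch(text, FixedPattern("PHP")));  // True
    }
}
```

The broken pattern is still a valid regular expression, which is why nothing fails loudly: it simply never matches ordinary text.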

As a side note, a useful method when building regular expression patterns dynamically is Regex.Escape. It escapes the characters in a string that have a special meaning in a regular expression, so @"\b" + Regex.Escape(Replaces[i].word) + @"\b" makes the pattern work even if the word contains such characters.
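As a rough illustration of why escaping matters, the sketch below uses a glossary word containing a dot. The GlossaryLinker class, its Link helper, the word "ASP.NET", and the link target are hypothetical and not part of the question's code. The sketch also assumes the word and link contain no $ characters, since $ has a special meaning in a Regex.Replace replacement string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GlossaryLinker
{
    // Hypothetical helper: wraps whole-word occurrences of `word` in a glossary link.
    // Assumes `word` and `link` contain no '$', which is special in replacement strings.
    public static string Link(string html, string word, string link)
    {
        // Regex.Escape turns "ASP.NET" into "ASP\.NET", so the dot matches only a dot.
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        string substitution = "<a class=\"glossary\" href=\"" + link + "\">" + word + "</a>";
        return Regex.Replace(html, pattern, substitution);
    }

    public static void Main()
    {
        // Unescaped, the dot matches any character, so "ASPxNET" matches too.
        Console.WriteLine(Regex.IsMatch("ASPxNET", @"\bASP.NET\b")); // True
        Console.WriteLine(Regex.IsMatch("ASPxNET", @"\b" + Regex.Escape("ASP.NET") + @"\b")); // False

        Console.WriteLine(Link("ASPxNET is not ASP.NET", "ASP.NET", "glossary.html#aspnet"));
        // ASPxNET is not <a class="glossary" href="glossary.html#aspnet">ASP.NET</a>
    }
}
```

The same helper could be called once per entry of the Replaces array in place of the concatenation inside the question's loop.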



Source: https://stackoverflow.com/questions/4099353/can-i-use-variables-in-pattern-in-regex-c
