The fragment below doesn't work for me.
fragment = Regex.Replace(fragment, "", String.Empty, RegexOptions.Multiline);
This is the top Google result for stripping comments in C#, so here's my HtmlAgilityPack code for doing it.
// Requires: using System.Linq; using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument
{
    OptionFixNestedTags = true,
    OptionOutputAsXml = true
};
doc.LoadHtml(str);

// Strip comments from the document.
if (doc.DocumentNode != null)
{
    // SelectNodes returns null (not an empty collection) when nothing matches.
    HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//comment()");
    if (nodes != null)
    {
        // Skip the doctype, which HtmlAgilityPack also reports as a comment node.
        foreach (HtmlNode node in from cmt in nodes
                                  where (cmt != null
                                         && cmt.InnerText != null
                                         && !cmt.InnerText.ToUpper().StartsWith("DOCTYPE"))
                                        && cmt.ParentNode != null
                                  select cmt)
        {
            node.ParentNode.RemoveChild(node);
        }
    }
}
This strips comments correctly and leaves the doctype alone, which HtmlAgilityPack treats as a comment.
Regex does work under controlled conditions, but if you're processing HTML from the wild web, I'd recommend using HtmlAgilityPack. The HTML that is out there is very unpredictable, and regex will break.
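To show the kind of breakage I mean, here's a minimal sketch. The pattern and the helper name StripCommentsNaive are my own assumptions for illustration, not the pattern from the fragment at the top. It handles a plain comment fine, but it also eats text inside a script string that only looks like a comment.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexCommentDemo
{
    // Hypothetical helper: a naive comment-stripping pattern (an assumption,
    // not the original poster's regex). Singleline lets '.' match newlines.
    public static string StripCommentsNaive(string html)
    {
        return Regex.Replace(html, "<!--.*?-->", string.Empty, RegexOptions.Singleline);
    }

    public static void Main()
    {
        // Controlled input: the comment is removed as expected.
        Console.WriteLine(StripCommentsNaive("<p>a</p><!-- note --><p>b</p>"));
        // prints: <p>a</p><p>b</p>

        // Wild input: the text inside the script string is not a comment,
        // but the regex cannot tell and removes it anyway.
        Console.WriteLine(StripCommentsNaive("<script>var s = \"<!-- keep me -->\";</script>"));
        // prints: <script>var s = "";</script>
    }
}
```

A regex only sees characters, not document structure, so it can't tell a real comment from comment-like text in a script or elsewhere. That is the gap a parser is meant to close.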