How do I use a C# regular expression to replace or remove all HTML tags, including the angle brackets? Can someone please help me with the code?
The question is too broad to be answered definitively. Are you talking about removing all tags from a real-world HTML document, like a web page? If so, you would have to:
That's just off the top of my head--I'm sure there's more. Once you've done all that, you'll end up with words, sentences and paragraphs run together in some places, and big chunks of useless whitespace in others.
But, assuming you're working with just a fragment and you can get away with simply removing all tags, here's the regex I would use:
@"(?></?\w+)(?>(?:[^>'""]+|'[^']*'|""[^""]*"")*)>"
Matching single- and double-quoted strings in their own alternatives is sufficient to deal with the problem of angle brackets in attribute values. I don't see any need to explicitly match the attribute names and other stuff inside the tag, like the regex in Ryan's answer does; the first alternative handles all of that.
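For what it's worth, here is a minimal sketch of how that pattern could be applied with Regex.Replace. The class and method names (TagStripper, StripTags) are made up for illustration and are not part of any library; the sketch also assumes the input is a small fragment, as discussed above, not a full document.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the class and method names are illustrative only.
public static class TagStripper
{
    // The tag-matching pattern as a C# verbatim string
    // (a doubled "" stands for one literal double quote).
    private static readonly Regex TagPattern = new Regex(
        @"(?></?\w+)(?>(?:[^>'""]+|'[^']*'|""[^""]*"")*)>");

    // Removes every substring the pattern recognizes as a tag,
    // including the angle brackets, and leaves the rest untouched.
    public static string StripTags(string fragment)
    {
        if (fragment == null) throw new ArgumentNullException(nameof(fragment));
        return TagPattern.Replace(fragment, string.Empty);
    }
}
```

With this sketch, TagStripper.StripTags("<a href=\"x\" title='1 > 0'>link</a> text") should return "link text": the > inside the single-quoted attribute value is consumed by the quoted-string alternative, so it does not end the tag early.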
In case you're wondering about those (?>...) constructs, they're atomic groups. They make the regex a little more efficient, but more importantly, they prevent runaway backtracking, which is something you should always watch out for when you mix alternation and nested quantifiers as I've done. I don't really think that would be a problem here, but I know if I don't mention it, someone else will. ;-)
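To make the "no backtracking" part concrete, here is a small sketch using two toy patterns of my own (they are not taken from the tag regex; the class and method names are hypothetical as well). Both patterns try to match one or more a's followed by "ab"; the only difference is whether the a+ is wrapped in an atomic group.

```csharp
using System.Text.RegularExpressions;

// Hypothetical demo class; the patterns are toy examples for illustration.
public static class AtomicGroupDemo
{
    // Ordinary group: a+ first grabs every 'a', then gives one back
    // so that the trailing "ab" can still match.
    public static bool MatchesWithBacktracking(string input) =>
        Regex.IsMatch(input, @"a+ab");

    // Atomic group: once a+ has grabbed every 'a', the regex engine
    // may not give any of them back, so "ab" can never match after it.
    public static bool MatchesWithAtomicGroup(string input) =>
        Regex.IsMatch(input, @"(?>a+)ab");
}
```

For the input "aaab", MatchesWithBacktracking returns true and MatchesWithAtomicGroup returns false. In the tag regex the atomic groups are not there to change what matches; they just stop the engine from retrying other ways of splitting up the tag's contents when the closing > is missing.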
This regex isn't perfect, of course, but it's probably as good as you'll ever need.