I am writing an application that crawls a group of my web pages. Rather than take the entire source code of the page, I'd like to take all of the content and store that and
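Please, please do not parse HTML yourself! A standard regex cannot parse arbitrary HTML: tags nest to any depth and real pages are full of unclosed or mismatched tags, which a regular expression has no way of tracking.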
There are plenty of free libraries for this. In the .NET world, one of the best is the HTML Agility Pack.
HTML Agility Pack also copes with malformed documents, which is something that a regex or a strict parser (an XML parser, for example) will almost never do.
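
As a rough illustration of pulling just the text content out of a page, something along these lines should work. It is only a sketch: it assumes the HtmlAgilityPack NuGet package is referenced, and the class name and the sample markup are made up for the example (the markup is deliberately malformed).

```csharp
// Sketch only: assumes the HtmlAgilityPack NuGet package is installed.
using System;
using HtmlAgilityPack;

class ExtractText // hypothetical name, for illustration
{
    static void Main()
    {
        // Made-up sample markup with unclosed <p> and <b> tags.
        string html = "<html><body><p>First paragraph<p>Second <b>bold text</body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Select text nodes, skipping the contents of <script> and <style>.
        // SelectNodes returns null when nothing matches, so check for that.
        var textNodes = doc.DocumentNode.SelectNodes(
            "//text()[not(parent::script) and not(parent::style)]");
        if (textNodes == null)
        {
            return;
        }

        foreach (var node in textNodes)
        {
            // Decode entities such as &amp; and drop whitespace-only nodes.
            string text = HtmlEntity.DeEntitize(node.InnerText).Trim();
            if (text.Length > 0)
            {
                Console.WriteLine(text);
            }
        }
    }
}
```

For a crawler you would load each fetched page the same way and store the collected text instead of printing it. If you only want part of the page, a narrower XPath query passed to `SelectNodes` is probably a better fit than taking every text node.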