I am writing an application that crawls a group of my web pages. Rather than take the entire source code of the page, I'd like to take all of the content and store that and
The function below removes all HTML tags, scripts, and CSS styles from an HTML string and converts it to plain text.
private string GetPlainTextFromHtml(string htmlString) { string htmlTagPattern = "<.*?>"; var regexCss = new Regex("(\\
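For reference, here is a minimal, self-contained sketch of the same idea: strip script and style blocks first, then comments, then any remaining tags, decode HTML entities, and collapse whitespace. The class name HtmlTextSketch and every regular expression in it are assumptions made for illustration; they are not the patterns from the listing above. Regex-based stripping is only approximate and can be fooled by malformed markup or by a ">" inside an attribute value, so treat this as a starting point for pages you control rather than a general-purpose HTML-to-text converter.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlTextSketch
{
    // Hypothetical patterns, chosen for this sketch only.
    // <script>...</script> and <style>...</style>, including their contents.
    private static readonly Regex ScriptOrStyle = new Regex(
        @"<(script|style)\b[^>]*>.*?</\1\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // HTML comments, which may span several lines.
    private static readonly Regex Comment = new Regex(
        @"<!--.*?-->", RegexOptions.Singleline);

    // Any remaining tag.
    private static readonly Regex Tag = new Regex(@"<[^>]+>");

    // Runs of whitespace, collapsed to a single space at the end.
    private static readonly Regex Whitespace = new Regex(@"\s+");

    public static string GetPlainTextFromHtml(string htmlString)
    {
        if (string.IsNullOrEmpty(htmlString))
        {
            return string.Empty;
        }

        // Remove script/style blocks before the generic tag pass,
        // otherwise their contents would survive as "text".
        string text = ScriptOrStyle.Replace(htmlString, " ");
        text = Comment.Replace(text, " ");

        // Replace tags with a space so adjacent block elements do not run together.
        text = Tag.Replace(text, " ");

        // Turn entities such as &amp; and &nbsp; into their characters.
        text = WebUtility.HtmlDecode(text);

        return Whitespace.Replace(text, " ").Trim();
    }

    public static void Main()
    {
        string html =
            "<html><head><style>p { color: red; }</style></head>" +
            "<body><p>Hello &amp; <b>welcome</b></p>" +
            "<script>alert('x');</script></body></html>";

        Console.WriteLine(GetPlainTextFromHtml(html)); // prints "Hello & welcome"
    }
}
```

Replacing each tag with a space, rather than with an empty string, is a deliberate trade-off: it keeps the text of neighbouring elements apart, at the cost of splitting a word that has inline markup in the middle of it.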