https://github.com/jamietre/CsQuery
have you tried CsQuery? Though not being maintained actively - it's still my favorite for parsing HTML to Text. Here's a one liner of how simple it is to get the Text from HTML.
var text = CQ.CreateDocument(htmlText).Text();
Here's a complete console application:
using System;
using CsQuery;
public class Program
{
public static void Main()
{
var html = "Hello World
some text inside h1 tag under p tag
";
var text = CQ.CreateDocument(html).Text();
Console.WriteLine(text); // Output: Hello World some text inside h1 tag under p tag
}
}
I understand that OP has asked for HtmlAgilityPack only but CsQuery is another unpopular and one of the best solutions I've found and wanted to share if someone finds this helpful. Cheers!