HTML agility pack - removing unwanted tags without removing content?

忘了有多久 2020-11-29 03:00

I've seen a few related questions here, but they don't describe exactly the problem I am facing.

I want to use the HTML Agility Pack to remove unwa

5 answers
  •  春和景丽
    2020-11-29 03:27

    How to recursively remove a given list of unwanted HTML tags from an HTML string

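    I took @mathias's answer and improved his extension method so that you can supply the tags to strip as a List<string> (e.g. {"a","p","hr"}). I also fixed the logic so that it handles nested tags correctly: each unwanted tag is removed, but its children are moved up to take its place, so the content is kept.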

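    // Requires: using System; using System.Collections.Generic; using System.Linq; using HtmlAgilityPack;
    // As an extension method, this must be declared inside a static class.
    public static string RemoveUnwantedHtmlTags(this string html, List<string> unwantedTags)
    {
        if (String.IsNullOrEmpty(html))
        {
            return html;
        }

        var document = new HtmlDocument();
        document.LoadHtml(html);

        // Top-level element and text nodes of the parsed fragment.
        HtmlNodeCollection tryGetNodes = document.DocumentNode.SelectNodes("./*|./text()");

        if (tryGetNodes == null || !tryGetNodes.Any())
        {
            return html;
        }

        // Breadth-first walk over the whole tree.
        var nodes = new Queue<HtmlNode>(tryGetNodes);

        while (nodes.Count > 0)
        {
            var node = nodes.Dequeue();
            var parentNode = node.ParentNode;

            var childNodes = node.SelectNodes("./*|./text()");

            if (childNodes != null)
            {
                // Queue the children so nested unwanted tags are visited too.
                foreach (var child in childNodes)
                {
                    nodes.Enqueue(child);
                }
            }

            if (unwantedTags.Contains(node.Name))
            {
                if (childNodes != null)
                {
                    // Move the children in front of the unwanted node to keep its content.
                    foreach (var child in childNodes)
                    {
                        parentNode.InsertBefore(child, node);
                    }
                }

                // Drop the now-empty unwanted tag itself.
                parentNode.RemoveChild(node);
            }
        }

        return document.DocumentNode.InnerHtml;
    }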
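
For anyone who wants to try the idea quickly, here is a minimal, self-contained sketch of the same "unwrap" technique together with a usage example. It is not the code from the answer above: it is a compact variant that snapshots the matching nodes first instead of using a queue. The names `Demo` and `Unwrap` are hypothetical, and it assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class Demo
{
    // Hypothetical helper: removes every tag named in unwantedTags
    // while keeping the content that was inside it.
    public static string Unwrap(string html, ISet<string> unwantedTags)
    {
        if (string.IsNullOrEmpty(html))
        {
            return html;
        }

        var document = new HtmlDocument();
        document.LoadHtml(html);

        // Snapshot the matches so the tree can be modified while looping.
        var matches = document.DocumentNode
            .Descendants()
            .Where(n => unwantedTags.Contains(n.Name))
            .ToList();

        foreach (var node in matches)
        {
            var parent = node.ParentNode;

            // Copy the child list before moving nodes out of it.
            foreach (var child in node.ChildNodes.ToList())
            {
                parent.InsertBefore(child, node);
            }

            parent.RemoveChild(node);
        }

        return document.DocumentNode.InnerHtml;
    }

    public static void Main()
    {
        // Tag names are given in lower case, as in the answer's example list.
        var tags = new HashSet<string> { "a", "b" };
        string input = "<p>Hello <a href=\"#\"><b>big</b> world</a></p>";

        // The <a> and <b> wrappers are stripped; their text stays inside <p>.
        Console.WriteLine(Unwrap(input, tags));
    }
}
```

Taking the snapshot with `ToList()` before editing is the design choice that matters here: modifying a node collection while enumerating it directly would be unsafe.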
    
