Remove HTML node from HtmlDocument: HtmlAgilityPack

闹比i 2020-12-16 18:17

In my code, I want to remove the img tag which doesn't have a src value. I am using HtmlAgilityPack's HtmlDocument object. I am finding the img whic

4 answers
  •  星月不相逢
    2020-12-16 18:36

    It seems you're modifying the collection while enumerating it by calling the HtmlNode.RemoveChild method.

    To fix this, copy your nodes to a separate list or array first, e.g. by calling Enumerable.ToList() or Enumerable.ToArray(), and iterate over that copy instead.

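    // Requires: using System.Linq; using System.Collections.Generic; using HtmlAgilityPack;
    // SelectNodes returns null when nothing matches, so fall back to an empty list.
    var nodesToRemove = doc.DocumentNode
        .SelectNodes("//img[not(string-length(normalize-space(@src)))]")
        ?.ToList() ?? new List<HtmlNode>();

    // Iterate over the snapshot, not the live collection, while removing nodes.
    foreach (var node in nodesToRemove)
        node.Remove();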

    If I'm right, the problem will disappear.
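
    To see why the copy matters, here is a minimal sketch that uses a plain List<string> rather than HtmlAgilityPack. The list contents and the SnapshotDemo class are hypothetical stand-ins: blank strings play the role of img tags without a src. The first loop removes items from the list it is enumerating and fails, while the second loop enumerates a ToList() snapshot and succeeds.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    class SnapshotDemo
    {
        static void Main()
        {
            // Hypothetical data: blank entries stand in for <img> tags with no src.
            var sources = new List<string> { "a.png", "", "b.png", "   " };

            try
            {
                // Removing from the list while enumerating it invalidates the enumerator.
                foreach (var s in sources)
                    if (string.IsNullOrWhiteSpace(s))
                        sources.Remove(s);
            }
            catch (InvalidOperationException)
            {
                Console.WriteLine("Collection was modified during enumeration.");
            }

            // Enumerate a snapshot instead; the original list can then be changed safely.
            foreach (var s in sources.ToList())
                if (string.IsNullOrWhiteSpace(s))
                    sources.Remove(s);

            Console.WriteLine(string.Join(", ", sources));
        }
    }
    ```

    The snapshot costs one extra allocation, which is usually negligible next to parsing the document.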
