Find comment or text nodes in a document fragment

Submitted by 别说谁变了你拦得住时间么 on 2019-12-11 04:16:44

Question


I have to clean up a Nokogiri::HTML::DocumentFragment document (remove comment nodes and text nodes which contain whitespace only). Here's an example:

require 'nokogiri'

html = "<p>paragraph</p><!-- comment --><p>paragraph</p>   <p>paragraph</p>"
doc = Nokogiri::HTML::DocumentFragment.parse html

The document fragment looks as you'd expect:

#(DocumentFragment:0x3fc65f9f5870 {
  name = "#document-fragment",
  children = [
    #(Element:0x3fc65f9f5064 { name = "p", children = [ #(Text "paragraph")] }),
    #(Comment " comment "),
    #(Element:0x3fc65f9f4f60 { name = "p", children = [ #(Text "paragraph")] }),
    #(Text "   "),
    #(Element:0x3fc65f9f4e48 { name = "p", children = [ #(Text "paragraph")] })
  ]
})

How can I find all comment or all text nodes in this document fragment?

The following queries don't work, because doc is a document fragment rather than a full document:

doc.search('//text()')
doc.search('//comment()')

Answer 1:


Figured it out:

doc.search('.//text()')
doc.search('.//comment()')


Source: https://stackoverflow.com/questions/40787659/find-comment-or-text-nodes-in-a-document-fragment
