Get text directly inside a tag in Nokogiri

Submitted by 痴心易碎 on 2019-11-30 08:15:25

To get the text nodes that are direct children of an element, without the text of any nested elements, you can use XPath like so:

doc.xpath('//dt/text()')

Or if you wish to use search:

doc.search('dt').xpath('text()')

Using XPath to select exactly what you want (as suggested by @Casper) is the right answer:

def own_text(node)
  # Find the content of all child text nodes and join them together
  node.xpath('text()').text
end

Here's an alternative, fun answer :)

def own_text(node)
  node.clone(1).tap{ |copy| copy.element_children.remove }.text
end

Seen in action:

require 'nokogiri'
root = Nokogiri.XML('<r>hi <a>BOO</a> there</r>').root
puts root.text       #=> hi BOO there
puts own_text(root)  #=> hi  there

The dt element has two children, and the text you want is the last of them, so you can also access it with:

doc.search("dt").children.last.text