I would like to parse a table using Nokogiri. I'm doing it this way:
def parse_table_nokogiri(html)
  doc = Nokogiri::HTML(html)
  doc.search('table &
Use:
td//text()[normalize-space()]
This selects every text node that is a descendant of any td child of the current node (the tr already selected in your code) and that contains at least one non-white-space character.
Or, if you want to select all text-node descendants, regardless of whether they are white-space-only or not:
td//text()
UPDATE:
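The OP has signaled in a comment that he is getting an unwanted td whose content is just a '&nbsp;' (a non-breaking space, U+00A0).
This happens because normalize-space() strips only space, tab, carriage return and line feed; a non-breaking space is not white space as far as XPath is concerned. To also exclude the text of tds whose content consists only of (one or more) non-breaking spaces, use the expression below, where the character between the first pair of quotes is a literal non-breaking space (U+00A0), not an ordinary space:
td//text()[translate(normalize-space(), ' ', '')]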