Strip text from HTML document using Ruby
Question: There are lots of examples of how to strip HTML tags from a document using Ruby; Hpricot and Nokogiri both have inner_text methods that remove all the HTML for you easily and quickly. What I am trying to do is the opposite: remove all the text from an HTML document, leaving just the tags and their attributes. I considered looping through the document and setting inner_html to nil, but then you'd really have to do this in reverse, as the first element (root) has an inner_html of the entire rest of the