How to add non-escaped ampersands to HTML with Nokogiri::XML::Builder

*爱你&永不变心* posted on 2019-12-02 00:36:19

Question


I would like to add things like bullet points "•" to HTML using the XML Builder in Nokogiri, but everything is being escaped. How do I prevent it from being escaped?

I would like the result to be:

<span>&#8226;</span> 

rather than:

<span>&amp;#8226;</span> 

I'm just doing this:

xml.span { 
  xml.text "&#8226;\ " 
}

What am I missing?


Answer 1:


If you define

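  require 'nokogiri'

  class Nokogiri::XML::Builder
    # Insert the character with the given decimal code point by parsing
    # a numeric character reference and grafting the resulting text node.
    def entity(code)
      doc = Nokogiri::XML("<?xml version='1.0'?><root>&##{code};</root>")
      insert(doc.root.children.first)
    end
  end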

then this

  builder = Nokogiri::XML::Builder.new do |xml|
    xml.span {
      xml.text "I can has "
      xml.entity 8226
      xml.text " entity?"
    }
  end
  puts builder.to_xml

yields

<?xml version="1.0"?>
<span>I can has &#x2022; entity?</span>

 

PS: this is only a workaround. For a clean solution, please refer to the libxml2 documentation (Nokogiri is built on libxml2). However, even the libxml2 folks admit that handling entities can be quite, err, cumbersome sometimes.




Answer 2:


When you're setting the text of an element, you really are setting text, not HTML source. < and & don't have any special meaning in plain text.

So just type a bullet: '•'. Of course, your source code and your XML file have to use the same encoding for that to come out right. If your XML file is UTF-8 but your source code isn't, you'd probably have to write "\xe2\x80\xa2", which is the UTF-8 byte sequence for the bullet character as a string literal. Note the double quotes: Ruby does not interpret \x escapes in single-quoted strings.
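To make the byte sequence above concrete, here is a small sketch using only core Ruby (no Nokogiri). It assumes Ruby 1.9 or later, where strings carry an encoding; the variable names are illustrative and not part of the original answer.

```ruby
# Build the bullet from its three UTF-8 bytes and tag the result as UTF-8.
bullet_from_bytes = [0xe2, 0x80, 0xa2].pack("C*").force_encoding("UTF-8")

# The same character written as a Unicode escape (double quotes required).
bullet_from_escape = "\u2022"

# The same character built from its decimal code point, 8226.
bullet_from_codepoint = [8226].pack("U")

# In single quotes the backslashes stay literal, so this is NOT a bullet.
not_a_bullet = '\xe2\x80\xa2'

puts bullet_from_bytes == bullet_from_escape      # true
puts bullet_from_codepoint == bullet_from_escape  # true
puts bullet_from_escape.bytes.map { |b| format("%02x", b) }.join(" ")  # e2 80 a2
puts not_a_bullet.length                          # 12
```

All three spellings produce the same one-character string, so whichever is most readable in your source file can be passed to xml.text.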

(In general non-ASCII characters in Ruby 1.8 are tricky. The byte-based interfaces don't mesh too well with XML's world of all-text-is-Unicode.)



Source: https://stackoverflow.com/questions/1811856/how-to-add-non-escaped-ampersands-to-html-with-nokogirixmlbuilder
