Strip style attributes with nokogiri

后悔当初 2020-12-14 03:27

I'm scraping an HTML page with Nokogiri and I want to strip out all style attributes.
How can I achieve this? (I'm not using Rails, so I can't use its sanitize method.)

3 Answers
  •  误落风尘
    2020-12-14 03:58

    I tried the answer from Phrogz but could not get it to work (I was using a document fragment, though I'd have thought it would work the same).

    The "//" at the start didn't seem to check all nodes as I would expect. In the end I did something a bit more long-winded, but it worked. So, for the record, in case anyone else has the same trouble, here is my solution (dirty though it is). It strips align attributes; substitute style or any other attribute name as needed:

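    require 'nokogiri'

    doc = Nokogiri::HTML::Document.new
    body_dom = doc.fragment( my_html )  # my_html: the HTML string to clean

    # strip out any attributes we don't want
    # the '*[@align]' branch is there to catch the fragment's top-level nodes
    body_dom.xpath( './/*[@align]|*[@align]' ).each do |tag|
        tag.remove_attribute( 'align' )
    end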
    
