Strip style attributes with nokogiri

后悔当初 2020-12-14 03:27

I'm scraping an HTML page with Nokogiri and I want to strip out all style attributes.
How can I achieve this? (I'm not using Rails, so I can't use its sanitize method.)

3 answers
  •  慢半拍i (original poster)
    2020-12-14 04:08

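    require 'nokogiri'

    # Sample input: an element with class "post" and an inline style
    html = '<p class="post" style="font-size: 3em;">bla bla</p>'
    doc = Nokogiri::HTML(html)
    doc.xpath('//@style').remove   # remove every style attribute in the document
    puts doc.css('.post')
    #=> <p class="post">bla bla</p>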

    Edited to show that you can just call NodeSet#remove instead of having to use .each(&:remove).

    Note that if you have a DocumentFragment instead of a Document, Nokogiri has a longstanding bug where searching from a fragment does not work as you would expect. The workaround is to use:

    doc.xpath('@style|.//@style').remove
    
