How can I remove all attributes of specific elements throughout the document? I'm trying something like this:
from bs4 import UnicodeDammit
from lxml import html
content = open("source.html").read()
document = UnicodeDammit(content, is_html=True)
parser = html.HTMLParser(encoding=document.original_encoding)
root = html.document_fromstring(content, parser=parser)
for attr in root.xpath('.//table/@*'):
    del attr.attrib
Here I'm trying to delete all attributes from all tables in the document using XPath, but it doesn't work.
This is one possible way, assuming that you want to remove all attributes of a certain element, say table:
for table in root.xpath('//table[@*]'):
    table.attrib.clear()
The code above loops through every table element that has at least one attribute, then calls the clear() method of the element's attrib property, since that property behaves like a Python dictionary.
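To see the difference between the two XPath expressions, here is a small self-contained sketch. The SAMPLE markup and the variable names are made up for illustration and are not from the question. It assumes that lxml returns the results of an attribute query such as //table/@* as string values rather than elements, which would explain why there is nothing to delete from them in the original attempt.

```python
from lxml import html

# Hypothetical sample markup, used only for this illustration.
SAMPLE = """
<html><body>
<table id="t1" class="grid" border="1"><tr><td align="left">a</td></tr></table>
<table><tr><td>b</td></tr></table>
</body></html>
"""

root = html.document_fromstring(SAMPLE)

# '//table/@*' selects the attribute values themselves; lxml hands them
# back as string objects, not as elements with an .attrib mapping.
values = root.xpath('//table/@*')

# Selecting the table elements instead gives access to .attrib,
# which can be emptied in place.
for table in root.xpath('//table[@*]'):
    table.attrib.clear()

# After clearing, no table carries attributes any more,
# while attributes on other elements (here: td) are untouched.
remaining = root.xpath('//table/@*')
td_align = root.xpath('//td/@align')
```

The [@*] predicate only skips tables that have no attributes to begin with; plain //table would give the same end result, since clearing an empty attrib is harmless.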
Source: https://stackoverflow.com/questions/34285348/how-to-remove-all-attributes-from-element