Adding attributes to existing elements, removing elements, etc with lxml

Submitted by 你说的曾经没有我的故事 on 2019-12-21 04:56:37

Question


I parse the XML using:

from lxml import etree

tree = etree.parse('test.xml', etree.XMLParser())

Now I want to work on the parsed XML. I'm having trouble removing elements, both namespaced ones and elements in general. For example, given

<rdf:description><dc:title>Example</dc:title></rdf:description>

I want to remove that entire element, including everything inside its tags. I also want to add attributes to existing elements. The methods I need are in the Element class, but I have no idea how to use them with the ElementTree object that parse() returns. Any pointers would be appreciated, thanks.


Answer 1:


You can get the root element with this call: root = tree.getroot()

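Using that root element, you can use findall() to collect the elements that match your criteria and then remove them. A bare tag name such as "title" matches only direct children of root, so use ".//title" to search the whole tree, and call remove() on each element's actual parent, because remove() only works on direct children. For a namespaced element, the tag has to be written as "{namespace-uri}title" or looked up with a prefix map passed as the namespaces argument.

deleteThese = root.findall(".//title")
for element in deleteThese:
    element.getparent().remove(element)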

Finally, you can see what your new tree looks like with this (tostring() returns bytes on Python 3, hence the decode()): print(etree.tostring(root, pretty_print=True).decode())

Here is some info about how find/findall work: http://infohost.nmt.edu/tcc/help/pubs/pylxml/class-ElementTree.html#ElementTree-find
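
For the namespaced element from the question, the same find-and-remove approach might look like the sketch below. The sample document and both namespace URIs are assumptions (the usual RDF and Dublin Core URIs), since the question does not show the namespace declarations; substitute whatever the real file declares.

```python
from lxml import etree

# Hypothetical sample document; the namespace URIs are assumed, not taken
# from the question, which only shows the prefixed tags.
xml = b"""<root xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:description><dc:title>Example</dc:title></rdf:description>
  <keep>untouched</keep>
</root>"""

root = etree.fromstring(xml)

# Prefix-to-URI map used only for the search; the URIs must match the document.
ns = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# ".//" searches all descendants, not just direct children of root.
for element in root.findall(".//rdf:description", namespaces=ns):
    # remove() has to be called on the element's own parent.
    element.getparent().remove(element)

remaining = [child.tag for child in root]
print(etree.tostring(root, pretty_print=True).decode())
```

Removing the rdf:description element also removes the dc:title nested inside it, so no second pass is needed for the children.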

To add an attribute to an element, try something like this:

root.attrib['myNewAttribute'] = 'hello world'
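
The same works for any element, not only the root. As a rough sketch, assuming a small sample document and made-up attribute names (lang, source, and the http://example.com/ns namespace are all hypothetical), you could locate a nested element first and then set, read, and delete attributes on it:

```python
from lxml import etree

root = etree.fromstring(
    "<Root><Description><Title>foo</Title></Description></Root>"
)

# Locate a nested element first, then work on its attributes.
description = root.find(".//Description")
description.set("lang", "en")             # hypothetical attribute name
description.attrib["source"] = "manual"   # dict-style access does the same job

# A namespaced attribute is written as "{uri}name"; the URI here is made up.
EXAMPLE_NS = "http://example.com/ns"
description.set("{%s}status" % EXAMPLE_NS, "draft")

# Attributes can be removed again like dictionary keys.
del description.attrib["source"]

print(etree.tostring(root, pretty_print=True).decode())
```

set() and attrib[...] are interchangeable here; get() reads a value back and returns None when the attribute is missing.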



Answer 2:


The remove method should do what you want:

>>> from lxml import etree
>>> from io import StringIO

>>> s = '<Root><Description><Title>foo</Title></Description></Root>'
>>> tree = etree.parse(StringIO(s))

>>> print(etree.tostring(tree.getroot(), encoding='unicode'))
<Root><Description><Title>foo</Title></Description></Root>

>>> title = tree.find('.//Title')
>>> title.getparent().remove(title)
>>> etree.tostring(tree.getroot())
b'<Root><Description/></Root>'

>>> print(etree.tostring(tree.getroot(), encoding='unicode'))
<Root><Description/></Root>
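
One thing to watch with getparent().remove(): in lxml the text that follows an element (its tail) belongs to that element, so it disappears together with it. If that text should survive, a small helper along these lines may help. The name remove_keep_tail is made up for this sketch and is not part of lxml.

```python
from lxml import etree


def remove_keep_tail(element):
    """Remove element from its parent but keep the text that follows it.

    Hypothetical helper, not an lxml API.
    """
    parent = element.getparent()
    if parent is None:
        raise ValueError("cannot remove the root element")
    if element.tail:
        previous = element.getprevious()
        if previous is not None:
            # Attach the tail to the preceding sibling.
            previous.tail = (previous.tail or "") + element.tail
        else:
            # No preceding sibling: the tail becomes part of the parent's text.
            parent.text = (parent.text or "") + element.tail
    parent.remove(element)


root = etree.fromstring(
    "<Root><Description>before<Title>foo</Title>after</Description></Root>"
)
remove_keep_tail(root.find(".//Title"))
result = etree.tostring(root, encoding="unicode")
print(result)
```

With a plain remove() the word "after" would be dropped along with the Title element; the helper moves it onto the preceding sibling or the parent before removing.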


Source: https://stackoverflow.com/questions/3232618/adding-attributes-to-existing-elements-removing-elements-etc-with-lxml
