Removing child elements in XML using python

淺唱寂寞╮ 提交于 2019-12-24 03:06:40

问题


Python 3.2.5 x64 ElementTree

I have data that I need to format using python. Essentially I have file with elements and subelements. I need to delete the child elements of some of these elements. I have checked previous questions and I couldn't make a solution. The best I had so far only removes every second child element.

Sample data:

<Leg1:MOR oCount="7" xmlns:Leg1="http://what.not">
    <Leg1:Order>
        <Leg1:CTemp id="FO">
            <Leg1:Group bNum="001" cCount="4">
                <Leg1:Dog ndate="112" pdate="111"/>
                <Leg1:Dog ndate="122" pdate="121"/>
                <Leg1:Dog ndate="132" pdate="131"/>
                <Leg1:Dog ndate="142" pdate="141"/>
            </Leg1:Group>
                <Leg1:Group bNum="002" cCount="4">
                <Leg1:Dog ndate="112" pdate="111"/>
                <Leg1:Dog ndate="122" pdate="121"/>
                <Leg1:Dog ndate="132" pdate="131"/>
                <Leg1:Dog ndate="142" pdate="141"/>
            </Leg1:Group>
        </Leg1:CTemp>
        <Leg1:CTemp id="GO">
            <Leg1:Group bNum="001" cCount="4">
                <Leg1:Dog ndate="112" pdate="111"/>
                <Leg1:Dog ndate="122" pdate="121"/>
                <Leg1:Dog ndate="132" pdate="131"/>
                <Leg1:Dog ndate="142" pdate="141"/>
            </Leg1:Group>
            <Leg1:Group bNum="002" cCount="4">
                <Leg1:Dog ndate="112" pdate="111"/>
                <Leg1:Dog ndate="122" pdate="121"/>
                <Leg1:Dog ndate="132" pdate="131"/>
                <Leg1:Dog ndate="142" pdate="141"/>
            </Leg1:Group>
        </Leg1:CTemp>
    </Leg1:Order>
</Leg1:MOR>

What I need the output to look like:

<Leg1:MOR oCount="7" xmlns:Leg1="http://what.not">
    <Leg1:Order>
        <Leg1:CTemp id="FO">
            <Leg1:Group bNum="001" cCount="10"/>
            <Leg1:Group bNum="002" cCount="10"/>
        </Leg1:CTemp>
        <Leg1:CTemp id="GO">
            <Leg1:Group bNum="001" cCount="10"/>
            <Leg1:Group bNum="002" cCount="10"/>
        </Leg1:CTemp>
    </Leg1:Order>
</Leg1:MOR>

I haven't written anything in a while and my code is useless. I can parse the file, and write it I cannot get the processing right.

import xml.etree.cElementTree as ET
tree = ET.parse("input.xml")
root = tree.getroot()
for x in root.findall('./Order/CTemp/Group'):
    root.remove(x)
tree.write("output.xml")

How do I get it remove the Dog children of the CTemp elements?


回答1:


If you can use lxml, try this:

import lxml.etree

tree = lxml.etree.parse("leg.xml")
for dog in tree.xpath("//Leg1:Dog",
                      namespaces={"Leg1": "http://what.not"}):
    parent = dog.xpath("..")[0]
    parent.remove(dog)
    parent.text = None
tree.write("leg.out.xml")

Now leg.out.xml looks like this:

<?xml version="1.0"?>
<Leg1:MOR xmlns:Leg1="http://what.not" oCount="7">
  <Leg1:Order>
    <Leg1:CTemp id="FO">
      <Leg1:Group bNum="001" cCount="4"/>
      <Leg1:Group bNum="002" cCount="4"/>
    </Leg1:CTemp>
    <Leg1:CTemp id="GO">
      <Leg1:Group bNum="001" cCount="4"/>
      <Leg1:Group bNum="002" cCount="4"/>
    </Leg1:CTemp>
  </Leg1:Order>
</Leg1:MOR>


来源:https://stackoverflow.com/questions/30210515/removing-child-elements-in-xml-using-python

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!