Get all children of specific node in Python

Submitted by 本秂侑毒 on 2020-01-25 07:30:28

Question


I have the following example.xml structure:

<ParentOne>
   <SiblingOneA>This is Sibling One A</SiblingOneA>
   <SiblingTwoA>
      <ChildOneA>Value of child one A</ChildOneA>
      <ChildTwoA>Value of child two A</ChildTwoA>
   </SiblingTwoA>
</ParentOne>

<ParentTwo>
   <SiblingOneA>This is a different value for Sibling one A</SiblingOneA>
   <SiblingTwoA>
      <ChildOneA>This is a different value for Child one A</ChildOneA>
      <ChildTwoA>This is a different value for Child Two A</ChildTwoA>
   </SiblingTwoA>
</ParentTwo>

<ParentThree>
   <SiblingOneA>A final value for Sibling one A</SiblingOneA>
   <SiblingTwoA>
      <ChildOneA>A final value for Child one A</ChildOneA>
      <ChildTwoA>A final value for Child one A</ChildTwoA>
   </SiblingTwoA>
</ParentThree>

My main requirement is to loop through each one of the nodes and when the current node in question is "SiblingOneA", the code makes a check to see if the sibling node directly adjacent is "SiblingTwoA". If so, then it should retrieve all the children nodes (both the elements themselves, and the values within the elements).

So far, this is my code:

from lxml import etree
XMLDoc = etree.parse('example.xml')
rootXMLElement = XMLDoc.getroot()
tree = etree.parse('example.xml')
import os

for Node in XMLDoc.xpath('//*'):
   if os.path.basename(XMLDoc.getpath(Node)) == "SiblingOneA":
      if Node.getnext() is not None:
         if Node.getnext().tag == "SiblingTwoA":
            # RETRIEVE ALL THE CHILDREN ELEMENTS OF THAT SPECIFIC SiblingTwoA NODE AND THEIR VALUES
            pass

As you may have deduced from my above code, I do not know what to put in place of the comment to retrieve all the children elements and values of the "SiblingTwoA" node. Also, this code should not return all the children elements of the SiblingTwoA nodes in the whole tree structure, but just of the one in question (i.e. the one returned from the Node.getnext() element). You will also have noticed that many of the elements are the same, but their values are different.

EDIT:

I have been able to retrieve the children of the element in question using Node.getnext().getchildren(). However, this returns the information in the form of a list, such as:

[<Element ChildOneA at 0x101a95870>, <Element ChildTwoA at 0x101a958c0>]
[<Element ChildOneA at 0x101a95a50>, <Element ChildTwoA at 0x101a95aa0>]
[<Element ChildOneA at 0x101a95c30>, <Element ChildTwoA at 0x101a95c80>]

How can I retrieve the actual values within the elements?

My desired output, for the first iteration for example, would be something like:

ChildOneA = Value of child one A

ChildTwoA = Value of child two A


Answer 1:


To generate a flat list of the child values (['Value of child one A', 'Value of child two A', 'This is a different value for Child one A', 'This is a different value for Child Two A', 'A final value for Child one A', 'A final value for Child one A']), I think you can use the following, where doc is the parsed tree (XMLDoc in the question):

[child.xpath('string()') for sibling in doc.xpath('//SiblingTwoA[preceding-sibling::*[1][self::SiblingOneA]]') for child in sibling.xpath('*')]

To generate a nested list with one inner list per matching SiblingTwoA element ([['Value of child one A', 'Value of child two A'], ['This is a different value for Child one A', 'This is a different value for Child Two A'], ['A final value for Child one A', 'A final value for Child one A']]), you can use

[[child.xpath('string()') for child in sibling.xpath('*')] for sibling in doc.xpath('//SiblingTwoA[preceding-sibling::*[1][self::SiblingOneA]]')]
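For the "ChildOneA = Value of child one A" output the question asks for, each child's tag and text can be paired up. The following is only a minimal sketch under a few assumptions: it uses the standard library's xml.etree.ElementTree instead of lxml (so there is no getnext(), and adjacency is checked by walking each parent's children in pairs), it wraps the question's fragments in a hypothetical <Root> element because an XML document needs a single root, and the helper name adjacent_children is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the fragments from the question are wrapped in a hypothetical
# <Root> element so that the document has exactly one root.
XML = """<Root>
  <ParentOne>
    <SiblingOneA>This is Sibling One A</SiblingOneA>
    <SiblingTwoA>
      <ChildOneA>Value of child one A</ChildOneA>
      <ChildTwoA>Value of child two A</ChildTwoA>
    </SiblingTwoA>
  </ParentOne>
  <ParentTwo>
    <SiblingOneA>This is a different value for Sibling one A</SiblingOneA>
    <SiblingTwoA>
      <ChildOneA>This is a different value for Child one A</ChildOneA>
      <ChildTwoA>This is a different value for Child Two A</ChildTwoA>
    </SiblingTwoA>
  </ParentTwo>
</Root>"""


def adjacent_children(root):
    """Return one list of (tag, text) tuples for every SiblingTwoA element
    that directly follows a SiblingOneA element under the same parent."""
    results = []
    for parent in root.iter():
        kids = list(parent)
        # Pair each child with the sibling immediately after it.
        for current, following in zip(kids, kids[1:]):
            if current.tag == "SiblingOneA" and following.tag == "SiblingTwoA":
                # Only the children of this specific SiblingTwoA are collected.
                results.append([(child.tag, child.text) for child in following])
    return results


root = ET.fromstring(XML)
for pairs in adjacent_children(root):
    for tag, text in pairs:
        print(f"{tag} = {text}")
```

With lxml the same idea applies to the elements returned by Node.getnext(): iterating over that element yields its children, and each child exposes .tag and .text.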


Source: https://stackoverflow.com/questions/59716073/get-all-children-of-specific-node-in-python
