Python version 2.7: XML ElementTree: How to iterate through certain elements of a child element in order to find a match

Anonymous (unverified), submitted on 2019-12-03 02:20:02

Question:

I'm a programming novice and only rarely use python so please bear with me as I try to explain what I am trying to do :)

I have the following XML:

    <?xml version = "1.0" encoding = "utf-8"?>
    <Patients>
        <Patient>
            <PatientCharacteristics>
                <patientCode>3</patientCode>
            </PatientCharacteristics>
            <Visits>
                <Visit>
                    <DAS>
                        <CRP>14</CRP>
                        <ESR/>
                        <Joints>
                            <DAS_PROFILE>28/28</DAS_PROFILE>
                            <SWOL28>20</SWOL28>
                            <TEN28>20</TEN28>
                        </Joints>
                    </DAS>
                    <VisitDate>2010-02-17</VisitDate>
                </Visit>
                <Visit>
                    <DAS>
                        <CRP>10</CRP>
                        <ESR/>
                        <Joints>
                            <DAS_PROFILE>28/28</DAS_PROFILE>
                            <SWOL28>15</SWOL28>
                            <TEN28>20</TEN28>
                        </Joints>
                    </DAS>
                    <VisitDate>2010-02-10</VisitDate>
                </Visit>
            </Visits>
        </Patient>
        <Patient>
            <PatientCharacteristics>
                <patientCode>3</patientCode>
            </PatientCharacteristics>
            <Visits>
                <Visit>
                    <DAS>
                        <CRP>14</CRP>
                        <ESR/>
                        <Joints>
                            <DAS_PROFILE>28/28</DAS_PROFILE>
                            <SWOL28>34</SWOL28>
                            <TEN28>0</TEN28>
                        </Joints>
                    </DAS>
                    <VisitDate>2010-08-17</VisitDate>
                </Visit>
                <Visit>
                    <DAS>
                        <CRP>10</CRP>
                        <ESR/>
                        <Joints>
                            <DAS_PROFILE>28/28</DAS_PROFILE>
                            <SWOL28></SWOL28>
                            <TEN28>2</TEN28>
                        </Joints>
                    </DAS>
                    <VisitDate>2010-07-10</VisitDate>
                </Visit>
                <Visit>
                    <DAS>
                        <CRP>9</CRP>
                        <ESR/>
                        <Joints>
                            <DAS_PROFILE>28/28</DAS_PROFILE>
                            <SWOL28>56</SWOL28>
                            <TEN28>6</TEN28>
                        </Joints>
                    </DAS>
                    <VisitDate>2009-07-10</VisitDate>
                </Visit>
            </Visits>
        </Patient>
    </Patients>

All I want to do here is update certain 'SWOL28' values if they match the patientCode and VisitDate that I have stored in a text file. As I understand it, ElementTree does not include a parent reference; if it did, I could just use findall() from the root and work backwards from there. As it stands, here is my pseudocode:

  1. For each line in the text file:
  2.     Put Visit_Date Patient_Code New_SWOL28 into variables
  3.     For each patient element:
  4.         If patientCode = Patient_Code
  5.             For each Visit element:
  6.                 If VisitDate = Visit_Date
  7.                     If SWOL28 element exists for this visit
  8.                         Update SWOL28 to New_SWOL28

But I am stuck at step number 5. How do I get a list of visits to iterate through? Apologies if this is a very dumb question, but I have searched high and low for an answer, I assure you! I have stripped down my code to the bare example of the part I need to fix below:

    import xml.etree.ElementTree as ET

    tree = ET.parse('DB3.xml')
    root = tree.getroot()
    for child in root:  # THIS GETS ME ALL THE PATIENT ATTRIBUTES
        print child.tag

        for x in child/Visit:  # THIS IS WHAT I CANNOT FIND THE CORRECT SYNTAX FOR
            # I WOULD THEN PERFORM STEPS 6, 7 AND 8 HERE

I would be deeply appreciative of any ideas any of you may have on this. I am not a programming natural that's for sure!

Thanks in advance, Sarah

Edit 1:

On the advice of SVK below I tried the following:

    import xml.etree.ElementTree as ET

    tree = ET.parse('Untitled.xml')
    root = tree.getroot()
    for child in root:
        print child.tag

        child.find( "visits" )
        for x in child.iter("visit"):
            print x.tag, x.text

But the only output I get is: Patient Patient and none of the lower tags. Any ideas?

Answer 1:

This is untested but it should be fairly close to what you want.

    # code, date and new_swol28 hold the values read from one line of the text file
    for patient in root:
        patient_code = patient.find('PatientCharacteristics').find('patientCode')
        if patient_code.text == code:
            for visit in patient.find('Visits'):
                visit_date = visit.find('VisitDate')
                if visit_date.text == date:
                    swol28 = visit.find('DAS').find('Joints').find('SWOL28')
                    # find() returns None when the element is missing
                    if swol28 is not None:
                        # assign the element's text; set() would create an attribute on <Joints>
                        swol28.text = new_swol28
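For reference, here is a minimal self-contained sketch of the same nested search using only the standard library. The function name update_swol28 and the trimmed-down SAMPLE document are illustrative assumptions, not part of the original post; path arguments such as 'Visits/Visit' rely on the limited XPath subset that ElementTree's find(), findall() and findtext() accept. It avoids print so that it should run unchanged on Python 2.7 and Python 3.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the poster's file (assumption: same element names).
SAMPLE = """<Patients>
  <Patient>
    <PatientCharacteristics><patientCode>3</patientCode></PatientCharacteristics>
    <Visits>
      <Visit>
        <DAS><Joints><SWOL28>20</SWOL28></Joints></DAS>
        <VisitDate>2010-02-17</VisitDate>
      </Visit>
      <Visit>
        <DAS><Joints><SWOL28>15</SWOL28></Joints></DAS>
        <VisitDate>2010-02-10</VisitDate>
      </Visit>
    </Visits>
  </Patient>
</Patients>"""


def update_swol28(root, code, date, new_swol28):
    """Set SWOL28 for every visit matching code and date; return the count."""
    updated = 0
    for patient in root.findall('Patient'):
        # findtext() returns None when the path does not exist
        if patient.findtext('PatientCharacteristics/patientCode') != code:
            continue
        for visit in patient.findall('Visits/Visit'):
            if visit.findtext('VisitDate') != date:
                continue
            swol28 = visit.find('DAS/Joints/SWOL28')
            if swol28 is not None:  # only update visits that have the element
                swol28.text = new_swol28
                updated += 1
    return updated


root = ET.fromstring(SAMPLE)
count = update_swol28(root, '3', '2010-02-10', '99')
values = [e.text for e in root.iter('SWOL28')]
```

Because each loop starts from the parent and walks down, no parent reference is needed: the patient and visit variables are already in scope when the SWOL28 element is found.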


Answer 2:

You can iterate over all the "Visit" elements beneath an element "element" like this (iter() walks every descendant, not only direct children):

    for x in element.iter("Visit"):

You can find the first direct child of element matching a certain tag with:

    element.find( "Visits" )

Tag names are case-sensitive, so they must be spelled exactly as in the XML ("Visits" and "Visit", not "visits" and "visit"). It looks like you will first have to locate the "Visits" element, which is the parent of "Visit", and then iterate through its "Visit" children. Putting those together you'd have something like this:

    for patient_element in root:
        print patient_element.tag

        visits_element = patient_element.find( "Visits" )
        for visit_element in visits_element.iter("Visit"):
            print visit_element.tag, visit_element.text
            # ... further processing of each visit element here

In general look at the section "Finding interesting elements" in the documentation for xml.etree.ElementTree: http://docs.python.org/2/library/xml.etree.elementtree.html#finding-interesting-elements
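The attempt in Edit 1 of the question printed only "Patient" most likely because ElementTree matches tag names case-sensitively, so "visits" and "visit" match nothing in this document. The short sketch below, built on a made-up two-visit fragment, contrasts find(), findall() and iter(); the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

patient = ET.fromstring(
    "<Patient>"
    "<Visits>"
    "<Visit><VisitDate>2010-02-17</VisitDate></Visit>"
    "<Visit><VisitDate>2010-02-10</VisitDate></Visit>"
    "</Visits>"
    "</Patient>"
)

# Tag matching is case-sensitive: the lowercase name matches nothing.
wrong_case = patient.find("visits")   # None
visits = patient.find("Visits")       # the <Visits> element

# findall() with a bare tag looks only at direct children ...
direct_from_patient = patient.findall("Visit")  # empty: Visit is a grandchild
direct_from_visits = visits.findall("Visit")    # both <Visit> elements

# ... while iter() walks the whole subtree below the element.
anywhere = list(patient.iter("Visit"))          # both <Visit> elements

dates = [v.findtext("VisitDate") for v in visits.findall("Visit")]
```

Note that find() returns None on a miss, so calling a method on its result without a check raises AttributeError when the tag is misspelled.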



Answer 3:

You could use a CSSSelector to get the nodes you want from the Patient element (CSSSelector comes from lxml, so the document has to be parsed with lxml.etree rather than xml.etree.ElementTree):

    from lxml.cssselect import CSSSelector

    visitSelector = CSSSelector('Visit')
    visits = visitSelector(child)

You can do the same to get the patientCode tag and the SWOL28 tag; then you can access and modify the text of the elements using element.text.



Answer 4:

If you use lxml.etree, you can use XPath to find the elements you need to update.

E.g.

doc.xpath('Patient[PatientCharacteristics/patientCode=$patient]/Visits/Visit[VisitDate=$visit]',patient="3",visit="2009-07-10") 

So

    from lxml import etree

    doc = etree.parse("DB3.xml")

    changes = [
        dict(patient='3',visit='2010-08-17',swol28="99"),
    ]

    def update_doc(x,d):
        for row in d:
            for visit in x.xpath('Patient[PatientCharacteristics/patientCode=$patient]/Visits/Visit[VisitDate=$visit]',**row):
                for swol28 in visit.xpath('DAS/Joints/SWOL28'):
                    swol28.text = row['swol28']

    update_doc(doc,changes)

    print etree.tostring(doc)

Should yield you something that contains:

    <Patient>
        <PatientCharacteristics>
            <patientCode>3</patientCode>
        </PatientCharacteristics>
        <Visits>
            <Visit>
                <DAS>
                    <CRP>14</CRP>
                    <ESR/>
                    <Joints>
                        <DAS_PROFILE>28/28</DAS_PROFILE>
                        <SWOL28>99</SWOL28>
                        <TEN28>0</TEN28>
                    </Joints>
                </DAS>
                <VisitDate>2010-08-17</VisitDate>
            </Visit>
        </Visits>
    </Patient>
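The changes list above is hard-coded, whereas the question keeps the updates in a text file. The sketch below shows one way such a list might be built. It assumes each line holds three whitespace-separated fields in the order given in the question's pseudocode (Visit_Date, Patient_Code, New_SWOL28); the real file layout is not shown in the question, and the function name parse_changes and the file name updates.txt are hypothetical.

```python
def parse_changes(lines):
    """Turn 'Visit_Date Patient_Code New_SWOL28' lines into dicts."""
    changes = []
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        visit_date, patient_code, new_swol28 = fields
        changes.append(
            dict(patient=patient_code, visit=visit_date, swol28=new_swol28)
        )
    return changes


# In practice the lines would come from the file, for example:
#     with open('updates.txt') as handle:   # hypothetical file name
#         changes = parse_changes(handle)
sample_lines = ["2010-08-17 3 99\n", "\n", "2009-07-10 3 12\n"]
changes = parse_changes(sample_lines)
```

The dictionaries use the same keys (patient, visit, swol28) as the hard-coded example, so the result can be passed to a function like update_doc as its second argument.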

