I'm a programming novice and only rarely use Python, so please bear with me as I try to explain what I am trying to do :)
I have the following XML:
You could use a CSSSelector to get the nodes you want from the Patient element:
from lxml.cssselect import CSSSelector
visitSelector = CSSSelector('Visit')
visits = visitSelector(child)
You can do the same to get the patientCode tag and the SWOL28 tag.
Then you can access and modify the text of the elements using element.text.
This is untested, but it should be fairly close to what you want.
# code, date and new_swol28 hold the values to match and the value to write
for patient in root:
    patient_code = patient.find('PatientCharacteristics').find('patientCode')
    if patient_code.text == code:
        for visit in patient.find('Visits'):
            visit_date = visit.find('VisitDate')
            if visit_date.text == date:
                swol28 = visit.find('DAS').find('Joints').find('SWOL28')
                if swol28.text:
                    # Assign to .text; set() would add an attribute to Joints instead
                    swol28.text = new_swol28
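Here is a self-contained sketch of the same nested-lookup idea using only the standard library's xml.etree.ElementTree, in case it helps to have something you can run. The sample data follows the Patient structure from the sample output in the lxml answer in this thread. The root tag name "Patients", the starting SWOL28 value, and the function name update_swol28 are my own assumptions, since the original XML is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the root tag "Patients" is an assumption.
SAMPLE = """<Patients>
  <Patient>
    <PatientCharacteristics>
      <patientCode>3</patientCode>
    </PatientCharacteristics>
    <Visits>
      <Visit>
        <DAS>
          <Joints>
            <SWOL28>0</SWOL28>
            <TEN28>0</TEN28>
          </Joints>
        </DAS>
        <VisitDate>2010-08-17</VisitDate>
      </Visit>
    </Visits>
  </Patient>
</Patients>"""


def update_swol28(root, code, date, new_swol28):
    """Set SWOL28 on visits matching code and date; return how many were updated."""
    updated = 0
    for patient in root.findall("Patient"):
        # findtext() returns None when the path is missing, so no match
        if patient.findtext("PatientCharacteristics/patientCode") != code:
            continue
        for visit in patient.findall("Visits/Visit"):
            if visit.findtext("VisitDate") != date:
                continue
            swol28 = visit.find("DAS/Joints/SWOL28")
            if swol28 is not None:  # skip visits without a SWOL28 element
                swol28.text = new_swol28
                updated += 1
    return updated


root = ET.fromstring(SAMPLE)
count = update_swol28(root, "3", "2010-08-17", "99")
new_value = root.findtext("Patient/Visits/Visit/DAS/Joints/SWOL28")
print(count, new_value)
```

Using findtext() and an explicit "is not None" check avoids an AttributeError when a patient or visit lacks one of the expected child elements, which chained find() calls would raise.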
If you use lxml.etree, you can use XPath to find the elements you need to update, e.g.:
doc.xpath('Patient[PatientCharacteristics/patientCode=$patient]/Visits/Visit[VisitDate=$visit]',patient="3",visit="2009-07-10")
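So, putting it together:
from lxml import etree
doc = etree.parse("DB3.xml")
# Each row names the patient and visit to match, plus the new SWOL28 value
changes = [
    dict(patient='3', visit='2010-08-17', swol28="99"),
]
def update_doc(x, d):
    for row in d:
        # The row's keys are passed to xpath() as the variables $patient and $visit
        for visit in x.xpath('Patient[PatientCharacteristics/patientCode=$patient]/Visits/Visit[VisitDate=$visit]', **row):
            for swol28 in visit.xpath('DAS/Joints/SWOL28'):
                swol28.text = row['swol28']
update_doc(doc, changes)
print(etree.tostring(doc, encoding="unicode"))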
This should give you output that contains:
<Patient>
  <PatientCharacteristics>
    <patientCode>3</patientCode>
  </PatientCharacteristics>
  <Visits>
    <Visit>
      <DAS>
        <CRP>14</CRP>
        <ESR/>
        <Joints>
          <DAS_PROFILE>28/28</DAS_PROFILE>
          <SWOL28>99</SWOL28>
          <TEN28>0</TEN28>
        </Joints>
      </DAS>
      <VisitDate>2010-08-17</VisitDate>
    </Visit>
  </Visits>
</Patient>
You can iterate over all the "Visit" elements anywhere beneath an element "element" (iter() walks the whole subtree, not only the direct children) like this:
for x in element.iter("Visit"):
    pass  # process x here
You can find the first direct child of element matching a certain tag with:
element.find("Visits")
Tag names are case-sensitive, so they have to match the XML exactly; going by the other answers here, that means "Visits" and "Visit", not "visits" and "visit".
It looks like you will first have to locate the "Visits" element, which is the parent of "Visit", and then iterate through its "Visit" children. Putting those together you'd have something like this:
for patient_element in root:
    print(patient_element.tag)
    visits_element = patient_element.find("Visits")
    for visit_element in visits_element.iter("Visit"):
        print(visit_element.tag, visit_element.text)
        # ... further processing of each visit element here
In general, see the section "Finding interesting elements" in the documentation for xml.etree.ElementTree: http://docs.python.org/2/library/xml.etree.elementtree.html#finding-interesting-elements
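To make the difference between find(), findall() and iter() concrete, here is a small sketch with xml.etree.ElementTree. The sample data is hypothetical; it only assumes the capitalized tag names (Visits, Visit, VisitDate) and the two dates that appear in the other answers.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data using the tag names from the other answers.
SAMPLE = """<Patient>
  <Visits>
    <Visit><VisitDate>2009-07-10</VisitDate></Visit>
    <Visit><VisitDate>2010-08-17</VisitDate></Visit>
  </Visits>
</Patient>"""

patient = ET.fromstring(SAMPLE)

# find() returns the first direct child with that tag, or None if there is none.
visits = patient.find("Visits")
missing = patient.find("visits")  # None: tag names are case-sensitive

# findall() with a plain tag name only looks at direct children.
direct = patient.findall("Visit")  # empty, because Visit is a grandchild

# iter() walks the whole subtree, so it does reach the nested Visit elements.
all_visits = list(patient.iter("Visit"))

# Locate Visits first, then take its direct Visit children.
dates = [v.findtext("VisitDate") for v in visits.findall("Visit")]
print(missing, len(direct), len(all_visits), dates)
```

If you only want direct children, findall() on the parent is the safer choice; iter() is handy when the elements may sit at any depth.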