Question
I am trying to parse an XML file with Python using lxml, but I get an error on even basic attempts. I used this post and the lxml tutorials to bootstrap.
My XML file is basically built from records below (I trimmed it down so that it is easier to read):
<?xml version="1.0" ?>
<?xml-stylesheet href="file:///usr/share/nmap/nmap.xsl" type="text/xsl"?>
<nmaprun scanner="nmap" args="nmap -sV -p135,12345 -oX 10.232.0.0.16.xml 10.232.0.0/16" start="1340201347" startstr="Wed Jun 20 16:09:07 2012" version="5.21" xmloutputversion="1.03">
  <host>
    <hostnames>
      <hostname name="host1.example.com" type="PTR"/>
    </hostnames>
  </host>
</nmaprun>
I run it through this complicated script:
from lxml import etree
d = etree.parse("myfile.xml")
for host in d.findall("host"):
    aa = host.find("hostnames/hostname")
    print aa.attrib["name"]
I get AttributeError: 'NoneType' object has no attribute 'attrib' on the print line.
I checked the value of d, host and aa and they are all defined as Elements.
Upfront apologies if this is something obvious (and it probably is).
EDIT: I added the header of the XML file as requested (I am still reading and rereading the answers :))
Thanks!
Answer 1:
Though it would make more sense to use XPath, your code already works fine when standing alone, so long as one handles the case where a host has no hostnames found:
import lxml.etree

doc = lxml.etree.XML("""
<nmaprun>
  <host>
    <hostnames>
      <hostname name="host1.example.com" type="PTR"/>
    </hostnames>
  </host>
</nmaprun>""")

for host in doc.findall('host'):
    host_el = host.find('hostnames/hostname')
    # find() returns None when the host has no hostname record
    if host_el is not None:
        print(host_el.attrib['name'])
With XPath (doc.xpath() rather than doc.find() or doc.findall()), one could do better, filtering only for hostnames with a name and thus avoiding the faulty records altogether:
host[hostnames/hostname/@name] will find hosts which have at least one hostnames with a hostname with a name attribute.
//hostnames/hostname/@name will directly return only the names themselves (if using lxml, exposing these as strings).
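The two expressions can be tried side by side. The following is only a sketch on a made-up sample (the second, empty host is an assumption added to imitate a faulty record; it is not from the original scan):

```python
from lxml import etree

# Hypothetical sample: the second host has no <hostname> record.
doc = etree.XML("""
<nmaprun>
  <host>
    <hostnames>
      <hostname name="host1.example.com" type="PTR"/>
    </hostnames>
  </host>
  <host>
    <hostnames/>
  </host>
</nmaprun>""")

# Select only the <host> elements that contain a hostname with a name attribute.
named_hosts = doc.xpath("host[hostnames/hostname/@name]")

# Select the attribute values themselves; lxml hands them back as strings.
names = doc.xpath("//hostnames/hostname/@name")

print(len(named_hosts))
print(names)
```

Because the predicate filters out the host without a name, neither query needs an explicit None check.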
Answer 2:
You can solve this with an xpath expression.
d.xpath('//hostname/@name') # thank you for comment
Alternatively
for host in d.xpath('//hostname'):
    print(host.get('name'), host.get('whatever else etc...'))
Answer 3:
It looks like you might have some <host> elements that have either no <hostnames> or no <hostname> sub-element defined.
As suggested in a comment to your question by @Charles Duffy, you need to check that your call to find() actually found an element:
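for host in d.findall("host"):
    aa = host.find("hostnames/hostname")
    # Compare against None: in lxml an element without children is falsy,
    # so a plain "if aa:" would also skip a <hostname/> that was found.
    if aa is not None:
        print(aa.attrib["name"])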
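A small sketch of why the check should be written against None, again on a made-up two-host sample (the empty second host is an assumption for illustration). find() returns None when nothing matches. In lxml, truth-testing an element reflects whether it has children, so a childless <hostname/> would be treated as false even when it was found.

```python
from lxml import etree

# Hypothetical sample: the second host has an empty <hostnames> element.
doc = etree.XML("""
<nmaprun>
  <host>
    <hostnames>
      <hostname name="host1.example.com" type="PTR"/>
    </hostnames>
  </host>
  <host>
    <hostnames/>
  </host>
</nmaprun>""")

names = []
for host in doc.findall("host"):
    el = host.find("hostnames/hostname")
    # find() yields None for the second host; test identity with None
    # instead of relying on the element's truth value.
    if el is not None:
        names.append(el.get("name"))
    else:
        names.append(None)

print(names)
```

The None placeholder is kept in the list only to make the missing record visible. In a real script one would more likely skip such hosts.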
Source: https://stackoverflow.com/questions/11123536/python-error-with-basic-xml-parsing-with-lxml