Question
Suppose that I have this XML file:
<article-set xmlns:ns0="http://casfwcewf.xsd" format-version="5">
  <article>
    <article id="11234">
      <source>
        <hostname>some hostname for 11234</hostname>
      </source>
      <feed>
        <type weight=0.32>RSS</type>
      </feed>
      <uri>some uri for 11234</uri>
    </article>
    <article id="63563">
      <source>
        <hostname>some hostname for 63563 </hostname>
      </source>
      <feed>
        <type weight=0.86>RSS</type>
      </feed>
      <uri>some uri for 63563</uri>
    </article>
    .
    .
    .
  </article>
</article-set>
What I want is to print each article id together with the weight attribute of its RSS type element, for the whole document, like this:
id=11234
weight= 0.32
id=63563
weight= 0.86
.
.
.
I used this code to do so:
from lxml import etree
tree = etree.parse("C:\\Users\\Me\\Desktop\\public.xml")
for article in tree.iter('article'):
    article_id = article.attrib.get('id')
    for weight in tree.xpath("//article[@id={}]/feed/type/@weight".format(article_id)):
        print(article_id,weight)
It did not work. Could someone help me with this?
Answer 1:
You can do it in two lines if you really want to.
>>> from lxml import etree
>>> tree = etree.parse('public.xml')
>>> for item in tree.xpath('.//article[@id]//type[@weight]'):
...     item.xpath('../..')[0].attrib['id'], item.attrib['weight']
...
('11234', '0.32')
('63563', '0.86')
One XML checker I used insisted on double quotes around the values for weight. etree croaked on the XML until I dropped the first line in the file; I don't know why.
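The point about quoting can be checked in isolation. The sketch below is an assumption-laden stand-in: it uses the standard library's xml.etree.ElementTree rather than lxml so that it runs without extra installs, and the helper name try_parse is hypothetical. It only shows that an unquoted attribute value is not well-formed XML, while the quoted form parses; lxml would be expected to reject the unquoted form as well.

```python
import xml.etree.ElementTree as ET

# Same element as in the question, once with the unquoted value and once quoted.
unquoted = '<type weight=0.32>RSS</type>'
quoted = '<type weight="0.32">RSS</type>'

def try_parse(text):
    """Return the weight attribute, or None if the text is not well-formed XML.

    try_parse is a hypothetical helper for this illustration only.
    """
    try:
        return ET.fromstring(text).attrib['weight']
    except ET.ParseError:
        return None

unquoted_result = try_parse(unquoted)   # parser rejects the unquoted value
quoted_result = try_parse(quoted)       # parser accepts the quoted value
print(unquoted_result, quoted_result)
```

If the real file contains unquoted values like the sample in the question, quoting them before parsing is probably the first thing to try.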
Answer 2:
One of these might work for you:
In this version, note the addition of = in the call to tree.xpath():
from lxml import etree
tree = etree.parse("news.xml")
for article in tree.iter('article'):
    article_id = article.attrib.get('id')
    for weight in tree.xpath("//article[@id={}]/feed/type/@weight".format(article_id)):
        print(article_id,weight)
Here, notice that I replaced tree.xpath() with article.xpath():
from lxml import etree
tree = etree.parse("news.xml")
for article in tree.iter('article'):
    article_id = article.attrib.get('id')
    for weight in article.xpath("./feed/type/@weight"):
        print(article_id,weight)
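For anyone who wants to try the per-article lookup without a file on disk, here is a minimal self-contained sketch. It makes several assumptions: it uses the standard library's xml.etree.ElementTree instead of lxml, it parses a trimmed inline copy of the sample (SAMPLE, with the weight values quoted and the namespace declaration dropped), and the function name article_weights is hypothetical. ElementTree cannot return attribute values directly from a path, so the sketch finds the type element relative to each article and reads its weight attribute.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the question's sample (weights quoted).
SAMPLE = """
<article-set format-version="5">
  <article>
    <article id="11234">
      <feed><type weight="0.32">RSS</type></feed>
    </article>
    <article id="63563">
      <feed><type weight="0.86">RSS</type></feed>
    </article>
  </article>
</article-set>
"""

def article_weights(xml_text):
    """Collect (id, weight) pairs by searching relative to each article."""
    root = ET.fromstring(xml_text)
    pairs = []
    for article in root.iter('article'):
        article_id = article.get('id')
        # Relative path: only <feed> children of this article are searched,
        # so the outer wrapper <article> (which has no id) yields nothing.
        for type_el in article.findall('./feed/type'):
            pairs.append((article_id, type_el.get('weight')))
    return pairs

pairs = article_weights(SAMPLE)
for article_id, weight in pairs:
    print(article_id, weight)
```

Searching relative to the current article avoids building an XPath string from the id at all, which also sidesteps the wrapper element whose id is missing.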
Source: https://stackoverflow.com/questions/44710836/getting-attribute-of-an-element-with-its-corresponding-id