Question
I'm trying to parse this XML. It's a YouTube feed. I'm working from the code in the tutorial. I want to get all the entry nodes that are nested under the feed node.
from lxml import etree

# text holds the raw XML of the feed
root = etree.fromstring(text)
entries = root.xpath("/feed/entry")
print(entries)
For some reason, entries is an empty list. Why?
Answer 1:
The feed element and all its children are actually in the http://www.w3.org/2005/Atom namespace. You need to tell your XPath expression that:
entries = root.xpath("/atom:feed/atom:entry",
namespaces={'atom': 'http://www.w3.org/2005/Atom'})
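The prefix name is arbitrary; only the namespace URI has to match. Remapping the default (empty) namespace, however, does not work with xpath(): XPath 1.0 has no default namespace, and lxml rejects a None key in the namespaces mapping, so this call raises a TypeError instead of returning the entries:
# raises TypeError
entries = root.xpath("/feed/entry",
                     namespaces={None: 'http://www.w3.org/2005/Atom'})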
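If you don't want to use prefixes at all, write the namespace URI in braces and use findall(). The brace notation is ElementPath syntax, not XPath, so it works with find(), findall() and iterfind() but not with xpath(). The path is relative to root, which is the feed element itself:
entries = root.findall("{http://www.w3.org/2005/Atom}entry")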
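The alternatives above can be compared side by side. This is a minimal sketch, assuming lxml is installed; the sample XML is a hypothetical two-entry stand-in for the real YouTube feed, and the variable names are illustrative only.

```python
from lxml import etree

ATOM = "http://www.w3.org/2005/Atom"

# Hypothetical sample standing in for the YouTube feed from the question.
text = b"""<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sample feed</title>
  <entry><title>First video</title></entry>
  <entry><title>Second video</title></entry>
</feed>"""

root = etree.fromstring(text)

# Unprefixed names match only elements in no namespace: nothing is found.
unqualified = root.xpath("/feed/entry")

# Prefix bound to the Atom URI: both entries are found.
prefixed = root.xpath("/atom:feed/atom:entry", namespaces={"atom": ATOM})

# Brace notation with findall(); the path is relative to root (<feed>).
braced = root.findall("{%s}entry" % ATOM)

# The same brace notation is not valid XPath, so xpath() rejects it.
try:
    root.xpath("/{%s}feed/{%s}entry" % (ATOM, ATOM))
    brace_xpath_ok = True
except etree.XPathError:
    brace_xpath_ok = False

print(len(unqualified), len(prefixed), len(braced), brace_xpath_ok)
```

The prefixed xpath() call and the braced findall() call select the same elements, so the choice between them is mostly a matter of whether full XPath features such as predicates or functions are needed.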
Note that the namespace is not inherited by later queries. In XPath 1.0 an unprefixed name always refers to an element in no namespace, even when the query starts from a node that is itself in the Atom namespace, so relative queries on children still need the prefix and the namespaces mapping. Keeping the mapping in one variable avoids repeating it:
ns = {'atom': 'http://www.w3.org/2005/Atom'}
feed = root  # the root element is the feed itself
title = feed.xpath("atom:title", namespaces=ns)
entries = feed.xpath("atom:entry", namespaces=ns)
# etc...
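A minimal sketch of relative queries from an already selected node, again on a hypothetical two-entry feed rather than the real data. It assumes lxml is installed; NS and the sample titles are illustrative names. The child names still carry the atom: prefix, because unprefixed XPath names match only elements in no namespace.

```python
from lxml import etree

NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample feed; the real YouTube feed has many more elements.
root = etree.fromstring(b"""<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sample feed</title>
  <entry><title>First video</title></entry>
  <entry><title>Second video</title></entry>
</feed>""")

# Relative to <feed>, an unprefixed child name still finds nothing.
unprefixed = root.xpath("title")

# With the prefix, the feed's own title is found.
feed_title = root.findtext("atom:title", namespaces=NS)

# Each entry is queried relative to itself, reusing the same mapping.
entry_titles = [
    entry.findtext("atom:title", namespaces=NS)
    for entry in root.xpath("atom:entry", namespaces=NS)
]

print(unprefixed, feed_title, entry_titles)
```

Here findtext() is used for the leaf values because it returns the text directly, while xpath() returns a list of matching elements.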
Answer 2:
It's because of the namespace in the XML. Here is an explanation: http://www.edankert.com/defaultnamespaces.html#Conclusion.
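One way to confirm that a namespace is involved is to inspect the parsed root element. The snippet below is a small sketch, assuming lxml and a hypothetical one-entry feed that declares a default namespace in the same way an Atom feed does.

```python
from lxml import etree

# Hypothetical one-entry feed using a default namespace declaration.
root = etree.fromstring(
    b'<feed xmlns="http://www.w3.org/2005/Atom"><entry/></feed>'
)

# The qualified tag name carries the namespace URI in braces.
print(root.tag)    # {http://www.w3.org/2005/Atom}feed

# nsmap lists the prefixes in scope; the None key is the default namespace.
print(root.nsmap)  # {None: 'http://www.w3.org/2005/Atom'}

# QName splits a qualified tag into its namespace and local name.
qname = etree.QName(root)
print(qname.namespace, qname.localname)
```

If root.tag starts with a brace, every plain name in an XPath expression needs a prefix bound to that URI.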
Source: https://stackoverflow.com/questions/18355890/why-does-this-xpath-expression-return-an-empty-list