Why does this xpath expression return an empty list?

Question

I'm trying to parse this XML, which is a YouTube feed. My code is based on the tutorial. I want to get all the entry nodes nested under the feed.

from lxml import etree
root = etree.fromstring(text)
entries = root.xpath("/feed/entry")
print(entries)

For some reason entries is an empty list. Why?

Answer 1:

feed and all its children are actually in the http://www.w3.org/2005/Atom namespace. You need to tell your xpath that:

entries = root.xpath("/atom:feed/atom:entry", 
                     namespaces={'atom': 'http://www.w3.org/2005/Atom'})
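
To see the difference side by side, here is a minimal sketch. The sample document is a made-up stand-in for the YouTube feed (only the namespace declaration matters), and the names SAMPLE and ATOM_NS are just for illustration:

```python
from lxml import etree

# Hypothetical stand-in for the YouTube feed; only the namespace matters here.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry><title>First video</title></entry>
  <entry><title>Second video</title></entry>
</feed>"""

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

root = etree.fromstring(SAMPLE)

# Unprefixed names in XPath 1.0 match only elements in no namespace,
# so this finds nothing even though the root is a <feed> element.
unprefixed = root.xpath("/feed/entry")

# The prefix itself is arbitrary; it only has to map to the namespace URI
# that the document declares.
prefixed = root.xpath("/atom:feed/atom:entry", namespaces=ATOM_NS)

print(len(unprefixed), len(prefixed))
print(root.tag)  # lxml stores the namespace as part of the tag
```

Printing root.tag is a quick way to check which namespace, if any, an element actually lives in.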

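Changing the default (empty-prefix) namespace is not an option for xpath(): XPath 1.0 has no such concept, and lxml raises a TypeError if you pass None as a prefix. The find()/findall() methods do accept it in recent lxml versions. Their paths are relative to the element they are called on, and root is already the feed element here:

entries = root.findall("entry",
                       namespaces={None: 'http://www.w3.org/2005/Atom'})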

or, if you don't want to use prefixes at all, use the {namespace}tag notation, which find()/findall() understand but xpath() does not:

entries = root.findall("{http://www.w3.org/2005/Atom}entry")
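
The {namespace}tag form is also how lxml itself stores element tags. A minimal sketch with findall(), assuming a made-up two-entry feed and a hypothetical ATOM constant for the URI:

```python
from lxml import etree

ATOM = "http://www.w3.org/2005/Atom"

# Hypothetical stand-in for the feed.
root = etree.fromstring(
    b'<feed xmlns="http://www.w3.org/2005/Atom">'
    b"<entry><title>First video</title></entry>"
    b"<entry><title>Second video</title></entry>"
    b"</feed>"
)

# fromstring() returns the <feed> element, so the path is relative to it.
entries = root.findall("{%s}entry" % ATOM)

# QName splits a qualified tag back into its namespace and local name.
names = [etree.QName(e).localname for e in entries]
print(len(entries), names)
```

Keeping the URI in one constant avoids repeating the long string in every path.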

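The namespace is not implicitly assumed for the node you're working with: every lookup on children in the Atom namespace needs the prefix mapping again, so it's convenient to keep it in a variable. Also note that fromstring() already returns the root element, which here is the feed, and that find() does not accept absolute paths on an element. So you can do something along the lines of:

ns = {'atom': 'http://www.w3.org/2005/Atom'}
feed = root  # fromstring() already returned the feed element

title = feed.xpath("atom:title", namespaces=ns)
entries = feed.xpath("atom:entry", namespaces=ns)
# etc...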
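
A minimal sketch of working from the feed element downwards, assuming a made-up two-entry feed and a hypothetical NS mapping. It passes the mapping on every call and also shows what an unprefixed relative lookup returns:

```python
from lxml import etree

NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical stand-in for the feed.
feed = etree.fromstring(
    b'<feed xmlns="http://www.w3.org/2005/Atom">'
    b"<title>Example feed</title>"
    b"<entry><title>First video</title></entry>"
    b"<entry><title>Second video</title></entry>"
    b"</feed>"
)

# An unprefixed relative lookup does not pick up the parent's namespace.
no_ns = feed.xpath("title")

# With the mapping passed each time, the lookups succeed.
feed_title = feed.findtext("atom:title", namespaces=NS)
entry_titles = [
    entry.findtext("atom:title", namespaces=NS)
    for entry in feed.xpath("atom:entry", namespaces=NS)
]

print(no_ns, feed_title, entry_titles)
```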

Answer 2:

It's because of the namespace in the XML. Here is an explanation: http://www.edankert.com/defaultnamespaces.html#Conclusion.

Source: https://stackoverflow.com/questions/18355890/why-does-this-xpath-expression-return-an-empty-list

Tags

python

xml

lxml