Why does this xpath expression return an empty list?

Posted by 孤街醉人 on 2019-12-10 22:22:21

Question


I'm trying to parse this XML, which is a YouTube feed, working from the code in the tutorial. I want to get all the entry nodes that are nested under the feed.

from lxml import etree
root = etree.fromstring(text)
entries = root.xpath("/feed/entry")
print(entries)

For some reason entries is an empty list. Why?


Answer 1:


feed and all its children are actually in the http://www.w3.org/2005/Atom namespace. You need to tell your xpath that:

entries = root.xpath("/atom:feed/atom:entry", 
                     namespaces={'atom': 'http://www.w3.org/2005/Atom'})

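or, if you would rather treat Atom as the default namespace, note that XPath 1.0 has no notion of a default namespace, and lxml's xpath() rejects a None prefix with a TypeError. The None key does work with find()/findall(), which take a path relative to root (here, the feed element itself):

entries = root.findall("entry",
                       namespaces={None: 'http://www.w3.org/2005/Atom'})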

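or, if you don't want to use prefixes at all, spell out the namespace in {uri}tag form. This notation is understood by find()/findall() but is not valid XPath, so it cannot be passed to xpath():

entries = root.findall("{http://www.w3.org/2005/Atom}entry")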
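
The namespace handling can be checked side by side. This is a minimal sketch, assuming a small hand-written Atom document in place of the real YouTube feed (the sample XML and the names SAMPLE, ATOM_NS and ns are made up for illustration), and assuming an lxml version whose find()/findall() accept a None key for the default namespace:

```python
from lxml import etree

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical stand-in for the YouTube feed: every element is in the
# Atom namespace because of the default xmlns declaration on <feed>.
SAMPLE = b"""<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sample feed</title>
  <entry><title>First</title></entry>
  <entry><title>Second</title></entry>
</feed>"""

root = etree.fromstring(SAMPLE)

# Unprefixed XPath names only match elements in no namespace.
unprefixed = root.xpath("/feed/entry")

# XPath with an explicit prefix bound to the Atom namespace.
ns = {"atom": ATOM_NS}
by_prefix = root.xpath("/atom:feed/atom:entry", namespaces=ns)

# find()/findall() paths are relative to the element they are called on
# (root is <feed>) and accept {uri}tag notation directly.
by_clark = root.findall("{%s}entry" % ATOM_NS)

# find()/findall() also accept None as the default-namespace key.
by_default = root.findall("entry", namespaces={None: ATOM_NS})

print(len(unprefixed), len(by_prefix), len(by_clark), len(by_default))
```

Under those assumptions the unprefixed XPath finds nothing, while the other three lookups each return both entry elements.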

Keep in mind that the namespace is not inherited from the node you're working with: in XPath an unprefixed name always means "no namespace", so queries on children in the same namespace need the prefix again. Since root is already the feed element, relative paths are enough:

ns = {'atom': 'http://www.w3.org/2005/Atom'}

title = root.xpath("atom:title", namespaces=ns)
entries = root.xpath("atom:entry", namespaces=ns)
# etc...
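
Whether child queries pick up a namespace from the node they start at is easy to check. A minimal sketch, assuming a made-up two-entry Atom document rather than the real feed (the names SAMPLE, ns, feed_title and entry_titles are illustrative only):

```python
from lxml import etree

# Hypothetical sample document; not the real YouTube feed.
SAMPLE = b"""<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sample feed</title>
  <entry><title>First</title></entry>
  <entry><title>Second</title></entry>
</feed>"""

ns = {"atom": "http://www.w3.org/2005/Atom"}
feed = etree.fromstring(SAMPLE)  # the root element is <feed> itself

# A relative, unprefixed step still means "no namespace", even though
# the context node is in the Atom namespace.
print(feed.xpath("title"))  # []

# With the prefix, relative steps match the namespaced children.
feed_title = feed.xpath("string(atom:title)", namespaces=ns)
entry_titles = [
    entry.xpath("string(atom:title)", namespaces=ns)
    for entry in feed.xpath("atom:entry", namespaces=ns)
]
print(feed_title)    # Sample feed
print(entry_titles)  # ['First', 'Second']
```

Building the ns mapping once and passing it to every call keeps the prefix consistent across queries.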



Answer 2:


It's because of the namespace in the XML. Here is an explanation: http://www.edankert.com/defaultnamespaces.html#Conclusion.



Source: https://stackoverflow.com/questions/18355890/why-does-this-xpath-expression-return-an-empty-list
