Python xpath not working?

家住魔仙堡 submitted on 2019-12-20 05:39:06

Question


Okay, this is starting to drive me a little bit nuts. I've tried several xml/xpath libraries for Python, and can't figure out a simple way to get a stinkin' "title" element.

The latest attempt looks like this (using Amara):

def view(req, url):
    req.content_type = 'text/plain'
    doc = amara.parse(urlopen(url))
    for node in doc.xml_xpath('//title'):
        req.write(str(node)+'\n')

But that prints out nothing. My XML looks like this: http://programanddesign.com/feed/atom/

If I try //* instead of //title it returns everything as expected. I know that the XML has titles in there, so what's the problem? Is it the namespace or something? If so, how can I fix it?


Update: I can't seem to get it working without a prefix, but this does work:

def view(req, url):
    req.content_type = 'text/plain'
    doc = amara.parse(url, prefixes={'atom': 'http://www.w3.org/2005/Atom'})
    req.write(str(doc.xml_xpath('//atom:title')))

Answer 1:


You probably just need to take the document's namespace into account: Atom feeds put their elements in the http://www.w3.org/2005/Atom namespace, so an unprefixed //title matches nothing.

I'd suggest looking up how to deal with namespaces in Amara:

http://www.xml3k.org/Amara/Manual#namespaces
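To see why a bare `//title` comes back empty while `//*` returns everything, it can help to look at how a parser names the elements of a feed that declares a default namespace. The sketch below uses the standard library's `xml.etree.ElementTree` rather than Amara, and the small `ATOM_SAMPLE` feed is made up for illustration; it is not the asker's actual feed.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal Atom feed standing in for the real one.
ATOM_SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry>
    <title>First post</title>
  </entry>
</feed>"""

root = ET.fromstring(ATOM_SAMPLE)

# The default namespace becomes part of every element's name.
root_tag = root.tag

# A bare name only matches elements in no namespace, so nothing is found.
unqualified = root.findall('.//title')

# The wildcard ignores names entirely, so it still finds every descendant.
everything = root.findall('.//*')

print(root_tag)
print(len(unqualified), len(everything))
```

The element is not really called `title`; its full name includes the namespace URI, which is why the query has to mention that namespace one way or another.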

Edit: Using your code snippet I made some edits. I don't know what version of Amara you're using but based on the docs I tried to accommodate it as much as possible:

def view(req, url):
    req.content_type = 'text/plain'
    ns = {u'f' : u'http://www.w3.org/2005/Atom',
        u't' : u'http://purl.org/syndication/thread/1.0'}
    doc = amara.parse(urlopen(url), prefixes=ns)
    req.write(str(doc.xml_xpath(u'//f:title')))



Answer 2:


It is indeed the namespaces. It was a bit tricky to find in the lxml docs, but here's how you do it:

from lxml import etree
doc = etree.parse(open('index.html'))
doc.xpath('//default:title', namespaces={'default':'http://www.w3.org/2005/Atom'})

You can also do this:

title_finder = etree.ETXPath('//{http://www.w3.org/2005/Atom}title')
title_finder(doc)

And you'll get the titles back in both cases.
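For comparison, the same two styles (a prefix map, and a fully qualified `{namespace}name`) can be sketched with the standard library's `xml.etree.ElementTree`, which needs no extra install. Treat this as a stand-in under stated assumptions rather than the lxml API shown above: `ElementTree` supports only a subset of XPath, and the `SAMPLE` feed and the `atom` prefix are chosen here purely for illustration.

```python
import xml.etree.ElementTree as ET

ATOM_NS = 'http://www.w3.org/2005/Atom'

# Hypothetical, minimal Atom feed used only for this demonstration.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry><title>First post</title></entry>
</feed>"""

root = ET.fromstring(SAMPLE)

# Style 1: map a prefix of your choosing to the namespace URI.
by_prefix = root.findall('.//atom:title', namespaces={'atom': ATOM_NS})

# Style 2: write the namespace URI inline in {uri}name form.
by_qualified_name = root.findall('.//{%s}title' % ATOM_NS)

titles = [el.text for el in by_prefix]
print(titles)
```

The prefix itself is arbitrary; only the URI it maps to has to match the one declared in the document.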



Source: https://stackoverflow.com/questions/1584180/python-xpath-not-working
