XPath _relative_ to given element in HTMLUnit/Groovy?

Posted by 被刻印的时光 ゝ on 2019-12-23 12:13:18

Question


I would like to evaluate an XPath expression relative to a given element.

I have been reading here: http://www.w3schools.com/xpath/default.asp

It seems that one of the syntaxes below should work (especially the one with no leading slash, or the one with descendant:).

However, none of them seems to work in HtmlUnit. Any help is much appreciated (this is a Groovy script, by the way). Thank you!

http://htmlunit.sourceforge.net/

http://groovy.codehaus.org/

Misha


#!/usr/bin/env groovy

import com.gargoylesoftware.htmlunit.WebClient

def html="""
<html><head><title>Test</title></head>
<body>
<div class='levelone'>
 <div class='leveltwo'>
    <div class='levelthree' />
 </div>
 <div class='leveltwo'>
    <div class='levelthree' />
    <div class='levelthree' />
 </div>
</div>

</body>
</html>
"""

def f=new File('/tmp/test.html')
if (f.exists()) {
 f.delete()
}
f.text=html // GDK File.setText writes the string and closes the file

def webClient=new WebClient()
def page=webClient.getPage('file:///tmp/test.html')

def element=page.getByXPath("//div[@class='levelone']")
assert element.size()==1
element=page.getByXPath("div[@class='levelone']")
assert element.size()==0
element=page.getByXPath("/div[@class='levelone']")
assert element.size()==0
element=page.getByXPath("descendant:div[@class='levelone']") // this gives namespace error
assert element.size()==0

Thank you!!!


Answer 1:


The question does not make clear which element the XPath expressions are evaluated relative to. Assuming it is the document node, any of the following XPath expressions selects the desired node:

   */*/div[@class='levelone']

   html/body/div[@class='levelone']

   descendant::div[@class='levelone']
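To check these expressions outside HtmlUnit, here is a minimal sketch that evaluates them with the JDK's javax.xml.xpath API, passing the document node as the context. It assumes a well-formed, namespace-free copy of the question's markup; the helper name `count` is made up for illustration, and HtmlUnit's own HTML parser may build a slightly different tree.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.XPathConstants
import javax.xml.xpath.XPathFactory

// Well-formed copy of the question's markup, parsed as plain XML (no namespace).
def xml = '''<html><head><title>Test</title></head>
<body>
<div class='levelone'>
  <div class='leveltwo'><div class='levelthree'/></div>
  <div class='leveltwo'><div class='levelthree'/><div class='levelthree'/></div>
</div>
</body>
</html>'''

def doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(
        new ByteArrayInputStream(xml.getBytes('UTF-8')))
def xpath = XPathFactory.newInstance().newXPath()

// Hypothetical helper: number of nodes an expression selects from a context node.
def count = { String expr, context ->
    xpath.evaluate(expr, context, XPathConstants.NODESET).getLength()
}

// The three expressions from the answer, evaluated from the document node.
assert count("*/*/div[@class='levelone']", doc) == 1
assert count("html/body/div[@class='levelone']", doc) == 1
assert count("descendant::div[@class='levelone']", doc) == 1

// The document node's only element child is <html>, so these select nothing.
assert count("div[@class='levelone']", doc) == 0
assert count("/div[@class='levelone']", doc) == 0
```

The last two assertions mirror the question's zero-size results: a bare `div[...]` step looks only at children of the context node, and `/div` asks for a root element named div.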

You may have a problem if the actual XML document (not shown) has a default namespace. In that case you need to register the namespace in the language hosting XPath (I don't know Groovy) and use the associated prefix, like this:

   */*/x:div[@class='levelone']

   x:html/x:body/x:div[@class='levelone']

   descendant::x:div[@class='levelone']
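As a rough illustration of the namespace case, the sketch below registers a prefix with the JDK's javax.xml.xpath API from Groovy. The XHTML namespace URI, the prefix `x`, and the names `xhtmlNs`, `nsContext` and `count` are assumptions chosen for the example. It does not use HtmlUnit, and how HtmlUnit itself treats namespaces is not covered here.

```groovy
import javax.xml.XMLConstants
import javax.xml.namespace.NamespaceContext
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.XPathConstants
import javax.xml.xpath.XPathFactory

// Assumed default namespace for this example.
def xhtmlNs = 'http://www.w3.org/1999/xhtml'

def xml = """<html xmlns='${xhtmlNs}'><head><title>Test</title></head>
<body>
<div class='levelone'>
  <div class='leveltwo'><div class='levelthree'/></div>
</div>
</body>
</html>"""

// The parser must be namespace-aware, otherwise prefixes cannot match.
def factory = DocumentBuilderFactory.newInstance()
factory.setNamespaceAware(true)
def doc = factory.newDocumentBuilder().parse(
        new ByteArrayInputStream(xml.getBytes('UTF-8')))

// Map the (arbitrary) prefix "x" to the document's default namespace.
def nsContext = [
    getNamespaceURI: { String prefix -> prefix == 'x' ? xhtmlNs : XMLConstants.NULL_NS_URI },
    getPrefix      : { String uri -> uri == xhtmlNs ? 'x' : null },
    getPrefixes    : { String uri -> (uri == xhtmlNs ? ['x'] : []).iterator() }
] as NamespaceContext

def xpath = XPathFactory.newInstance().newXPath()
xpath.setNamespaceContext(nsContext)

// Hypothetical helper: number of nodes an expression selects from a context node.
def count = { String expr, context ->
    xpath.evaluate(expr, context, XPathConstants.NODESET).getLength()
}

// Unprefixed names refer to elements in no namespace, so nothing matches.
assert count("descendant::div[@class='levelone']", doc) == 0

// The prefixed forms from the answer do match.
assert count("*/*/x:div[@class='levelone']", doc) == 1
assert count("x:html/x:body/x:div[@class='levelone']", doc) == 1
assert count("descendant::x:div[@class='levelone']", doc) == 1
```

The prefix used in the expression does not have to appear in the document; only the namespace URI it maps to has to match.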



Answer 2:


Thank you so much. Apparently my error was using a single colon after descendant rather than two (doh).

#!/usr/bin/env groovy

import com.gargoylesoftware.htmlunit.WebClient

def html="""
<html><head><title>Test</title></head>
<body>
<div class='levelone'>
  <div class='leveltwo'>
     <div class='levelthree' />
  </div>
  <div class='leveltwo'>
     <div class='levelthree' />
     <div class='levelthree' />
  </div>
</div>

</body>
</html>
"""

def f=new File('/tmp/test.html')
if (f.exists()) {
  f.delete()
}
f.text=html // GDK File.setText writes the string and closes the file

def webClient=new WebClient()
def page=webClient.getPage('file:///tmp/test.html')

def element=page.getByXPath("//div[@class='levelone']")
assert element.size()==1
element=page.getByXPath("div[@class='levelone']")
assert element.size()==0
element=page.getByXPath("/div[@class='levelone']")
assert element.size()==0
element=page.getByXPath("descendant::div[@class='levelone']")
assert element.size()==1

Doh!

Thank you!

Misha
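The script above always uses the page as the context. For the title question (an expression relative to a given element), the idea is to use that element as the context node. The sketch below shows this with the JDK's javax.xml.xpath API so that it runs without extra dependencies. The variable names and the `count` helper are illustrative only. In HtmlUnit, getByXPath is defined on DomNode, so calling it on an element rather than on the page should behave in the same way, though that is not verified here.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.XPathConstants
import javax.xml.xpath.XPathFactory

// Well-formed copy of the test markup, parsed as plain XML (no namespace).
def xml = '''<html><head><title>Test</title></head>
<body>
<div class='levelone'>
  <div class='leveltwo'><div class='levelthree'/></div>
  <div class='leveltwo'><div class='levelthree'/><div class='levelthree'/></div>
</div>
</body>
</html>'''

def doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(
        new ByteArrayInputStream(xml.getBytes('UTF-8')))
def xpath = XPathFactory.newInstance().newXPath()

// Hypothetical helper: number of nodes an expression selects from a context node.
def count = { String expr, context ->
    xpath.evaluate(expr, context, XPathConstants.NODESET).getLength()
}

// Pick the context element once, using an absolute expression.
def levelone = xpath.evaluate("//div[@class='levelone']", doc, XPathConstants.NODE)

// No leading slash: steps start at the context element.
assert count("div[@class='leveltwo']", levelone) == 2
assert count("descendant::div[@class='levelthree']", levelone) == 3
assert count(".//div[@class='levelthree']", levelone) == 3

// A leading slash always restarts from the document root, whatever the context.
assert count("//div[@class='levelone']", levelone) == 1

// Relative evaluation can be chained from one element to the next.
def secondLevelTwo = xpath.evaluate("div[@class='leveltwo'][2]", levelone, XPathConstants.NODE)
assert count("div[@class='levelthree']", secondLevelTwo) == 2
```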



Source: https://stackoverflow.com/questions/2980792/xpath-relative-to-given-element-in-htmlunit-groovy
