webharvest

Webharvest If and null test

醉酒当歌 posted on 2019-12-23 02:28:16
Question: I'm trying to make my program check the result of an XPath expression and, if it is null, try a different one. How do I do this? I have tried all the examples on the website, and the blank single quotes will not compile.

    <var-def name="googleResults">
        <xpath expression="//div[@id='center_col']//div[@id='search']//div[@id='ires']//ol/li/div//b/div/text()">
            <html-to-xml>
                <http url="http://google.com/shopping?q=asus laptops&hl=en"/>
            </html-to-xml>
        </xpath>
    </var-def>
    <var-def name=

Scraping content of webpage using Web-harvest

扶醉桌前 posted on 2019-12-08 12:33:12
Question: I want to scrape particular content from web pages, and I am using Web-Harvest for this. It works well on other websites, but it does not scrape the content from this URL. My Java code is here:

    import org.webharvest.definition.ScraperConfiguration;
    import org.webharvest.runtime.Scraper;
    import org.webharvest.runtime.variables.Variable;
    import java.io.FileNotFoundException;

    public class App {
        public static void main(String[] args) {
            try {
                ScraperConfiguration config

Reading dynamic web page content in java

隐身守侯 posted on 2019-12-07 21:31:17
Question: I need help reading the contents of a web page. Currently I am using the following method to read the contents:

    BufferedReader in = new BufferedReader(new InputStreamReader(page.openStream()));
    String inputLine;
    while ((inputLine = in.readLine()) != null) {
        Content = Content + inputLine;
    }

However, there is a problem with this method: some JSP pages contain Ajax that randomly updates a CSS class on the page, like so (JavaScript code, just to give an idea):

    if (request.readyState === 4 &&

Webharvest If and null test

余生颓废 posted on 2019-12-06 23:42:28
I'm trying to make my program check the result of an XPath expression and, if it is null, try a different one. How do I do this? I have tried all the examples on the website, and the blank single quotes will not compile.

    <var-def name="googleResults">
        <xpath expression="//div[@id='center_col']//div[@id='search']//div[@id='ires']//ol/li/div//b/div/text()">
            <html-to-xml>
                <http url="http://google.com/shopping?q=asus laptops&hl=en"/>
            </html-to-xml>
        </xpath>
    </var-def>
    <var-def name="productTruth">
        <case>
            <if condition="${googleResults != null}">
                <var name="googleResults"/>
            </if>
            <else>