htmlunit

Selenium vs HtmlUnit?

偶尔善良 submitted on 2019-11-27 10:47:46
I am trying to understand testing frameworks better and have been looking into Selenium. I've used HtmlUnit before, mainly when I needed to scrape some information off a website or the like. In the context of writing test automation, what are the advantages and disadvantages of Selenium vs HtmlUnit? It looks to me like Selenium is more complicated to set up than HtmlUnit, although there is also an HtmlUnitDriver for Selenium, which I think behaves exactly the same way as HtmlUnit itself? Selenium obviously provides a more robust framework, it has Selenium RC for parallel testing, it also has different

An HtmlUnit alternative for Android?

无人久伴 submitted on 2019-11-27 09:34:56
I am looking for an alternative that lets me fill in an HTML form that has checkboxes and radio buttons. I was creating an Android app that asks for user input, sends that data to a website with an HTML form, fills it in, submits the form, and returns the resulting page. I already managed to send data to the HTML form and retrieve the page using the HtmlUnit library in Eclipse (I have posted the Java code for that below). However, when I copied that code to my Android project I found out that Android does not support the HtmlUnit library. Is there another alternative to HtmlUnit for Android? The

HtmlUnit Only Displays Host HTML Page for GWT App

Deadly submitted on 2019-11-27 09:09:04
I am using the HtmlUnit API to add crawler support to my GWT app as follows: PrintWriter out = null; try { resp.setCharacterEncoding(CHAR_ENCODING); resp.setContentType("text/html"); url = buildUrl(req); out = resp.getWriter(); WebClient webClient = webClientProvider.get(); // set options WebClientOptions options = webClient.getOptions(); options.setCssEnabled(false); options.setThrowExceptionOnScriptError(false); options.setThrowExceptionOnFailingStatusCode(false); options.setRedirectEnabled(true); options.setJavaScriptEnabled(true); // set timeouts webClient.setJavaScriptTimeout(0); webClient

How can I tell HtmlUnit's WebClient to download images and css?

不想你离开。 submitted on 2019-11-27 07:38:17
Question: How can I make WebClient download external CSS stylesheets and image bodies just like a usual web browser does? Answer 1: What I'm doing right now is: public static final HashMap<String, String> acceptTypes = new HashMap<String, String>(){{ put("html", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"); put("img", "image/png,image/*;q=0.8,*/*;q=0.5"); put("script", "*/*"); put("style", "text/css,*/*;q=0.1"); }}; protected void downloadCssAndImages(HtmlPage page) { String
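The rest of that answer is cut off here, so the following is only a minimal sketch of the one idea that is visible: looking up an Accept header per resource kind. It uses plain Java with no HtmlUnit calls. The class name AcceptHeaders, the methods kindOf and acceptFor, and the rule of guessing the kind from the file extension are all assumptions made for illustration, not part of the original answer or of HtmlUnit. The header values are the ones quoted in the answer.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not part of HtmlUnit): picks an Accept header for a resource URL.
class AcceptHeaders {
    // Header values copied from the answer above, built without double-brace initialization.
    static final Map<String, String> ACCEPT_TYPES = buildAcceptTypes();

    private static Map<String, String> buildAcceptTypes() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("html", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        m.put("img", "image/png,image/*;q=0.8,*/*;q=0.5");
        m.put("script", "*/*");
        m.put("style", "text/css,*/*;q=0.1");
        return Collections.unmodifiableMap(m);
    }

    // Assumption: the resource kind is guessed from the file extension of the URL,
    // after dropping any query string or fragment. Unknown extensions fall back to "html".
    static String kindOf(String url) {
        String path = url;
        int cut = path.indexOf('?');
        if (cut >= 0) path = path.substring(0, cut);
        cut = path.indexOf('#');
        if (cut >= 0) path = path.substring(0, cut);
        path = path.toLowerCase(Locale.ROOT);
        if (path.endsWith(".css")) return "style";
        if (path.endsWith(".js")) return "script";
        if (path.endsWith(".png") || path.endsWith(".jpg")
                || path.endsWith(".jpeg") || path.endsWith(".gif")) return "img";
        return "html";
    }

    // Returns the Accept header value to send when requesting the given URL.
    static String acceptFor(String url) {
        return ACCEPT_TYPES.get(kindOf(url));
    }

    public static void main(String[] args) {
        System.out.println(acceptFor("https://example.com/site.css?v=3"));
        System.out.println(acceptFor("https://example.com/logo.PNG"));
    }
}
```

A download routine like the truncated downloadCssAndImages could pass the result of acceptFor as the Accept header of each follow-up request. How that request is issued through WebClient is not shown in the excerpt and is left out here.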

HtmlUnit ignore JavaScript errors?

馋奶兔 submitted on 2019-11-27 06:43:58
Question: I'm trying to traverse a website, but on one of its pages I get this error: EcmaError: lineNumber=[671] column=[0] lineSource=[null] name=[TypeError] sourceName=[https://reservations.besodelsolresort.com/asp/CalendarPopup.js] message=[TypeError: Cannot read property "parentNode" from undefined (https://reservations.besodelsolresort.com/asp/CalendarPopup.js#671)] com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot read property "parentNode" from undefined (https:/

href field missing when I get the page using jsoup or htmlunit

元气小坏坏 submitted on 2019-11-27 06:32:17
Question: I'm trying to parse Google Images search results and get the href attribute of an element. I've noticed that the href attribute is missing when I fetch the page programmatically (this happens with both jsoup and HtmlUnit). Comparing the element from the page fetched programmatically through Java with the element from the page loaded by an actual browser, the only difference is the missing href attribute (the rest is the same). The href attribute (IMAGE_LINK) is the following: /imgres

HtmlUnit getByXpath returns null

荒凉一梦 submitted on 2019-11-27 06:30:17
Question: I am coding in Groovy; however, I don't believe this is a language-specific set of questions. I actually have two questions. First question: I've run into an issue while using HtmlUnit. It is telling me that what I am trying to grab is null. The page I'm testing it on is: http://browse.deviantart.com/resources/applications/psbrushes/?order=9&offset=0#/dbwam4 My code: client = new WebClient(BrowserVersion.FIREFOX_3) client.javaScriptEnabled = false page = client.getPage(url) //coming up as null

How to combine scrapy and htmlunit to crawl urls with javascript

北城以北 submitted on 2019-11-27 04:02:59
Question: I'm using Scrapy to crawl pages; however, I can't handle pages that rely on JavaScript. People suggested that I use HtmlUnit, so I installed it, but I don't know how to use it at all. Can anyone give me an example (Scrapy + HtmlUnit)? Thanks very much. Answer 1: To handle pages with JavaScript you can use WebKit or Selenium. Here are some snippets from snippets.scrapy.org : Rendered/interactive javascript with gtk/webkit/jswebkit Rendered Javascript Crawler With Scrapy and Selenium RC Answer 2:

Getting Jsoup to support dynamically generated html by JavaScript

限于喜欢 submitted on 2019-11-27 02:02:37
Right now I'm working on a web crawler. It should parse some specific sites and write its output to an XML file. Up to this point, there is no problem. The crawler works and you can customize it really quickly via a cfg file. I use Jsoup to parse the HTML content. I just added a few more sites and noticed that I have a huge problem with HTML content that is created via JavaScript. Is there a way to make Jsoup support JavaScript? Or at least to get the full HTML content I can see in my browser? I already tried HtmlUnit, but it didn't do well. It did not give me the content I would

HtmlUnitDriver (HtmlUnit) vs GhostDriver (PhantomJS)?

为君一笑 submitted on 2019-11-27 00:52:55
Question: We are in the middle of choosing our headless browser driver solution, which will be some implementation of Selenium WebDriver. On the one side there is GhostDriver, which leverages PhantomJS in the backend, and on the other HtmlUnitDriver, which is based on HtmlUnit. PhantomJS uses WebKit, the rendering engine of Safari, to render the pages, while HtmlUnitDriver uses the Rhino engine, which no other browsers use (it's just "simulating" browser behaviour). The last fact is considered a con,