htmlunit

Extremely simple code not working in HtmlUnit

邮差的信 submitted on 2019-11-26 21:43:21
Question: I'm working with HtmlUnit 2.9 (the stable version that was released this month). Do you have any idea why the following code is not working? public class Main { public static void main(String[] args) { WebClient webClient = new WebClient(BrowserVersion.FIREFOX_3_6); webClient.setCssEnabled(true); webClient.setCssErrorHandler(new SilentCssErrorHandler()); webClient.setThrowExceptionOnFailingStatusCode(false); webClient.setThrowExceptionOnScriptError(false); webClient.setRedirectEnabled(false);

HttpUnit/HtmlUnit equivalent for Android

可紊 submitted on 2019-11-26 18:26:41
Question: I'm looking for a browser-simulating library on Android which handles things like: loading a website (http/https); redirections via HTTP (3xx status codes), JavaScript, and HTML tags; filling out HTML forms; and easy HTML parsing (could fall back to JSoup for that one). HttpUnit or HtmlUnit would do just fine, but both of them are a pain to get running on Android. Is there any option other than (Android)HttpClient (and therefore doing lots of the above on my own)? Or can I somehow make use of the

Scraping Sohu News with Jsoup+HttpUnit

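旧巷老猫 submitted on 2019-11-26 17:21:00
How to put it: the pages are static, but I also wrote support for dynamic pages into the interface, to make it easier to crawl other news sites later. There is one interface, with an abstract method pullNews for pulling news and a default method for fetching the news home page: public interface NewsPuller { void pullNews(); /* url: the URL of the news home page */ /* useHtmlUnit: whether to use HtmlUnit */ default Document getHtmlFromUrl(String url, boolean useHtmlUnit) throws Exception { if (!useHtmlUnit) { return Jsoup.connect(url) /* simulate a browser (the original comment says Firefox; the User-Agent string below is MSIE 9.0) */ .userAgent("Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)") .get(); } else { WebClient webClient = new WebClient(BrowserVersion.CHROME); webClient.getOptions().setJavaScriptEnabled(true); webClient.getOptions().setCssEnabled(false); webClient.getOptions()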

Selenium vs HtmlUnit?

旧街凉风 submitted on 2019-11-26 15:19:27
Question: I am trying to understand testing frameworks better and have been looking into Selenium. I've used HtmlUnit before, mainly when I needed to scrape some information off a website or the like. In the context of writing test automation, what are the advantages and disadvantages of Selenium vs HtmlUnit? It looks to me like Selenium is more complicated to set up than HtmlUnit, although at the same time there's an HtmlUnitDriver for Selenium, which I think behaves exactly the same way as HtmlUnit itself? Selenium

An HtmlUnit alternative for Android?

青春壹個敷衍的年華 submitted on 2019-11-26 14:48:40
Question: I need an alternative that allows me to fill an HTML form that has checkboxes and radio buttons. I was creating an Android app that asks for user input and sends that data to a website with an HTML form, fills it in, submits the form, and returns the resulting page. I already managed to send data to the HTML form and retrieve the page using the HtmlUnit library in Eclipse (I have posted the Java code for that below). However, when I copied that code to my Android project I found out that Android

How to use HtmlUnit with my Android project

丶灬走出姿态 submitted on 2019-11-26 14:38:26
Question: I downloaded the HtmlUnit 2.11 zip and extracted it. Then I pasted the jars into my project's libs folder and added them to the build path from there. When I tried to run my app, I got this error: conversion to dalvik format failed with error 1. Then, on Stack Overflow, I found a suggestion to delete xalan, xercesImpl, and xml-apis. I deleted them, but now I get this error: Error generating final archive: Found duplicate file for APK: about.html error message also

Turning HtmlUnit Warnings off

丶灬走出姿态 submitted on 2019-11-26 14:21:24
Do you know how I can turn warnings, notes, and errors in HtmlUnit off? Arsen Zahray: Put this somewhere around the start of your code; it will shut its dirty mouth: LogFactory.getFactory().setAttribute("org.apache.commons.logging.Log", "org.apache.commons.logging.impl.NoOpLog"); java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(Level.OFF); java.util.logging.Logger.getLogger("org.apache.commons.httpclient").setLevel(Level.OFF); webClient = new WebClient(bv); webClient.setCssEnabled(false); webClient.setIncorrectnessListener(new IncorrectnessListener() { @Override public
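The java.util.logging part of that answer can be tried on its own, without HtmlUnit on the classpath. The following is a minimal sketch, not the answerer's full solution: the logger names are the two quoted above, while the class name QuietHtmlUnitLogs and the silence() helper are made up for illustration. It only affects output routed through java.util.logging; the Commons Logging setting from the answer is not covered here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper class; only the logger names come from the answer above.
public class QuietHtmlUnitLogs {

    // Logger names quoted in the answer above.
    private static final String[] NOISY = {
        "com.gargoylesoftware.htmlunit",
        "org.apache.commons.httpclient"
    };

    // java.util.logging keeps only weak references to loggers, so a level set
    // on a logger nobody references can be lost after garbage collection.
    // Holding the loggers here keeps the OFF setting alive.
    private static final List<Logger> PINNED = new ArrayList<>();

    // Turn the listed loggers off and pin them.
    public static void silence() {
        for (String name : NOISY) {
            Logger logger = Logger.getLogger(name);
            logger.setLevel(Level.OFF);
            PINNED.add(logger);
        }
    }

    public static void main(String[] args) {
        silence();
        for (String name : NOISY) {
            System.out.println(name + " -> " + Logger.getLogger(name).getLevel());
        }
    }
}
```

Calling QuietHtmlUnitLogs.silence() before the WebClient is created would match the answer's advice to put this "somewhere around the start of your code".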

Get the changed HTML content after it's updated by JavaScript? (HtmlUnit)

拈花ヽ惹草 submitted on 2019-11-26 13:44:25
I'm having some trouble figuring out how to get the content of some HTML after JavaScript has updated it. Specifically, I'm trying to get the current time from the US Naval Observatory Master Clock. It has an h1 element with the ID USNOclk in which it displays the current time. When the page first loads, this element is set to display "Loading...", and then JavaScript kicks in and updates it to the current time via function showTime() { document.getElementById('USNOclk').innerHTML="Loading...<br />"; xmlHttp=GetXmlHttpObject(); if (xmlHttp==null){ document.getElementById('USNOclk').innerHTML=

Android Web Scraping with a Headless Browser [closed]

社会主义新天地 submitted on 2019-11-26 10:19:03
Question: I have spent a day researching a library that can be used to accomplish the following: retrieve the full contents of a webpage in the background, without rendering the result to a view. The library should support pages that fire off Ajax requests to load additional result data after the initial HTML has loaded, for example. From the resulting HTML I need to grab elements by XPath or CSS selector. In the future I may also need to navigate to the next page (fire off events, submitting

HTMLUnit doesn't wait for Javascript

空扰寡人 submitted on 2019-11-26 06:39:40
Question: I have a GWT-based page that I would like to create an HTML snapshot of using HtmlUnit. The page loads product information using Ajax/JavaScript, so for about 1 second there is a Loading... message and then the content appears. The problem is that HtmlUnit doesn't seem to capture the information, and all I'm getting is the "Loading..." span. Below is some experimental code with HtmlUnit where I try to give it enough time to wait for the loading of the data, but it doesn't seem to
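One common way to approach the situation described in this question is to poll for the expected change instead of sleeping for a fixed time. The sketch below shows that idea in plain Java, with no HtmlUnit dependency: PollUntil and waitUntil are hypothetical names, and the "page" is simulated by a background thread that replaces a "Loading..." string. With HtmlUnit, the condition would presumably inspect the page text, and WebClient.waitForBackgroundJavaScript(long) could be called between checks. This is an assumption about usage, not the asker's code.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

// Hypothetical polling helper; not part of HtmlUnit.
public class PollUntil {

    // Re-check the condition every pollMillis until it holds or timeoutMillis
    // has elapsed. Returns true if the condition held, false on timeout or
    // if the waiting thread was interrupted.
    public static boolean waitUntil(BooleanSupplier condition,
                                    long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for an element whose text changes once an Ajax call returns.
        AtomicReference<String> text = new AtomicReference<>("Loading...");
        Thread loader = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                return;
            }
            text.set("product details");
        });
        loader.start();

        boolean loaded = waitUntil(() -> !text.get().contains("Loading..."), 5000, 50);
        System.out.println("loaded=" + loaded + ", text=" + text.get());
    }
}
```

Polling against an explicit condition keeps the wait as short as the page allows and makes a timeout distinguishable from a page that really has nothing more to load.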