htmlunit

Getting error “Provider com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl not found” in unit test but not in main program

旧城冷巷雨未停 submitted on 2019-12-21 10:51:40
Question: I am building an application in C# which uses com.gargoylesoftware.htmlunit.WebClient to access and retrieve information from webpages. My application runs fine from the main project, but when I try to build unit tests for the project classes I get the following error: FactoryConfigurationError Message "Provider com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl not found" Source "IKVM.OpenJDK.XML.API" string StackTrace " at javax.xml.parsers.DocumentBuilderFactory

htmlunit: return a completely loaded page

依然范特西╮ submitted on 2019-12-20 14:31:50
Question: I am using the HtmlUnit library for Java to manipulate websites programmatically. I can't find a working solution to my problem: how do I determine that all AJAX calls have finished and return a completely loaded webpage? Here's what I have tried. First I create a WebClient instance and call my method processWebPage(String url, WebClient webClient): WebClient webClient = null; try { webClient = new WebClient(BrowserVersion.FIREFOX_3_6); webClient.setThrowExceptionOnScriptError(false);

Does HTMLUnit include a functional [HTML5] canvas 2D implementation able to render image data back to Java Code?

穿精又带淫゛_ submitted on 2019-12-20 06:22:02
Question: Basically, I'd like to be able to retrieve the HTML[5] canvas image data created by normal, JavaScript-based in-browser scripting. I'd like to do this in a screen-scraping-type environment, from within [pure] Java code. HTMLUnit appears to fit some of the requirements. How would I go about retrieving the canvas-rendered image data, and how complete, or not, might HTMLUnit's canvas implementation currently be (version 2.13 at time of writing)? Two (2) HTMLUnit classes of note

Login to Google Account using HtmlUnit

扶醉桌前 submitted on 2019-12-20 04:19:33
Question: I'm trying to log in to a Google Account through HtmlUnit, but something is still wrong and I keep getting the login page. What am I doing wrong? The steps are: set the email; click the Next button; set the password; click the login button; go to the GMail page, which is still the login page (output below). My example code: WebClient client = new WebClient(BrowserVersion.CHROME); client.setHTMLParserListener(HTMLParserListener.LOG_REPORTER); client.setJavaScriptEngine(new JavaScriptEngine(client)); client.getOptions().setJavaScriptEnabled(true);

Cannot log in programmatically to Facebook using HtmlUnit

懵懂的女人 submitted on 2019-12-20 01:42:11
Question: I have tried the code given in "HTMLunit - Facebook Login" and "Using HTMLUnit to log into Facebook programmatically using Java". However, I am not logged in to Facebook. With JavaScript enabled, using webClient.setJavaScriptEngine(new JavaScriptEngine(webClient)); webClient.getOptions().setJavaScriptEnabled(true); I get the page https://www.facebook.com/login.php?login_attempt=1&lwv=110. HtmlUnit also reports several warnings and errors: WARN: (HtmlScript.java:472): Script is not JavaScript (type

HtmlUnit to invoke JavaScript from href to download a file

送分小仙女□ submitted on 2019-12-19 11:17:16
Question: I have tried to download a file that apparently has to be clicked via a browser. The site uses a form containing several hrefs that call a JavaScript function named downloadFile. In this function, the element named poslimit is obtained by document.getElementById: function downloadFile(actionUrl, formId) { document.getElementById(formId).action=actionUrl; document.getElementById(formId).submit(); } The HTML source snippet: <form method="post" name="commandForm" action="position-limits" id=

HtmlUnit to invoke JavaScript from href to download a file

删除回忆录丶 submitted on 2019-12-19 11:17:09
Question: I have tried to download a file that apparently has to be clicked via a browser. The site uses a form containing several hrefs that call a JavaScript function named downloadFile. In this function, the element named poslimit is obtained by document.getElementById: function downloadFile(actionUrl, formId) { document.getElementById(formId).action=actionUrl; document.getElementById(formId).submit(); } The HTML source snippet: <form method="post" name="commandForm" action="position-limits" id=

Select default namespace in XPath with HtmlUnit

落爺英雄遲暮 submitted on 2019-12-19 10:14:03
Question: I want to parse a Feedburner feed with HtmlUnit. The feed is this one: http://feeds.feedburner.com/alcoanewsreleases From this feed I want to read all item nodes, so normally a //item XPath should do the trick. Unfortunately, that does not work in this case. Groovy code snippet: def page = webClient.getPage("http://feeds.feedburner.com/alcoanewsreleases") def elements = page.getByXPath("//item") Sample of the XML feed: <?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl"
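
If the cause is a default namespace, as the title suggests, the effect can be reproduced with the JDK's own XPath engine rather than HtmlUnit. The sketch below is only an illustration: the class name, the SAMPLE document, and its namespace URI are hypothetical and are not taken from the feed above. It shows that, with namespace-aware parsing, a plain //item matches nothing when the elements sit in a default namespace, while matching on local-name() finds them.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DefaultNamespaceXPath {

    // Hypothetical document: every element is in the default namespace.
    static final String SAMPLE =
            "<feed xmlns=\"http://example.com/ns\"><item>a</item><item>b</item></feed>";

    // Counts the nodes that the XPath expression selects in the given XML text.
    static int count(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespace-correct XPath
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed names in XPath 1.0 only match elements in no namespace.
        System.out.println(count(SAMPLE, "//item"));
        // local-name() ignores the namespace and compares the bare element name.
        System.out.println(count(SAMPLE, "//*[local-name()='item']"));
    }
}
```

Whether the same expression helps with HtmlUnit's getByXPath on this particular feed is not confirmed by the excerpt; the alternative with plain JAXP is to register a NamespaceContext and use a prefixed name.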

Html, handling a JSON response

此生再无相见时 submitted on 2019-12-19 05:23:31
Question: I have a page that comes back as an UnexpectedPage in HtmlUnit; the response is JSON. Can I use HTMLUnit to parse this, or will I need an additional library? Answer 1: HtmlUnit doesn't support it. At most, it can execute a JS function. You need to check beforehand whether the Content-Type of the returned response matches application/json and then use a suitable tool to parse it. Google Gson is useful for this. WebClient client = new WebClient(); Page page = client.getPage("https://stackoverflow.com
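
The Content-Type check described in the answer can be sketched without any HtmlUnit dependency. This is a minimal illustration, not the answerer's code: the class and method names are hypothetical, and the header string is assumed to come from the response (in HtmlUnit, presumably via page.getWebResponse().getContentType()). Parameters such as a charset are stripped before comparing.

```java
import java.util.Locale;

final class JsonContentType {

    // Returns true when the Content-Type header value denotes application/json.
    // Parameters after ';' (for example "charset=UTF-8") are ignored.
    static boolean isJson(String contentType) {
        if (contentType == null) {
            return false;
        }
        int semicolon = contentType.indexOf(';');
        String mediaType = semicolon >= 0
                ? contentType.substring(0, semicolon)
                : contentType;
        // Media types are case-insensitive, so normalise before comparing.
        return mediaType.trim().toLowerCase(Locale.ROOT).equals("application/json");
    }

    public static void main(String[] args) {
        System.out.println(isJson("application/json; charset=UTF-8"));
        System.out.println(isJson("text/html"));
    }
}
```

Only when this check passes would the response body be handed to a JSON parser such as Gson, as the answer suggests.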

Example HtmlUnit Test Failing

前提是你 submitted on 2019-12-19 04:37:08
Question: I am trying to run the sample HtmlUnit test case via JUnit. My project is Maven based. Do I need to add ALL the dependencies listed under compile and test to my POM? http://htmlunit.sourceforge.net/dependencies.html Right now I have added the htmlunit dependencies, httpconnections and nekohtml. Sample test: @Test public void homePage() throws Exception { WebClient webClient = new WebClient(BrowserVersion.CHROME_16); final HtmlPage page = webClient.getPage("http://htmlunit.sourceforge.net");