Prevent HtmlUnit 2.13 from executing JavaScript

Posted by 柔情痞子 on 2021-02-07 05:06:32

Question


Here is my code to get the page:

WebClient webClient = new WebClient();
HtmlPage page = webClient.getPage(url);

The problem is that the WebClient always executes JavaScript automatically and throws a list of errors at me. I just want to get the raw source. How can I prevent it from executing scripts? I found that version 2.9 offers a way:

webClient.setJavaScriptEnabled(false);

But the setJavaScriptEnabled() method on WebClient has been deprecated. Does anyone know how to solve this problem? Please help me. Thank you so much.


Answer 1:


Although WebClient.setJavaScriptEnabled(boolean) was deprecated, the same setter was added to the WebClientOptions member of the WebClient, which you obtain through getOptions(). See the WebClientOptions documentation for details.

To disable JavaScript, call the setter on the options object:

webClient.getOptions().setJavaScriptEnabled(false);

Additionally, if you want to get the raw HTML code of the web page, take a look at this question:

How to get the pure raw HTML of a page in HTMLUnit while ignoring JavaScript and CSS?

Keep in mind that even the asXml() method changes the formatting as well as the content of the web page you fetch (even if JavaScript is disabled).
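Putting the pieces together, here is a minimal sketch of one way to fetch the unmodified source. It assumes HtmlUnit 2.13 on the classpath (package com.gargoylesoftware.htmlunit). The class name RawSourceFetcher and the helper fetchRawSource are hypothetical names chosen for illustration. Instead of asXml(), it reads the body from the WebResponse, which should be the text as the server sent it. Disabling CSS is an optional extra and not required for the fix.

```java
import com.gargoylesoftware.htmlunit.Page;
import com.gargoylesoftware.htmlunit.WebClient;
import java.io.IOException;

public class RawSourceFetcher {

    // Hypothetical helper: returns the response body without running scripts.
    static String fetchRawSource(WebClient webClient, String url) throws IOException {
        // Replacement for the deprecated WebClient.setJavaScriptEnabled(false).
        webClient.getOptions().setJavaScriptEnabled(false);
        // Optional: skip CSS processing too, since only the source is wanted.
        webClient.getOptions().setCssEnabled(false);

        Page page = webClient.getPage(url);
        // Read the body from the response rather than from the parsed DOM,
        // so HtmlUnit's reformatting (as done by asXml()) is not applied.
        return page.getWebResponse().getContentAsString();
    }

    public static void main(String[] args) throws IOException {
        WebClient webClient = new WebClient();
        try {
            // The URL is taken from the first command-line argument.
            System.out.println(fetchRawSource(webClient, args[0]));
        } finally {
            // closeAllWindows() is the cleanup call in 2.13; newer releases
            // use close() instead.
            webClient.closeAllWindows();
        }
    }
}
```

Run it with the target URL as the only argument. If you are on a newer HtmlUnit release, check whether closeAllWindows() is still available before copying the cleanup code.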



Source: https://stackoverflow.com/questions/20045951/prevent-htmlunit-2-13-from-executing-javascript
