HtmlUnit is throwing Out Of Memory and maybe leaking memory

Posted by 拜拜、爱过 on 2019-12-01 11:19:34
Mosty Mostacho

I've had a similar issue. It turned out to be caused by the auto-loading of frames, a feature that can't be disabled.

Take a look at this question: "Extremely simple code not working in HtmlUnit".

It might be of help.

Update

The current version of HtmlUnit is 2.10. I started using HtmlUnit at version 2.8, and each new version ended up eating more memory. I got to the point where fetching 5 pages with JavaScript enabled resulted in a process of 2GB.
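
When chasing growth like this, it can help to see how much of it is Java heap. The sketch below samples the heap in use after each step using only the standard `Runtime` API. `HeapSampler` and the byte-array "pages" are hypothetical stand-ins for the real fetches, not part of HtmlUnit.

```java
// Hypothetical helper, not part of HtmlUnit: samples the Java heap in use.
class HeapSampler {
    /** Returns the heap currently in use, in bytes (total allocated minus free). */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();
        // Stand-in for fetching pages; replace with the real work being measured.
        byte[][] pages = new byte[5][];
        for (int i = 0; i < pages.length; i++) {
            pages[i] = new byte[1024 * 1024];
            System.out.println("after fetch " + (i + 1) + ": "
                    + usedHeapBytes() / (1024 * 1024) + " MB used");
        }
        System.out.println("started at " + before / (1024 * 1024) + " MB used");
    }
}
```

The process size reported by the operating system also includes memory outside the Java heap, so these numbers will usually be lower than what `top` or `ps` shows.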

There are many ways to improve this situation from a JavaScript point of view. However, when you can't modify the JavaScript (e.g. if you are crawling a site), your hands are tied. Disabling JavaScript is, of course, the best way to go, but the fetched pages might then differ from the expected ones.

I did manage to overcome this situation, though. After many tests, I noticed that it might not be an issue with HtmlUnit (which I had blamed from the beginning). It seemed to be the JVM. Changing from Sun's JVM to OpenJDK did the trick: the process now requires only 200MB instead of eating 2GB of memory. I'm adding version information.

Sun's (Oracle) 32-bit JVM:

$ java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) Server VM (build 20.1-b02, mixed mode)

OpenJDK 32-bit JVM:

$ java -version
java version "1.6.0_18"
OpenJDK Runtime Environment (IcedTea6 1.8.13) (6b18-1.8.13-0+squeeze2)
OpenJDK Server VM (build 14.0-b16, mixed mode)

Operating system:

$ uname -a
Linux vostro1015 2.6.32-5-686-bigmem #1 SMP Sun May 6 04:39:05 UTC 2012 i686 GNU/Linux
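
When comparing JVMs like this, it may be worth logging from inside the process which VM is actually running, since the `java` on the shell path is not always the one a build tool or IDE launches. A minimal sketch using standard system properties (the class name `JvmInfo` is hypothetical):

```java
// Hypothetical helper: reports which JVM the current process is running on.
class JvmInfo {
    /** Builds a one-line description from standard system properties. */
    static String describe() {
        return System.getProperty("java.vm.name") + " "
                + System.getProperty("java.vm.version")
                + " (java " + System.getProperty("java.version") + ", "
                + System.getProperty("os.arch") + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```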

Please share your experience with this.

Give more memory to the JVM by adding this to the java command line that starts the JVM in which Selenium is running:

-Xmx512m

This example gives the JVM a maximum heap of 512 MB.

Where you set this depends on where you're running Selenium from. With Maven, you can add it to the MAVEN_OPTS environment variable. With Eclipse, you'll need to edit the run configuration for the test class, and so on.
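
Because the option can be set in several places, it is easy to set it somewhere that the test JVM never reads. A minimal sketch for checking the limit from inside the process (the class name `HeapLimitCheck` is hypothetical):

```java
// Hypothetical helper: prints the maximum heap the running JVM will try to use.
class HeapLimitCheck {
    /** Returns the JVM's maximum heap size in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Run it with `java -Xmx512m HeapLimitCheck`, or call `maxHeapMegabytes()` at the start of a test. The reported value may be somewhat below 512, depending on the JVM and garbage collector.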

Related to HtmlUnit:

Do not forget to call webClient.closeAllWindows();. I always put it in a finally block around the code that uses the webClient. This ensures that all JavaScript is stopped and all resources are released.

Also useful are these settings for the webClient:

    webClient.setJavaScriptTimeout(JAVASCRIPT_TIMEOUT);
    webClient.setTimeout(WEB_TIMEOUT);
    webClient.setCssEnabled(false);  // for most pages you do not need CSS to be enabled
    webClient.setThrowExceptionOnScriptError(false); // I never want exceptions because of JavaScript

JAVASCRIPT_TIMEOUT should not be too high, since long-running JavaScript may be a cause of memory problems. For WEB_TIMEOUT, think about the maximum time you are willing to wait.
