Faster page processing with HtmlUnit

Submitted by 让人想犯罪 __ on 2019-12-25 04:51:29

Question


So far I have working code that uses HtmlUnit to get a page as XML.

However, I find that it processes everything on the page, including Shockwave Flash objects, which makes the processing slow.

I only need it to process the plain HTML and JavaScript, so that it runs faster.

This is my code:

        HtmlPage page = webClient.getPage(sb.toString());
        webClient.getJavaScriptEngine().pumpEventLoop(PUMP_TIME);
        pageString = page.asXml();

page.asXml() is quite slow, maybe because of the points I stated above?

Is there a way to tell HtmlUnit not to process unnecessary parts of the page?

This is where I see the page processing get stuck for quite some time (it happens many times):

[INFO] SEVERE: runtimeError: message=[Automation server can't create object for 'ShockwaveFlash.ShockwaveFlash'.] sourceName=[http://partner.googleadservices.com/gampad/google_ads_gpt.js] line=[9] lineSource=[null] lineOffset=[0]
  • Also, does HtmlUnit load CSS and images into memory too?

Answer 1:


HtmlUnit can't process Flash. It does take a lot of time to process JS, though. The JS is probably fetching something from the network, which takes additional time. Also note that the log entry is actually an INFO and not a SEVERE; it is basically telling you that no Flash object is being created.

I would recommend avoiding JS processing altogether, if possible.
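
If JavaScript cannot be switched off entirely (in HtmlUnit 2.x that is `webClient.getOptions().setJavaScriptEnabled(false)`), a possible middle ground is to skip only the requests that are not needed, such as the ad script named in the log or any `.swf` file. The sketch below is plain Java and shows only the decision logic. `ResourceFilter` and `shouldSkip` are hypothetical names, not part of HtmlUnit, and the blocked host and extension are assumptions taken from the log line in the question. One place such a check could plausibly be called from is a subclass of HtmlUnit's `WebConnectionWrapper` that overrides `getResponse(WebRequest)`, but that wiring is not shown here.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper (not part of HtmlUnit): decides which URLs are not worth fetching. */
final class ResourceFilter {
    private final Set<String> blockedHosts;      // exact host names, lower case
    private final Set<String> blockedExtensions; // file suffixes such as ".swf", lower case

    ResourceFilter(Set<String> blockedHosts, Set<String> blockedExtensions) {
        this.blockedHosts = blockedHosts;
        this.blockedExtensions = blockedExtensions;
    }

    /** Returns true when the URL matches a blocked host or a blocked file extension. */
    boolean shouldSkip(URI uri) {
        String host = uri.getHost();
        if (host != null && blockedHosts.contains(host.toLowerCase(Locale.ROOT))) {
            return true;
        }
        // getPath() excludes the query string, so "/a.swf?x=1" still matches ".swf".
        String path = uri.getPath();
        if (path == null) {
            return false;
        }
        String lowerPath = path.toLowerCase(Locale.ROOT);
        for (String ext : blockedExtensions) {
            if (lowerPath.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed block list: the ad host from the log in the question, plus Flash files.
        ResourceFilter filter = new ResourceFilter(
                Set.of("partner.googleadservices.com"), Set.of(".swf"));
        System.out.println(filter.shouldSkip(URI.create(
                "http://partner.googleadservices.com/gampad/google_ads_gpt.js")));
        System.out.println(filter.shouldSkip(URI.create("http://example.com/index.html")));
    }
}
```

Matching on exact host names keeps the filter predictable. Whether skipping a given script is safe depends on the page, since other scripts may rely on it.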



Source: https://stackoverflow.com/questions/16644502/faster-page-processing-with-htmlunit
