Question
So far I have working code that uses HtmlUnit to get a page via asXml().
However, I find that it processes everything on the page, including Shockwave Flash objects, which makes the processing slow.
I just need it to process the plain HTML and JavaScript, so that it will be faster.
This is my code:
HtmlPage page = webClient.getPage(sb.toString());
webClient.getJavaScriptEngine().pumpEventLoop(PUMP_TIME);
pageString = page.asXml();
The call to page.asXml() is quite slow, perhaps for the reasons stated above.
Is there a way to tell HtmlUnit not to process unnecessary parts of the page?
This is where I see the page processing get stuck for quite some time (it happens many times):
[INFO] SEVERE: runtimeError: message=[Automation server can't create object for 'ShockwaveFlash.ShockwaveFlash'.] sourceName=[http://partner.googleadservices.com/gampad/google_ads_gpt.js] line=[9] lineSource=[null] lineOffset=[0]
- Also, does HtmlUnit load CSS and images into memory too?
Answer 1:
HtmlUnit can't process Flash. It does take a lot of time to process JS, though. The JS is probably fetching something over the network, and that takes additional time. Anyway, note that the log entry is actually an INFO and not a SEVERE: it is simply telling you that no Flash object is being created.
I would recommend avoiding JS processing, if possible.
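If JavaScript cannot be turned off entirely (the question does need it), one possible middle ground is to skip only the requests that go to ad hosts, such as the one named in the log. The sketch below is not HtmlUnit API: `RequestFilter` and `shouldBlock` are hypothetical names, and the list of blocked hosts is an assumption to be adjusted per site. In HtmlUnit, a check like this would typically be called from a `WebConnectionWrapper` subclass that overrides `getResponse(WebRequest)`; that wiring is left out here and should be verified against the HtmlUnit version in use. To disable JS and CSS completely, recent HtmlUnit versions expose `webClient.getOptions().setJavaScriptEnabled(false)` and `setCssEnabled(false)`.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of HtmlUnit: decides which request URLs to skip.
final class RequestFilter {
    // Hosts assumed to serve only ads or trackers (an assumption, adjust as needed).
    private final Set<String> blockedHosts;

    RequestFilter(Set<String> blockedHosts) {
        this.blockedHosts = blockedHosts;
    }

    /** Returns true when the URL's host is a blocked host or a subdomain of one. */
    boolean shouldBlock(String url) {
        String host;
        try {
            host = URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return false; // malformed URL: do not block, let the caller handle it
        }
        if (host == null) {
            return false; // no host component (for example, a relative URL)
        }
        host = host.toLowerCase(Locale.ROOT);
        for (String blocked : blockedHosts) {
            if (host.equals(blocked) || host.endsWith("." + blocked)) {
                return true;
            }
        }
        return false;
    }
}
```

With `googleadservices.com` in the blocked set, the script URL from the log above would be skipped, while scripts served from the page's own host would still be loaded and executed.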
Source: https://stackoverflow.com/questions/16644502/faster-page-processing-with-htmlunit