I am currently using HtmlUnit and Selenium to drive it (WebDriver) within my production code.
I am scaping and interacting with various websites programmatically wi
I'm using HtmlUnit for something similar in production and have had quite a bit of issues - mostly performance related. Currently I switched to snapshot version of HtmlUnit 2.10 where some important for me performance improvements were implemented (e.g. replacing ArrayList.contains() with HashSet.contains() on DomNode.addDomChangeListener()).
Still, the CPU load is quite high on JavaScript-heavy pages. Typically, I can't run more than 10 of them simultaneously on dual core Linux box. I believe HtmlUnit using Rhino (JavaScript engine) in interpreter mode only, which is pretty slow. Also, you need to be careful with releasing all resources used by HtmlUnit to avoid memory leaks.
All in all, it certainly noticeable that HtmlUnit was designed to run relatively short lived test cases and not long running server applications. It's possible to tweak it enough so it's manageable but certainly it could have been better.
Another approach I found promising is phantom-js, which is headless version of WebKit browser, native app which is much faster on running JavaScript.