I have a JSON file containing quite a lot of test data, which I want to parse and push through an algorithm I\'m testing. It\'s about 30MB in size, with a list of 60,000 or so e
Those messages indicate that the application is spending more than 98% of its time collecting garbage.
I'd suspect that Scala is generating a lot of short-lived objects, which is what is causing the excessive GCs. You can verify the GC performance by adding the -verbosegc command line switch to java.
The default max heap size on Java 1.5+ server VM is 1 GB (or 1/4 of installed memory, whichever is less), which should be sufficient for your purposes, but you may want to increase the new generation to see if that improves your performance. On the Oracle VM, this is done with the -Xmn option. Try setting the following environment variable:
$JAVA_OPTS=-server -Xmx1024m -Xms1024m -Xmn2m -verbosegc -XX:+PrintGCDetails
and re-running your application.
You should also check out this tuning guide for details.
Try using Jerkson instead. Jerkson uses Jackson underneath, which repeatedly scores as the fastest and most memory efficient JSON parser on the JVM.
I've used both Lift JSON and Jerkson in production, and Jerkson's performance was signficantly better than Lift's (especially when parsing and generating large JSON documents).