I'm running CF 9.0.1 on Ubuntu on a "Medium" Amazon EC2 instance. CF has been seizing up intermittently (several times per day...but notably not isolated to hours of pea
I've had a number of 'high CPU in production' type bugs, and the way I've always dealt with them is this:
1. Use jstack PID >> stack.log to dump 5 stack traces, about 5 seconds apart. The number of traces and the timing are not critical.
2. Open the log in Samurai. You get a view of the threads at each time you did a dump. Threads processing your code start with web- (for requests using the built-in server) and jrpp- (for requests coming in through Apache/IIS).
3. Read the history of each thread. You're looking for a stack that is very similar in each dump. If a thread looks like it's handling the same request the whole time, the bits that vary near the top will point to where an infinite loop is happening.
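If you'd rather script the steps above than eyeball them, here is a rough Python sketch. It is not a replacement for Samurai, just an illustration of the comparison. It assumes jstack is on the PATH and that the dump has the usual HotSpot layout (a quoted thread name on the header line, frames on lines starting with "at", a blank line between threads). The function names (capture_dumps, parse_dump, summarize) are made up for this example.

```python
import re
import subprocess
import time

THREAD_HEADER = re.compile(r'^"([^"]+)"')


def capture_dumps(pid, count=5, interval=5.0, run=subprocess.run, sleep=time.sleep):
    """Run `jstack <pid>` `count` times, `interval` seconds apart; return the raw dumps."""
    dumps = []
    for i in range(count):
        result = run(["jstack", str(pid)], capture_output=True, text=True, check=True)
        dumps.append(result.stdout)
        if i < count - 1:
            sleep(interval)
    return dumps


def parse_dump(dump):
    """Map thread name -> list of "at ..." frame lines (top of stack first) for one dump."""
    threads = {}
    current = None
    for line in dump.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = header.group(1)
            threads[current] = []
        elif current is not None and line.strip().startswith("at "):
            threads[current].append(line.strip())
        elif not line.strip():
            current = None  # a blank line ends the current thread's block
    return threads


def common_suffix_len(stacks):
    """Number of frames, counted from the bottom of the stack, shared by every stack."""
    shared = 0
    for frames in zip(*(reversed(s) for s in stacks)):
        if len(set(frames)) != 1:
            break
        shared += 1
    return shared


def summarize(dumps, prefixes=("web-", "jrpp-")):
    """For request threads present in every dump, report shared bottom frames and varying tops."""
    parsed = [parse_dump(d) for d in dumps]
    report = {}
    for name in parsed[0]:
        if not name.startswith(prefixes) or not all(name in p for p in parsed):
            continue
        stacks = [p[name] for p in parsed]
        if not all(stacks):
            continue  # skip threads that had no frames in some dump
        shared = common_suffix_len(stacks)
        report[name] = {
            "shared_bottom": shared,
            "varying_top": [s[: len(s) - shared] for s in stacks],
        }
    return report
```

Something like summarize(capture_dumps(pid)) gives you, per web-/jrpp- thread, how many bottom frames stayed the same across all dumps and which top frames changed. A thread with a deep shared bottom and a small varying top is the one to read closely. Idle pool threads will also look identical across dumps, so you still need to look at what the frames actually are.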
Feel free to dump a stack trace somewhere online and point us to it.
The other technique I've used to understand what's going on is to modify Apache's httpd.conf to log the time taken (%D, in microseconds) and the session id (%{jsessionid}C, which reads it from the cookie). That lets you trace individual users in the run-up to hangs and do some nice stats/graphs with the data (I use LogParser to crunch the numbers and output CSV, followed by Excel to graph the data):
LogFormat "%h %l %u %t \"%r\" %>s %b %D %{jsessionid}C" customAnalysis
CustomLog logs/analysis_log customAnalysis
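If you don't have LogParser to hand, the same number crunching can be sketched in Python. This assumes the field order of the LogFormat line above, with the time taken (%D, which Apache reports in microseconds) as the second-to-last field and the session id as the last one. The function names and the 5 second threshold are just illustrative choices.

```python
import re
from collections import defaultdict

# Field order follows the customAnalysis LogFormat: host, ident, user, time,
# request line, status, bytes, time taken in microseconds, session id.
LINE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>(?:[^"\\]|\\.)*)" (?P<status>\d{3}) (?P<bytes>\S+) '
    r'(?P<micros>\d+) (?P<session>\S+)'
)


def parse_log(lines):
    """Parse log lines into dicts, silently skipping lines that don't match."""
    records = []
    for line in lines:
        match = LINE.match(line)
        if match:
            record = match.groupdict()
            record["micros"] = int(record["micros"])
            records.append(record)
    return records


def slow_requests(records, threshold_seconds=5.0):
    """Requests that took at least `threshold_seconds` to serve."""
    limit = threshold_seconds * 1_000_000
    return [r for r in records if r["micros"] >= limit]


def time_per_session(records):
    """Total microseconds spent serving each session id."""
    totals = defaultdict(int)
    for record in records:
        totals[record["session"]] += record["micros"]
    return dict(totals)
```

You would feed it the open log file, e.g. parse_log(open("logs/analysis_log")), then look at the slow requests and the sessions that own them in the minutes before a hang.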
One other technique I've just remembered is to enable CF Metrics, which gives you some measure of what the server was up to in the run-up to a hang. I set this to log every 10 seconds and change the format to CSV, so I can grep the metrics out of the event log and then run them through Excel to graph server load in the run-up to crashes.
Barny
A few weeks ago, I had a server that kept maxing out the CPU utilization on the JRun process. I would periodically restart it, only to have it ramp right back up to 100% within 24 hours. I fussed with JVM settings and the like too, until finally discovering, much to my embarrassed surprise, an infinite loop in my code. There was a WHILE loop with a condition that would never fail to be met. Oops.
So maybe you made a simple mistake in your code, and it's got nothing to do with the server config, fwiw.
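For what it's worth, one cheap guard against that kind of bug is to cap the number of iterations so the loop fails loudly instead of pinning the CPU. A minimal sketch in Python (not CFML, and not the code from my server; run_bounded and the default cap are invented for the example):

```python
def run_bounded(condition, body, max_iterations=100_000):
    """Run body() while condition() is true, but raise instead of spinning forever."""
    iterations = 0
    while condition():
        if iterations >= max_iterations:
            raise RuntimeError(
                f"loop exceeded {max_iterations} iterations; "
                "the condition never became false"
            )
        body()
        iterations += 1
    return iterations
```

The error then shows up in your logs with a stack trace pointing at the loop, which is a lot easier to find than a slowly climbing CPU graph.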
+1 for the FusionReactor demo. That'll at least give you some clues.
To find out what is maxing out your procs requires a lot of information that is "internal" to your system. It's hard to do it from outside by looking at things like queued requests etc. One thing is certain: altering the simultaneous requests setting to a very high number is not going to do the trick :) All it will do is remove something that is designed to keep CF from glomming onto too much processor.
Here's my list of things that max out CPU usage.
There are many other reasons this can happen, among them (as you surmise) code issues that crop up as certain scripts are run. Long-running requests, file uploads, heavy-lifting scheduled tasks, indexing bots generating traffic or spawning too many sessions... the list goes on.
Hopefully something on the list I provided will strike you as a possibility. Good luck.
(And yes, FR or even the CF monitor are good tools to help you tease all this out :)
Have you tried using the ColdFusion Server Monitor that comes with ColdFusion?
You would have to increase the active thread pool size. Please check the links below:
http://www.talkingtree.com/blog/index.cfm/2005/11/28/Request-timed-out-waiting-for-an-available-thread-to-run
http://helpx.adobe.com/coldfusion/kb/coldfusion-mx-6-1-request.html
Happy coding!!!