Which Java synchronization construct is likely to provide the best performance for a concurrent, iterative processing scenario with a fixed number of threads like the one outli
I just came across this thread, and even though it's almost a year old, let me point you to the "jbarrier" library we developed at the University of Bonn a few months ago:
http://net.cs.uni-bonn.de/wg/cs/applications/jbarrier/
The barrier package targets exactly the case where the number of worker threads is at most the number of cores. It is based on busy-waiting, and it supports not only barrier actions but also global reductions. Apart from a central barrier, it offers tree-structured barriers that parallelize the synchronization/reduction steps even further.
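To give a feel for what a busy-wait central barrier with a barrier action and a global reduction might look like, here is a minimal sketch. It is not the jbarrier API: the class `SpinBarrierSketch`, the nested `SpinBarrier`, and the `run` helper are hypothetical names made up for illustration, and the sense-reversing scheme is just one common way to build such a barrier. It assumes Java 9+ (for `Thread.onSpinWait()`) and only makes sense when every thread has its own core.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; NOT the jbarrier API. All names are hypothetical.
public class SpinBarrierSketch {

    // Central sense-reversing barrier that spins instead of blocking.
    static final class SpinBarrier {
        private final int parties;
        private final Runnable action;          // runs once per round, before release
        private final AtomicInteger remaining;
        private volatile boolean sense = false; // flips each time a round completes

        SpinBarrier(int parties, Runnable action) {
            this.parties = parties;
            this.action = action;
            this.remaining = new AtomicInteger(parties);
        }

        void await() {
            boolean target = !sense;            // value that signals "round done"
            if (remaining.decrementAndGet() == 0) {
                action.run();                   // last arriver runs the barrier action
                remaining.set(parties);         // reset the counter for the next round
                sense = target;                 // volatile write releases the spinners
            } else {
                while (sense != target) {
                    Thread.onSpinWait();        // busy-wait hint, Java 9+
                }
            }
        }
    }

    // Each worker publishes a partial value per round; the barrier action
    // reduces (sums) the partials into a global total.
    static long run(int threads, int rounds) {
        long[] partial = new long[threads];
        long[] total = new long[1];
        SpinBarrier barrier = new SpinBarrier(threads, () -> {
            for (long p : partial) {
                total[0] += p;
            }
        });
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int r = 0; r < rounds; r++) {
                    partial[id] = id + 1;       // this round's "work"
                    barrier.await();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println(run(4, 100)); // prints 1000
    }
}
```

With more threads than cores the spinning threads steal time from the ones that still have work to do, which is why a blocking construct such as `java.util.concurrent.CyclicBarrier` is usually the safer choice in that situation.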