I have a list of work units that I want to process in parallel. Each unit takes 8-15 seconds of pure computation, with no blocking I/O. What I want to achieve is to ha