Most of the concurrent programs I have written were in Ada, which supports parallelism natively in the language through its built-in tasking constructs. One of the nice benefits of this is that your parallel code is portable to any system with an Ada compiler. No special library required.