I'll put in another answer for valgrind, especially the callgrind portion with the UI. It can handle multiple threads by profiling each thread for cache misses, etc. They also have a multi-thread error checker called helgrind, but I've never used it and don't know how good it is.