I've implemented the Barnes-Hut gravity algorithm in C as follows:
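If your data is read-only, then no, you do not need to make a private copy of the tree for each thread. As long as no thread modifies the tree while the others are reading it, every thread can traverse the same copy without locking. This is the biggest advantage that a shared memory threading model offers!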
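I'm not aware of any performance problems with such a model. If anything, sharing one tree should be faster than keeping per-thread copies when your cores share a level of cache, because nodes loaded by one thread can be reused by the others and less memory is touched overall.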
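To make this concrete, here is a minimal sketch using POSIX threads. It is not your Barnes-Hut code: `Node`, `Task`, `sum_mass` and `traverse_shared` are hypothetical names, and the node is reduced to a mass and two children for brevity. The point is only that every thread receives the same `const` pointer to the tree and writes only to its own result slot.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified node. A real Barnes-Hut node would also hold
   a centre of mass, a bounding box and up to 8 children. */
typedef struct Node {
    double mass;
    const struct Node *child[2];
} Node;

/* Per-thread work item: a shared read-only tree plus a private result. */
typedef struct {
    const Node *root;   /* shared by all threads, never written here */
    double total_mass;  /* written only by the owning thread */
} Task;

/* Read-only recursive traversal: sums the mass of every node. */
static double sum_mass(const Node *n) {
    if (n == NULL) return 0.0;
    return n->mass + sum_mass(n->child[0]) + sum_mass(n->child[1]);
}

static void *worker(void *arg) {
    Task *t = arg;
    t->total_mass = sum_mass(t->root);
    return NULL;
}

/* Starts nthreads (1..16) concurrent traversals of the same tree.
   Returns 0 on success, -1 on bad arguments or thread creation failure. */
int traverse_shared(const Node *root, Task *tasks, int nthreads) {
    pthread_t tid[16];
    int started = 0, rc = 0;

    if (nthreads < 1 || nthreads > 16) return -1;

    for (int i = 0; i < nthreads; i++) {
        tasks[i].root = root;        /* same pointer for everyone */
        tasks[i].total_mass = 0.0;
        if (pthread_create(&tid[i], NULL, worker, &tasks[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    /* Join only the threads that were actually started. */
    for (int i = 0; i < started; i++) {
        int jr = pthread_join(tid[i], NULL);
        assert(jr == 0);
        (void)jr;
    }
    return rc;
}
```

Build the tree completely before calling `traverse_shared`, then read each `tasks[i].total_mass` after it returns. With GCC or Clang you would typically compile with `-pthread`. If you rebuild the tree every time step, the rebuild has to finish before the worker threads start reading again.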