How do you know what caused the bad performance? Guessing won't tell you; the only reliable way is to do some kind of profiling.
How do you handle locking for the parent collection, or is it constant?
Maybe you need to add some debug output to see what really happens?
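
To make the profiling suggestion concrete, here is a minimal sketch in Python, assuming your code can be wrapped in a single function call. The names `workload`, `slow_lookup`, and `profile_call` are made up for illustration and stand in for whatever your real code does; only `cProfile`, `pstats`, `io`, and `time` are real standard-library modules.

```python
import cProfile
import io
import pstats
import time


def slow_lookup(items, key):
    # Hypothetical hot spot: a linear scan over a list.
    return key in items


def workload():
    # Hypothetical stand-in for the code that performs badly.
    items = list(range(2000))
    hits = 0
    for key in range(2000):
        if slow_lookup(items, key):
            hits += 1
    return hits


def profile_call(func):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the five entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    start = time.perf_counter()
    result, report = profile_call(workload)
    elapsed = time.perf_counter() - start
    # Simple debug output: what was computed and how long it took.
    print(f"workload returned {result} in {elapsed:.3f}s")
    print(report)
```

The report lists where the time actually goes, so you can check whether the locking or something else entirely is the bottleneck instead of guessing. If you are not using Python, your platform almost certainly has an equivalent profiler.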