By the way, the slowdown you're seeing in classes that use vector also occurs with standard types like int. Here's a multithreaded example:
#include
#include
#include
The code's behavior shows that instantiating the vector is the longest part of the run. Once you get past that bottleneck, the rest of the code runs extremely fast. This is true no matter how many threads you are running on.
By the way, ignore the absolutely insane number of includes. I have been using this code to test things for a project, so the number of includes keeps growing.
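If you want to check the vector instantiation claim on your own machine, a minimal sketch along these lines times the two phases separately in each thread. It is a stripped-down illustration, not the test program above: the names `PhaseTimes`, `time_vector_phases`, and `run_on_threads` are made up for this example, and summing the elements is just a stand-in for "the rest of the code".

```cpp
#include <chrono>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Timings for one worker thread, in milliseconds (hypothetical helper type).
struct PhaseTimes {
    double construct_ms = 0.0;  // time spent creating the vector
    double work_ms = 0.0;       // time spent summing its elements
    long long sum = 0;          // kept so the work cannot be optimized away
};

// Build a vector<int> of n elements, then sum it, timing each phase.
PhaseTimes time_vector_phases(std::size_t n) {
    using clock = std::chrono::steady_clock;
    using ms = std::chrono::duration<double, std::milli>;
    PhaseTimes t;

    auto t0 = clock::now();
    std::vector<int> v(n, 1);  // allocation plus initialization of n ints
    auto t1 = clock::now();
    t.sum = std::accumulate(v.begin(), v.end(), 0LL);  // stand-in workload
    auto t2 = clock::now();

    t.construct_ms = ms(t1 - t0).count();
    t.work_ms = ms(t2 - t1).count();
    return t;
}

// Run the measurement on num_threads threads at once. Each thread writes
// only to its own slot in results, so no locking is needed.
std::vector<PhaseTimes> run_on_threads(unsigned num_threads, std::size_t n) {
    std::vector<PhaseTimes> results(num_threads);
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (unsigned i = 0; i < num_threads; ++i) {
        workers.emplace_back([&results, i, n] {
            results[i] = time_vector_phases(n);
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return results;
}
```

Call `run_on_threads` with a few different thread counts and a large `n`, then compare `construct_ms` against `work_ms` for each entry. The numbers depend heavily on the allocator, the optimization level, and the element count, so treat them as a rough check rather than a benchmark. With GCC or Clang you will probably need `-pthread` when compiling.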