Is it always guaranteed that a multi-threaded application would run faster than a single threaded application?
I have two threads that populates data from a data so
Context switching overhead is not an issue until you have hundred of threads. Problem of context-switching is overestimated often (run task manager and notify how many threads are already launched). Spikes you observe rely on network communications that are rather unstable compared to local cpu computations.
I'd suggest to write scalable applications in SEDA (Staged Event Driven Architecture) when system is composed from several (5-15) components and each component has it's own message queue with bounded thread pool. You can tune size of pools and even apply algorithms that change thread pool sizes in order to make some components more productive than others (as all components share the same CPUs). You can tune size of pools for specific hardware that make SEDA applications extremely tunable.