I am trying to improve the performance of the threaded application with real-time deadlines. It is running on Windows Mobile and written in C / C++. I have a suspicion that
You can't estimate it. You need to measure it. And it's going to vary depending on the processor in the device.
There are two fairly simple ways to measure a context switch. One involves code, the other doesn't.
First, the code way (pseudocode):
DWORD tick;
main()
{
HANDLE hThread = CreateThread(..., ThreadProc, CREATE_SUSPENDED, ...);
tick = QueryPerformanceCounter();
CeSetThreadPriority(hThread, 10); // real high
ResumeThread(hThread);
Sleep(10);
}
ThreadProc()
{
tick = QueryPerformanceCounter() - tick;
RETAILMSG(TRUE, (_T("ET: %i\r\n"), tick));
}
Obviously doing it in a loop and averaging will be better. Keep in mind that this doesn't just measure the context switch. You're also measuring the call to ResumeThread and there's no guarantee the scheduler is going to immediately switch to your other thread (though the priority of 10 should help increase the odds that it will).
You can get a more accurate measurement with CeLog by hooking into scheduler events, but it's far from simple to do and not very well documented. If you really want to go that route, Sue Loh has several blogs on it that a search engine can find.
The non-code route would be to use Remote Kernel Tracker. Install eVC 4.0 or the eval version of Platform Builder to get it. It will give a graphical display of everything the kernel is doing and you can directly measure a thread context switch with the provided cursor capabilities. Again, I'm certain Sue has a blog entry on using Kernel Tracker as well.
All that said, you're going to find that CE intra-process thread context switches are really, really fast. It's the process switches that are expensive, as it requires swapping the active process in RAM and then doing the migration.