Why one non-voluntary context switch per second?

gby

This is a guess, but an educated one: since you use an isolated CPU, the scheduler does not schedule any task except your own on it, with one exception - the vmstat code in the kernel has a timer that schedules a single work queue item on each CPU once per second to calculate memory usage statistics, and that is what you are seeing get scheduled each second.

The work queue code is smart enough not to schedule the work queue kernel thread when the core is 100% idle, but not when it is running a single task.

You can verify this using ftrace. If the sched_switch tracer shows that the entity you switch to roughly once per second (the interval is rounded to the nearest jiffy and the timer does not run while the CPU is idle, so the timing may be skewed) is the events/CPU_NUMBER task (or keventd on older kernels), then it is almost certain that the cause is indeed the vmstat_update function setting its timer to queue a work queue item every second, which the events kernel thread runs.
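If it is more convenient than the tracefs shell interface, here is a minimal sketch of that check done from C. It assumes tracefs is mounted at the usual /sys/kernel/debug/tracing path and that you run it as root; on some systems it is mounted at /sys/kernel/tracing instead.

#include <stdio.h>

static int write_str(const char *path, const char *val)
{
    FILE *f = fopen(path, "w");
    if (!f) { perror(path); return -1; }
    fputs(val, f);
    fclose(f);
    return 0;
}

int main(void)
{
    char line[512];
    FILE *pipe;

    /* enable only the sched_switch tracepoint and turn tracing on */
    write_str("/sys/kernel/debug/tracing/events/sched/sched_switch/enable", "1");
    write_str("/sys/kernel/debug/tracing/tracing_on", "1");

    /* stream the events; look for an events/N or kworker/N task showing
     * up about once per second on your isolated CPU */
    pipe = fopen("/sys/kernel/debug/tracing/trace_pipe", "r");
    if (!pipe) {
        perror("trace_pipe");
        return 1;
    }
    while (fgets(line, sizeof(line), pipe))
        fputs(line, stdout);
    return 0;
}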

Note that the interval at which vmstat sets its timer is configurable - you can set it to another value via the vm.stat_interval sysctl knob. Increasing this value gives you a lower rate of such interruptions, at the cost of less accurate memory usage statistics.
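For example, here is a minimal sketch of raising the interval from C by writing to the /proc counterpart of that sysctl; the value 60 is only an illustration, and the write needs root.

#include <stdio.h>

int main(void)
{
    /* vm.stat_interval is exposed as /proc/sys/vm/stat_interval;
     * writing a larger number of seconds makes the vmstat work queue
     * item fire less often */
    FILE *f = fopen("/proc/sys/vm/stat_interval", "w");
    if (!f) {
        perror("stat_interval");
        return 1;
    }
    fprintf(f, "60\n");
    fclose(f);
    return 0;
}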

I maintain a wiki with all the sources of interruptions to isolated-CPU workloads here. I also have a patch in the works that makes vmstat skip scheduling the work queue item if nothing has changed between one vmstat work queue run and the next - as would happen if your single task on the CPU does not use any dynamic memory allocation. I am not sure it will benefit you, though - it depends on your workload.

I strongly suggest you try to optimize the code itself so that when it's running on a CPU, you get the maximum out of it.
Anyhow, I am not sure this will work, but give it a try anyway and let us know:

What I would basically do is set the scheduling policy to SCHED_FIFO and then give the process the maximum priority possible, as in the snippet below.

#include <sched.h>
#include <stdio.h>   /* for perror() */

struct sched_param sp;
int ret;

/* run the calling process (pid 0) as SCHED_FIFO at the highest priority */
sp.sched_priority = sched_get_priority_max(SCHED_FIFO);
ret = sched_setscheduler(0, SCHED_FIFO, &sp);
if (ret == -1) {
  perror("sched_setscheduler");
  return 1;
}

Please keep in mind that any blocking call your process makes is most likely going to cause the scheduler to take it off the CPU.

Source
Man page
EDIT:
Sorry, just noticed the pthread tag; the concept still holds so check out this man page: http://www.kernel.org/doc/man-pages/online/pages/man3/pthread_setschedparam.3.html
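For the pthread case, here is a minimal sketch along the same lines; the spinning worker thread is just a placeholder for whatever your CPU-bound thread actually does.

#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* placeholder worker - stands in for your busy loop on the isolated CPU */
static void *worker(void *arg)
{
    (void)arg;
    for (;;)
        ;
    return NULL;
}

int main(void)
{
    pthread_t tid;
    struct sched_param sp;
    int ret;

    pthread_create(&tid, NULL, worker, NULL);

    /* switch that thread to SCHED_FIFO at the maximum priority */
    sp.sched_priority = sched_get_priority_max(SCHED_FIFO);
    ret = pthread_setschedparam(tid, SCHED_FIFO, &sp);
    if (ret != 0)   /* returns an errno value rather than setting errno */
        fprintf(stderr, "pthread_setschedparam: %s\n", strerror(ret));

    pthread_join(tid, NULL);
    return 0;
}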

If one interrupt per second on your dedicated CPU is still too much, then you really need to avoid going through the normal scheduler at all. May I suggest the real-time and isochronous priority levels, which can keep your process scheduled more reliably than the usual pre-emptive mechanisms?
