How to run processes piped with bash on multiple cores?

Posted by 不羁岁月 on 2019-12-03 05:59:21

Question


I have a simple bash script that pipes the output of one process to another, namely:

dostuff | filterstuff

It happens that on my Linux system (openSUSE, if it matters, kernel 2.6.27) both of these processes run on a single core. Running different processes on different cores is the default policy, but it doesn't seem to be triggered in this case.

What component of the system is responsible for that, and what should I do to make use of multiple cores?

Note that there's no such problem on the 2.6.30 kernel.

Clarification: Following Dennis Williamson's advice, I verified with top that the piped processes are indeed always run on the same processor. The Linux scheduler, which usually does a really good job, doesn't do so this time.

I figure that something in bash prevents the OS from doing it. The thing is that I need a portable solution for both multi-core and single-core machines. The taskset solution proposed by Dennis Williamson won't work on single-core machines. Currently I'm using:

dostuff | taskset -c 0 filterstuff 

but this seems like a dirty hack. Could anyone provide a better solution?


Answer 1:


Suppose dostuff is running on one CPU. It writes data into a pipe, and that data will be in cache on that CPU. Because filterstuff is reading from that pipe, the scheduler decides to run it on the same CPU, so that its input data is already in cache.

If your kernel is built with CONFIG_SCHED_DEBUG=y,

# echo NO_SYNC_WAKEUPS > /sys/kernel/debug/sched_features

should disable this class of heuristics. (See /usr/src/linux/kernel/sched_features.h and /proc/sys/kernel/sched_* for other scheduler tunables.)
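Before changing anything, it can help to look at which features are currently enabled. The following is a minimal sketch, not part of the original answer: the function name show_sched_features is made up for illustration, and the second path is an assumption for newer kernels, which as far as I know moved the file under a sched/ directory in debugfs. Reading it normally requires root and a mounted debugfs.

```shell
#!/bin/bash
# Print the scheduler feature flags if the debugfs file is readable.
# Only the first path comes from the answer; the second one is an
# assumption about where newer kernels keep the same file.
show_sched_features() {
  local f
  for f in /sys/kernel/debug/sched_features /sys/kernel/debug/sched/features; do
    if [ -r "$f" ]; then
      echo "$f:"
      cat "$f"
      return 0
    fi
  done
  # Nothing found: explain the likely reasons instead of failing.
  echo "sched_features not readable: needs root, a mounted debugfs and CONFIG_SCHED_DEBUG=y"
}

show_sched_features
```

Features that are switched off are listed with a NO_ prefix, which is the same spelling used when writing to the file.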

If that helps, and the problem still happens with a newer kernel, and it's really faster to run on separate CPUs than one CPU, please report the problem to the Linux Kernel Mailing List so that they can adjust their heuristics.
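One way to check whether separate CPUs really are faster is to time the same pipeline pinned both ways. This is only a sketch under assumptions: bench is a hypothetical helper, seq and grep stand in for dostuff and filterstuff, and taskset (from util-linux) must be installed. If a CPU cannot be used, the run is skipped rather than failing.

```shell
#!/bin/bash
# Time a stand-in pipeline with the producer pinned to CPU $1
# and the consumer pinned to CPU $2.
bench() {
  local a=$1 b=$2
  # Probe first: taskset fails if it is missing or the CPU is not usable.
  if ! command -v taskset >/dev/null 2>&1 ||
     ! taskset -c "$a" true 2>/dev/null ||
     ! taskset -c "$b" true 2>/dev/null; then
    echo "cannot pin to CPUs $a and $b, skipping"
    return 0
  fi
  echo "producer on CPU $a, consumer on CPU $b:"
  # bash's time keyword reports on stderr for the whole pipeline.
  time { taskset -c "$a" seq 1 2000000 | taskset -c "$b" grep -c 7 >/dev/null; }
}

bench 0 0   # both processes on the same CPU
bench 0 1   # one process per CPU (skipped on a single-core machine)
```

Replace the seq and grep commands with the real workload; a toy pipeline like this one may be too short to show a meaningful difference.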




Answer 2:


Give this a try to set the CPU (processor) affinity:

taskset -c 0 dostuff | taskset -c 1 filterstuff
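Since the question asks for something that also works on single-core machines, here is one possible sketch of a wrapper that pins a command only when that is actually possible. The name pin is hypothetical, and seq and grep are stand-ins for dostuff and filterstuff. It assumes that taskset exits with an error when asked for a CPU that is not available, and falls back to running the command unpinned in that case.

```shell
#!/bin/bash
# pin CPU command [args...]
# Run the command pinned to the given CPU if taskset exists and the CPU
# is usable; otherwise run it unpinned so the script still works.
pin() {
  local cpu=$1
  shift
  if command -v taskset >/dev/null 2>&1 && taskset -c "$cpu" true 2>/dev/null; then
    taskset -c "$cpu" "$@"
  else
    "$@"
  fi
}

# Stand-in for: pin 0 dostuff | pin 1 filterstuff
pin 0 seq 1 5 | pin 1 grep 3
```

The probe costs one extra process start per call, which is negligible for long-running pipeline stages.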

Edit:

Try this experiment:

  • create a file called proctest with the following contents, then make it executable with chmod +x proctest:

    #!/bin/bash
    while true
    do
      ps
      sleep 2
    done  
    
  • start this running:

    ./proctest | grep bash
    
  • in another terminal, start top - make sure it's sorting by %CPU
  • let it settle for several seconds, then quit
  • issue the command ps u
  • start top -p with a list of the PIDs of the highest several processes, say 8 of them, from the list left on-screen by the exited top plus the ones for proctest and grep which were listed by ps - all separated by commas, like so (the order doesn't matter):

    top -p 1234,1255,1211,1212,1270,1275,1261,1250,16521,16522
    
  • add the processor field - press f then j then Space
  • set the sort to PID - press Shift+F then a then Space
  • optional: press Shift+H to turn on thread view
  • optional: press d and type .09 and press Enter to set a short delay time
  • now watch as processes move from processor to processor; you should see proctest and grep bounce around, sometimes on the same processor, sometimes on different ones
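The same observation can be scripted without top. The sketch below is an addition, not part of the original answer: cpu_of is a hypothetical helper that reads the "processor" field (field 39) from /proc/PID/stat, so it is Linux-specific. Where procps is installed, ps -o pid,psr,comm should report the same number in its psr column.

```shell
#!/bin/bash
# Print the CPU a process last ran on, read from /proc/PID/stat (Linux only).
cpu_of() {
  local stat fields
  read -r stat < "/proc/$1/stat" || return 1
  stat=${stat##*) }              # drop "pid (comm) "; comm may contain spaces
  read -r -a fields <<< "$stat"  # fields[0] is now field 3 (state)
  echo "${fields[36]}"           # field 39 of the full line: processor
}

# Sample the current shell a few times; pass the PIDs of proctest and
# grep instead to repeat the experiment above.
for _ in 1 2 3; do
  echo "PID $$ is on CPU $(cpu_of $$)"
  sleep 1
done
```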



Answer 3:


The Linux scheduler is designed to give maximum throughput, not to do what you imagine is best. If you're running processes which are connected with a pipe, in all likelihood one of them is blocking the other, and then they swap over. Running them on separate cores would achieve little or nothing, so the scheduler doesn't.

If you have two tasks which are both genuinely ready to run on the CPU, I'd expect to see them scheduled on different cores (at some point).

My guess is that dostuff runs until the pipe buffer becomes full, at which point it can't run any more, so filterstuff runs. But filterstuff runs for such a short time that dostuff doesn't get rescheduled until filterstuff has finished filtering the entire pipe buffer, at which point dostuff gets scheduled again.



Source: https://stackoverflow.com/questions/1398588/how-to-run-processes-piped-with-bash-on-multiple-cores
