C Shell hanging when dealing with piping

Submitted by 痴心易碎 on 2019-12-02 08:15:08

You aren't closing enough file descriptors in the children (or, in this case, in the parent).

Rule of thumb: If you dup2() one end of a pipe to standard input or standard output, close both of the original file descriptors returned by pipe() as soon as possible. In particular, you should close them before using any of the exec*() family of functions.

The rule also applies if you duplicate the descriptors with either dup() or fcntl() with F_DUPFD.
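As a minimal sketch of the rule of thumb, assuming a single pipe and an echo program on the PATH, the function below runs echo with its standard output connected to a pipe and reads the result in the parent. The name capture_echo and the exit codes 126 and 127 are illustrative choices, not something from your code.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `echo <word>` with stdout connected to a pipe,
 * copy what it prints into buf (NUL-terminated), and return the number
 * of bytes read, or -1 if the pipe or fork could not be set up. */
ssize_t capture_echo(const char *word, char *buf, size_t bufsize)
{
    assert(buf != NULL && bufsize > 0);

    int p[2];
    if (pipe(p) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: make the write end of the pipe standard output... */
        if (dup2(p[1], STDOUT_FILENO) < 0)
            _exit(126);
        /* ...then close BOTH original descriptors before exec. */
        close(p[0]);
        close(p[1]);
        execlp("echo", "echo", word, (char *)NULL);
        perror("execlp");           /* only reached if exec failed */
        _exit(127);
    }

    /* Parent: close the write end, or read() below never sees EOF. */
    close(p[1]);
    size_t total = 0;
    ssize_t n;
    while (total < bufsize - 1 &&
           (n = read(p[0], buf + total, bufsize - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(p[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

If the parent kept p[1] open, the read loop would block forever after echo exits, because the pipe would still have a writer. That is the same kind of hang you are seeing.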

In your code, you create all the pipes before you fork any children; therefore, each child needs to close all the pipe file descriptors after duplicating the one or two that it is going to use for input or output.

The parent process must also close all the pipe descriptors.

Also, the parent should not wait for children to complete until after launching all the children. In general, children will block with full pipe buffers if you make them run sequentially. You also defeat the benefits of parallelism. Note, however, that the parent must keep the pipes open until it has launched all the children — it must not close them after it launches each child.

For your code, the outline operation should be:

  • Create N pipes
  • For each of N (or N+1) children:
    1. Fork.
    2. Child duplicates standard input and output pipes
    3. Child closes all of the pipe file descriptors
    4. Child executes process (and reports error and exits if it fails)
    5. Parent records child PID.
    6. Parent goes on to next iteration; no waiting, no closing.
  • Parent now closes N pipes.
  • Parent now waits for the appropriate children to die.
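The outline above can be sketched roughly as follows. This is one possible arrangement under stated assumptions, not a drop-in replacement for your shell: the function name run_pipeline, the MAX_CMDS limit, and the exit codes 126 and 127 are all invented for illustration, and it waits for every child but reports only the last command's exit status.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { MAX_CMDS = 16 };     /* arbitrary limit for this sketch */

/* Hypothetical helper: run cmds[0] | cmds[1] | ... | cmds[ncmds-1].
 * Each cmds[j] is a NULL-terminated argv array. Returns the exit status
 * of the last command, or -1 on setup failure or abnormal termination. */
int run_pipeline(char **cmds[], int ncmds)
{
    assert(ncmds >= 1 && ncmds <= MAX_CMDS);
    int npipes = ncmds - 1;
    int fd[2 * MAX_CMDS];
    pid_t pids[MAX_CMDS];

    /* Create all pipes up front: fd[2*k] reads, fd[2*k+1] writes. */
    for (int k = 0; k < npipes; k++) {
        if (pipe(&fd[2 * k]) != 0) {
            perror("pipe");
            for (int i = 0; i < 2 * k; i++)
                close(fd[i]);
            return -1;
        }
    }

    int launched;
    for (launched = 0; launched < ncmds; launched++) {
        int j = launched;
        pids[j] = fork();
        if (pids[j] < 0) {
            perror("fork");
            break;
        }
        if (pids[j] == 0) {
            /* Child: duplicate the one or two descriptors it needs... */
            if (j > 0 && dup2(fd[2 * (j - 1)], STDIN_FILENO) < 0)
                _exit(126);
            if (j < npipes && dup2(fd[2 * j + 1], STDOUT_FILENO) < 0)
                _exit(126);
            /* ...then close every pipe descriptor before exec. */
            for (int i = 0; i < 2 * npipes; i++)
                close(fd[i]);
            execvp(cmds[j][0], cmds[j]);
            perror(cmds[j][0]);
            _exit(127);
        }
        /* Parent: PID recorded; no waiting and no closing yet. */
    }

    /* Parent closes all the pipes only after launching every child. */
    for (int i = 0; i < 2 * npipes; i++)
        close(fd[i]);

    /* Only now does the parent wait. */
    int last = -1;
    for (int j = 0; j < launched; j++) {
        int status;
        if (waitpid(pids[j], &status, 0) == pids[j] &&
            j == ncmds - 1 && WIFEXITED(status))
            last = WEXITSTATUS(status);
    }
    return (launched == ncmds) ? last : -1;
}
```

For example, with argv arrays for echo hello, cat, and grep -q hello, run_pipeline(cmds, 3) should return 0 because grep finds a match.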

There are other ways of organizing this, of greater or lesser complexity. The alternatives typically avoid opening all the pipes up front, which reduces the number of pipes to be closed.

'Appropriate children' means there are various ways of deciding when a pipeline (sequence of commands connected by pipes) is 'done'.

  • One option is to wait for the last command in the sequence to exit. This is the traditional way to do it, and it has a further advantage: the parent process can launch only the last child, which launches its predecessor in the pipeline, and so on back to the first process. In this scenario, the parent never creates a pipe, so it has no pipes to close. It also has only one child to wait for; the other processes in the pipeline are descendants of that one child.
  • Another option is to wait for all the processes to die(1). This is more or less what Bash does. It lets Bash know the exit status of each element of the pipeline, which the first option does not permit; that matters for set -o pipefail and the PIPESTATUS array.
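To illustrate the second option, here is a hedged sketch of waiting for every child and keeping a per-process status, loosely in the spirit of PIPESTATUS and pipefail. The function names are hypothetical, and the 128 + signal number encoding is the usual shell convention rather than anything your code requires.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: wait for every child in pids[0..n-1] and store a
 * shell-style status for each one in statuses[]. Returns the number of
 * children successfully reaped. */
int collect_statuses(const pid_t pids[], int n, int statuses[])
{
    assert(n >= 0);
    int reaped = 0;
    for (int i = 0; i < n; i++) {
        int status = 0;
        statuses[i] = -1;                   /* unknown until reaped */
        if (waitpid(pids[i], &status, 0) != pids[i])
            continue;
        if (WIFEXITED(status))
            statuses[i] = WEXITSTATUS(status);
        else if (WIFSIGNALED(status))
            statuses[i] = 128 + WTERMSIG(status);   /* shell convention */
        reaped++;
    }
    return reaped;
}

/* Hypothetical helper: pipefail-like result, i.e. the status of the
 * rightmost command that did not return 0, or 0 if all succeeded. */
int pipefail_status(const int statuses[], int n)
{
    for (int i = n - 1; i >= 0; i--)
        if (statuses[i] != 0)
            return statuses[i];
    return 0;
}
```

Waiting on specific PIDs in pipeline order is a simplification; the children may exit in any order, but waitpid() on each PID still collects every status once that child has died.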

Can you help me understand why the dup2 statement for the middle pipes is dup2(fd[(2*j)+1], fileno(stdout)) and dup2(fd[2*(j-1)], fileno(stdin))? I got it off Google and it works, but I'm unsure why.

  • fileno(stdout) is 1.
  • fileno(stdin) is 0.
  • The read end of a pipe is element 0 of the two-element array filled in by pipe() (analogous to standard input, which is file descriptor 0).
  • The write end of a pipe is element 1 of that array (analogous to standard output, which is file descriptor 1).
  • You have an array int fd[2*N]; for some value of N > 1, and you get a pair of file descriptors for each pipe.
  • For an integer k, fd[k*2+0] is the read descriptor of a pipe, and fd[k*2+1] is the write descriptor.
  • When j is neither 0 nor (N-1), you want it to read from the previous pipe and to write to its pipe:
    • fd[(2*j)+1] is the write descriptor of pipe j — which gets connected to stdout.
    • fd[2*(j-1)] is the read descriptor of pipe j-1 — which gets connected to stdin.
  • So, the two dup2() calls connect the correct pipe file descriptors to standard input and standard output of process j in the pipeline.
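The index arithmetic can be written out as small helpers, which may make the pattern easier to see. The names below are invented for illustration, and the sketch assumes nprocs processes joined by nprocs - 1 pipes stored in one flat array.

```c
#include <assert.h>

/* Indices into a flat array filled by calling pipe(&fd[2 * k]) for each k. */
int read_index(int k)  { return 2 * k; }        /* pipe k, read end  */
int write_index(int k) { return 2 * k + 1; }    /* pipe k, write end */

/* Hypothetical helper: for process j of nprocs, report which fd[] index
 * should become stdin and which should become stdout. An index of -1
 * means "leave the inherited descriptor alone". */
void stage_indices(int j, int nprocs, int *in_idx, int *out_idx)
{
    assert(nprocs >= 1 && j >= 0 && j < nprocs);
    /* Every process but the first reads from the previous pipe. */
    *in_idx = (j > 0) ? read_index(j - 1) : -1;
    /* Every process but the last writes to its own pipe. */
    *out_idx = (j < nprocs - 1) ? write_index(j) : -1;
}
```

With four processes, the middle process j = 2 gets in_idx 2, which is fd[2*(j-1)], and out_idx 5, which is fd[(2*j)+1], matching the two dup2() calls in your code.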

(1) There can be obscure scenarios where this leaves the parent hung indefinitely. I emphasize obscure; it requires something like a process that hangs around as a daemon without forking.
