What\'s the fastest, best way on modern Linux of achieving the same effect as a fork
-execve
combo from a large process ?
My proble
I've come across this blog post: http://blog.famzah.net/2009/11/20/a-much-faster-popen-and-system-implementation-for-linux/
pid = clone(fn, stack_aligned, CLONE_VM | SIGCHLD, arg);
Excerpt:
The system call clone() comes to the rescue. Using clone() we create a child process which has the following features:
- The child runs in the same memory space as the parent. This means that no memory structures are copied when the child process is created. As a result of this, any change to any non-stack variable made by the child is visible by the parent process. This is similar to threads, and therefore completely different from fork(), and also very dangerous – we don’t want the child to mess up the parent.
- The child starts from an entry function which is being called right after the child was created. This is like threads, and unlike fork().
- The child has a separate stack space which is similar to threads and fork(), but entirely different to vfork().
- The most important: This thread-like child process can call exec().
In a nutshell, by calling clone in the following way, we create a child process which is very similar to a thread but still can call exec():
However I think it may still be subject to the setuid problem:
http://ewontfix.com/7/ "setuid and vfork"
Now we get to the worst of it. Threads and vfork allow you to get in a situation where two processes are both sharing memory space and running at the same time. Now, what happens if another thread in the parent calls setuid (or any other privilege-affecting function)? You end up with two processes with different privilege levels running in a shared address space. And this is A Bad Thing.
Consider for example a multi-threaded server daemon, running initially as root, that’s using posix_spawn, implemented naively with vfork, to run an external command. It doesn’t care if this command runs as root or with low privileges, since it’s a fixed command line with fixed environment and can’t do anything harmful. (As a stupid example, let’s say it’s running date as an external command because the programmer couldn’t figure out how to use strftime.)
Since it doesn’t care, it calls setuid in another thread without any synchronization against running the external program, with the intent to drop down to a normal user and execute user-provided code (perhaps a script or dlopen-obtained module) as that user. Unfortunately, it just gave that user permission to mmap new code over top of the running posix_spawn code, or to change the strings posix_spawn is passing to exec in the child. Whoops.