Where does the OS store argv and argc when a child process is executed?

Posted by 时光怂恿深爱的人放手 on 2019-12-03 08:20:31

On Linux, at least on the architectures I've played with, the process starts with %esp pointing to something like:

argc | argv[0] | argv[1] | ... argv[argc - 1] | argv[argc] == NULL | envp[0] | envp[1] ... envp[?] == NULL

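The first function called is traditionally named _start. Its job is to compute argc = *(int *)%esp, argv = ((char **)%esp) + 1, and envp = ((char **)%esp) + argc + 2, then call main with the appropriate calling convention. (argc is the value stored at the top of the stack, not the stack pointer itself, and the offsets count pointer-sized slots, not bytes.)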

On x86, the arguments get passed on the stack.

On amd64, they get passed in registers %rdi, %rsi, and %rdx.

On mips, Google tells me there are several different calling conventions in use - including O32, N32, N64 - but all of them use $a0, $a1, $a2 first.

The process differs between operating systems, and even depends on how the new process is created. Since I'm more familiar with how modern Microsoft OSes handle this, I'll start there and refer to the *nix approach at the end.

When the [Microsoft] OS creates a process, it allocates a process environment block to hold data specific to that process. This includes, among other things, command line arguments with which the program was invoked. This process environment block is allocated out of the target process's address space, and a pointer to it is provided to the process's entry point. The process environment block for a child process is generally initialized by copying the parent process's environment block into the new process's address space - there's no direct sharing of memory involved.

In the case of a C-based program, the entry point is not the main() function that the programmer provides. Rather, it is a routine provided by the C runtime library that is responsible for initializing the runtime environment before handing control to the programmer's main().

There is a lot of stuff to initialize, but one of the aspects is setting up the argc and argv values. To do this, the runtime will consult the process environment block to find the program name and parameters with which it was invoked. It will then (typically) allocate values for argv out of the process heap (i.e., using something like malloc()), and assign argc to the number of params found (plus one, for the program name).

The actual values for argc and argv are passed the same way as any other parameters in C (on the stack or in registers, depending on the calling convention), because the call to main() by the C runtime is just a normal function call.

So, when the code you write inside main() in the child process accesses argv, it will be reading values out of its own process heap. The source of those values is the process environment block (stored by the OS in the local address space), which was originally initialized by copying the process environment block from the parent process.

On *nix platforms, things are quite a bit different. The primary difference for the present discussion is that *nix stores the command-line arguments for the new process directly in the stack space of the process's initial thread. (It also stores the environment variables there.) So on *nix, main is invoked with the argv parameter pointing to values stored in the stack itself.

You can glean some of this from the manpage for execve, and The Linux Programming Interface by Michael Kerrisk has a good description in section 6.4 that you might find excerpted online.
