C and resource protection in memory

When we compile a C program, the compiler just generates machine code. Judging from this question, that code can run directly on the hardware.

So my questions are:

If a C program can directly run on the hardware, how can the kernel handle the resource allocation for this program?
If the executable generated from the compiler is in pure machine-understandable form, then how do the privileged and non-privileged modes work?
How does the kernel manage the permission of hardware resources if a program can directly run on the hardware not through the kernel?

If a C program can directly run on the hardware, how can the kernel handle the resource allocation for this program?

The kernel is responsible for managing the resources of the computer as a whole, including resources like the hardware. This means that, for user-level applications to be able to access things like hardware devices, write to a terminal, or read a file, they have to ask the kernel for permission. This is done by using the system calls exposed by the OS, as mentioned by @Marcus.

However, I wouldn't say that the program runs directly on the hardware, because it does not interact with the hardware directly the way a kernel module or driver would. A client program sets up the arguments for a system call, then triggers a software interrupt that transfers control to the kernel, and waits until the kernel has serviced the request.

This is why OSes today are said to run in protected mode, as opposed to the old days when they ran in real mode and a program could, for example, mess around with hardware resources directly and potentially screw things up.

This distinction becomes very clear if you try writing a trivial "hello world" program in x86 assembly. I wrote and documented this one several years ago, reproduced below:

;
; This program runs in 32-bit protected mode.
;  build: nasm -f elf -F stabs name.asm
;  link:  ld -o name name.o          (on a 64-bit host: ld -m elf_i386 -o name name.o)
;
; Native 64-bit code (-f elf64) should not simply widen the registers: there the usual
; entry into the kernel is the syscall instruction, with different call numbers and registers.
;
section .data                       ; section for initialized data
msg:     db 'Hello world!', 0Ah     ; message string with a new-line char (10 decimal) at the end
msg_len: equ $ - msg                ; length of the string in bytes: this address ($ symbol)
                                    ; minus the string's start address

section .text                       ; this is the code section
global _start                       ; _start is the entry point and needs global scope to be 'seen'
                                    ; by the linker; it plays the role main() plays in C/C++
_start:                             ; definition of the _start procedure begins here
    mov eax, 4                      ; sys_write call number (from the kernel's system call table)
    mov ebx, 1                      ; file descriptor 1 is stdout; in GNU/Linux nearly everything
                                    ; is treated as a file, even hardware devices
    mov ecx, msg                    ; move the start _address_ of the message to the ecx register
    mov edx, msg_len                ; move the length of the message (in bytes)
    int 80h                         ; trap into the kernel to perform the system call just set up;
                                    ; in GNU/Linux, services are requested through the kernel
    mov eax, 1                      ; sys_exit call number (from the kernel's system call table)
    mov ebx, 0                      ; exit status for the OS (zero tells the OS everything went fine)
    int 80h                         ; trap into the kernel again, this time to exit

Notice how the program sets up the write system call, sys_write, and then specifies the file descriptor of where to write, being stdout, the string to write, and so on.

In other words, the program itself does not perform the write operation; it sets things up and asks the kernel to do it on its behalf by using a special interrupt, int 80h.
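For comparison, roughly the same request can be made from C. This is only a sketch, assuming Linux with glibc: `hello_via_syscall` is a name made up for this example, and `syscall()` is glibc's generic wrapper, which uses whatever entry instruction suits the platform rather than necessarily int 80h. Ordinary programs would just call `write()` or `printf()`.

```c
#define _GNU_SOURCE          /* exposes syscall() in <unistd.h> on glibc */
#include <sys/syscall.h>     /* SYS_write */
#include <unistd.h>          /* syscall(), STDOUT_FILENO */

/* Hypothetical helper: asks the kernel to write the message to stdout.
 * Returns the number of bytes written, or -1 if the kernel refused. */
long hello_via_syscall(void)
{
    static const char msg[] = "Hello world!\n";

    /* The same three arguments the assembly loads into ebx, ecx and edx:
     * file descriptor, start address, length in bytes. */
    return syscall(SYS_write, STDOUT_FILENO, msg, sizeof msg - 1);
}
```

Calling `hello_via_syscall()` from `main` prints the message; the C code never touches the terminal itself, it only hands the request to the kernel.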

A possible analogy here might be when you go to a restaurant. The server will take your order, but the chef is the one that will do the cooking. In this analogy, you are the user-level application, the server taking your food order is the system call, and the chef in the kitchen is the OS kernel.

If the executable generated from the gcc is in pure machine understandable form then how do the privileged and non-privileged mode work?

Building on the previous section: user-level programs always run in user mode. When a program needs access to something (e.g. writing to the terminal or reading a file), it sets things up, as in the sys_write example above, and asks the kernel to do the work on its behalf with an interrupt. The interrupt switches the CPU into kernel mode, where it remains until the kernel has finished servicing the request, which may include denying it altogether (e.g. an attempt to read a file the user has no permission to read).

Internally, it's the system call wrapper in the C library that issues the int 80h instruction (or a newer equivalent such as sysenter or syscall). User-level applications just see the wrapper function, which is the common interface between the client and the OS.
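The "denied request" case can be observed from C as an ordinary failed call. The sketch below assumes a POSIX system with a writable /tmp; the function name and the scratch file path are invented for illustration. It creates a file, removes all its permission bits, and then asks the kernel to open it for reading.

```c
#define _POSIX_C_SOURCE 200809L   /* for mkstemp() under strict -std= modes */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper. Returns the errno reported when opening a file that
 * has no permission bits set, 0 if the open unexpectedly succeeded (root
 * usually bypasses the check), or -1 if the scratch file could not be
 * prepared. */
int errno_from_unreadable_file(void)
{
    char path[] = "/tmp/no_read_XXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    close(fd);

    int result = -1;
    if (chmod(path, 0) == 0) {             /* strip every permission bit */
        fd = open(path, O_RDONLY);         /* the kernel decides here */
        if (fd == -1) {
            result = errno;
        } else {
            result = 0;
            close(fd);
        }
    }
    unlink(path);
    return result;
}
```

For an unprivileged user the expected result is `EACCES`: the program did run, but the kernel checked the request and refused it.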

How does the kernel manage the permission of hardware resources when a program can directly run on hardware not through the kernel?

If you followed the previous explanations, you can now see that the kernel acts as a gatekeeper and that programs "knock" on this gate by using the int 80h instruction.

Whilst the program is in machine code, to do anything not within its own memory region, it will need to call the kernel via syscalls.

The CPU actually has a notion of the privilege of code. Unprivileged code can't directly access physical memory and the like; it has to go through the OS and ask for access.

Hence, every program does run directly on the CPU, but that doesn't mean it can do anything it likes with the hardware: there are hardware measures against that. The privilege level required to do certain things is one of them.
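One way to see the hardware check in action is to execute a privileged instruction from user mode. The following is a sketch under several assumptions: an x86 Linux machine, GCC or Clang inline assembly, and a made-up function name. It runs `hlt` (which halts the CPU and is only allowed at the highest privilege level) in a child process and reports how the child ended.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper. Returns the number of the signal that killed the
 * child, 0 if the child exited on its own, or -1 on a non-x86 build or if
 * fork/waitpid failed. */
int signal_from_privileged_instruction(void)
{
#if defined(__x86_64__) || defined(__i386__)
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        __asm__ volatile("hlt");   /* privileged: the CPU faults in user mode */
        _exit(0);                  /* only reached if the CPU allowed it */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
#else
    return -1;                     /* the example is x86-specific */
#endif
}
```

On x86 Linux the child is normally killed by `SIGSEGV`: the CPU raises a general protection fault, and the kernel turns that into a signal. The machine code was executed directly by the CPU, yet the CPU itself refused the instruction.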

how can kernel handle the resource allocation to this program

The kernel provides functions and mechanisms to allocate memory, do I/O (write to the screen, interact with the network or sound card), etc. These are exposed to user programs as system calls. System calls are the interface between the kernel and user programs, and thus between the hardware and user programs.
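Memory allocation is a good example of such a request. As a minimal sketch (assuming Linux or another system that supports `MAP_ANONYMOUS`; the function name is invented here), a program can ask the kernel for one page with `mmap`, use it, and return it with `munmap`. Library allocators such as `malloc` ultimately obtain their memory through calls of this kind.

```c
#define _DEFAULT_SOURCE      /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: borrows one page from the kernel, writes to it,
 * and hands it back. Returns 0 on success, -1 if any step failed. */
int borrow_one_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    /* The kernel picks the address and records that this process owns it. */
    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAB, (size_t)page);                      /* plain memory now */
    int ok = ((unsigned char *)p)[page - 1] == 0xAB;

    /* Give the page back; munmap is evaluated before ok is checked. */
    return (munmap(p, (size_t)page) == 0 && ok) ? 0 : -1;
}
```

Between `mmap` and `munmap` the program reads and writes the page without involving the kernel; the kernel is only involved in granting and revoking the resource.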

how do the privileged and non-privileged mode work?

User programs run in Unprivileged Mode (userspace) while the kernel runs in Privileged Mode (kernelspace). A user program can't be trusted, so if it misbehaves (by accessing higher-privileged memory or dereferencing a null pointer, for example), it is stopped (by a segmentation fault followed by termination of the program, for example).

The kernel, on the other hand, runs in Privileged Mode. It can do whatever it wants: write to the memory of userspace programs, steal data (like passwords) from user programs, write to the processor's firmware, everything. Furthermore, there are different kinds of kernels; monolithic kernels and microkernels are the most widely used.

Linux (initiated by Linus Torvalds) is an example of a monolithic kernel. Here, the kernel is one big system in which every piece of kernel code has unrestricted access to the machine.

Minix (initiated by Andrew S. Tanenbaum) is an example of a microkernel. The part that can access everything is rather small: it contains only the functionality that has to be privileged (managing the MMU, accessing hardware, etc.). Other functionality, like filesystems, runs in Unprivileged Mode, where the rest of the system is protected from its bugs by the usual userspace protection mechanisms, such as segmentation faults.

An interesting read concerning the benefits and drawbacks of monolithic kernels and microkernels is the debate between Linus Torvalds (at that time some guy who created an OS) and Andrew S. Tanenbaum (at that time an established professor of CS; has written some amazing books, BTW).

a program can directly run on hardware not through the kernel

It does run directly on the hardware, in the sense that the CPU executes its instructions. It cannot access certain resources directly, though, like physical memory or devices; to use them it has to go through the kernel. That's one of the major improvements (next to maybe virtual processors, that is, processes) over earlier OSes like DOS: userspace programs cannot access the hardware directly. If they could, they could mess up the whole machine with irreparable consequences (intentionally, like viruses, or unintentionally). Instead, as mentioned at the beginning of this answer, system calls are used.

In DOS you had the option to use routines provided by the OS, commonly through a trap at interrupt vector 0x21 (an index into the Real Mode interrupt vector table), invoked via int 0x21 (int 21h) while the ah register contained a function number identifying the call to the system¹. Roughly the same mechanisms as nowadays were available, but not strictly enforced. One could overwrite the whole OS, replace it with one's own program, and destroy the machine (by loading random values into CMOS registers, for example). One could also just use the BIOS-provided routines, bypassing the OS.

¹ I use "call to the system" instead of "system call" intentionally here. Here, system calls only denote the requests from userspace to kernelspace to do something for it. As DOS (i.e., Real Mode) didn't provide a real distinction between userspace and kernelspace, it doesn't really have system calls.

So my first question is: if a C program can directly run on the hardware, how can the kernel handle the resource allocation for this program?

CPUs carry a notion of privileges when executing code. For example, on x86 there is a Real Mode where code is allowed to access any resource, and a Protected Mode where code executes in distinct security rings. Most Operating Systems will switch to Protected Mode, where numerically lower rings imply higher privileges.

The kernel typically executes in Ring 0, which gives direct access to the hardware, while user programs run in Ring 3, which restricts access. When a user program needs a privileged resource, control passes to the Operating System, which is privileged, either implicitly (through a fault or trap) or explicitly through a system call instruction (e.g. syscall in x86-64 assembly).

If the executable generated from the gcc is in pure machine understandable form then how do the privileged and non-privileged mode work?

Again, things like memory access are checked by the CPU. So, for example, if a program tries to access a virtual address that it doesn't have permission for, the CPU raises a page fault, the Operating System catches the invalid page access, and it generally signals the process (e.g. with SIGSEGV).
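This check can be triggered on purpose. The sketch below assumes Linux (other systems may deliver a different signal) and uses an invented function name. It maps a page with `PROT_NONE`, meaning no access is permitted, then lets a child process write to it and reports the signal that ended the child.

```c
#define _DEFAULT_SOURCE      /* exposes MAP_ANONYMOUS on glibc */
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper. Returns the signal that killed the child, 0 if the
 * child exited normally, or -1 if setup failed. */
int signal_from_forbidden_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    /* A valid mapping, but with no read, write or execute permission. */
    void *mem = mmap(NULL, (size_t)page, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        munmap(mem, (size_t)page);
        return -1;
    }
    if (pid == 0) {
        *(volatile char *)mem = 1;   /* the CPU raises a page fault here */
        _exit(0);                    /* only reached if the write went through */
    }

    int status;
    pid_t waited = waitpid(pid, &status, 0);
    munmap(mem, (size_t)page);
    if (waited == -1)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux the result is normally `SIGSEGV`. The store instruction is plain machine code, but the page tables set up by the kernel tell the CPU that this address is off limits, so the CPU faults and the kernel signals the process.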

How does the kernel manage the permission of hardware resources when a program can directly run on hardware not through the kernel?

The Operating System configures the CPU directly through specific control registers and tables. For example, on x86 the address of the page table used to translate virtual addresses is stored in the CR3 register.

Source: https://stackoverflow.com/questions/33265239/c-and-resource-protection-in-memory

Tags: operating-system, system-calls, kernel-mode, usermode