Because not all OSes use the whole return value. In this case, it's truncated to the low 8 bits. It's up to the OS (and other related components, such as the shell) to "use" and "retain" this value, and that is what's called an "implementation detail". In other words, the C and C++ standards do not dictate what the usefulness or meaning of the return value is [aside from the paragraph below], just that from the C perspective it is an int. The C program may be started in an environment where that value is ignored, truncated, extended or multiplied by 432, and it's still a valid C or C++ environment.
The C standard says that the value 0 or EXIT_SUCCESS (which is zero on the implementations I know of, although the standard does not actually require that) and the value EXIT_FAILURE (some other value, non-zero in practice) should be considered success and failure respectively. However, the result of all other values is "implementation defined" and thus under the rules of the OS/environment that the execution happens in.
Note that when the shell (or whatever runs your program) starts it, it does not jump straight to your main; some other code runs first to initialize things that are needed by your main function. And once main has returned/exited, there is usually some code that executes AFTER your main as well. Exactly how this works is dependent on several things:
- who wrote the compiler and/or runtime library
- what OS it is designed for
- what processor architecture it is
- potentially what shell/runtime/OS functionality started the process
The C and C++ standards don't define these things, as doing so would potentially restrict the actual platforms that can/will run C and/or C++ applications, and the goal of C and C++ is to "be inclusive", in other words, to try not to limit the environments, processors, etc. that the languages support.