Since syscall emulation is much easier to setup, I\'m wondering what are the advantages of using the full system emulation when running an userland program.
Or in ot
Advantages of SE:
sometimes easier to setup benchmarks, if all syscalls you need are implemented (see also, see also), and if you have just the right cross compiler, which of course no one has documented properly which one that is.
SE runs Dhrystone about 2x https://github.com/cirosantilli/linux-kernel-module-cheat/tree/00d282d912173b72c63c0a2cc893a97d45498da5#user-mode-vs-full-system-benchmark That benchmark makes no syscalls (except for information before / after the actual benchmark runs)
it is easier to get greater visibility and control of what the application is doing since the kernel is not running in parallel. E.g. stats will be just for the application, GDB will be just for the application: thread-aware gdb for the Linux kernel
Disadvantages of SE:
in practice, harder to setup benchmarks, because it is too fragile / has too many restrictions.
If your content does not work immediately out of the box, it is easier to just create or download a full system image and go for that instead, which is much more reliable.
Here is a sample minimal working Ubuntu setup if you are still interested: How to compile and run an executable in gem5 syscall emulation mode with se.py?
less representative, since no actual OS is running
no dynamic linking for ARM as of June 2018: How to run a dynamically linked executable syscall emulation mode se.py in gem5?
if you want to evaluate an accelerator like a GPU, you will have to create some slightly custom interface for it, since there is no kernel driver running on top the the kernel as usual.
Brandon has pointed out in his answer that this has in fact been done before: https://stackoverflow.com/a/56371006/9160762
So my recommendation is:
try SE first. If it works, great. If it doesn't, try to fix it quickly, since most problems are trivial. Having the SE setup will save you a lot of time over full system, and it is often representative enough.
otherwise, use FS mode. It is just simpler to setup, more representative, and the performance hit is acceptable for most.
You could also use SE first, and then go to FS to further validate only your most important SE results, since FS is slower and you can therefore validate less different setups.