One of the stated reasons for knowing assembler is that, on occasion, it can be employed to write code that will be more performant than writing that code in a higher-level
One of the posibilities to the CP/M-86 version of PolyPascal (sibling to Turbo Pascal) was to replace the "use-bios-to-output-characters-to-the-screen" facility with a machine language routine which in essense was given the x, and y, and the string to put there.
This allowed to update the screen much, much faster than before!
There was room in the binary to embed machine code (a few hundred bytes) and there was other stuff there too, so it was essential to squeeze as much as possible.
It turnes out that since the screen was 80x25 both coordinates could fit in a byte each, so both could fit in a two-byte word. This allowed to do the calculations needed in fewer bytes since a single add could manipulate both values simultaneously.
To my knowledge there is no C compilers which can merge multiple values in a register, do SIMD instructions on them and split them out again later (and I don't think the machine instructions will be shorter anyway).