I can understand this requirement for the old PPC RISC systems and even for x86-64, but for the old tried-and-true x86? In this case, the stack needs to be aligned on 4 byte
My guess is that Apple believes everyone just uses XCode (gcc) which aligns the stack for you. So requiring the stack to be aligned so the kernel doesn't have to is just a micro-optimization.