I can understand this requirement for the old PPC RISC systems and even for x86-64, but for the old tried-and-true x86? In this case, the stack needs to be aligned on a 4-byte boundary.
The requirement is there to maintain consistency in the kernel. This allows the same kernel to be booted on multiple architectures without modification.
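For anyone unsure what "aligned on an N-byte boundary" means in practice, here is a minimal sketch in C. The helper names `align_down` and `is_aligned` are made up for illustration and do not come from any kernel source. It assumes the alignment is a power of two and that the stack grows downward (as it does on x86), so an address is rounded down to reach the next aligned slot.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round addr down to a multiple of align.
 * Assumes align is a nonzero power of two (4, 8, 16, ...). */
static uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* Clearing the low bits leaves the nearest lower multiple of align. */
    return addr & ~(align - 1);
}

/* Hypothetical helper: nonzero if addr is a multiple of align.
 * Same power-of-two assumption as above. */
static int is_aligned(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}
```

With these, an address such as `0x1004` satisfies the 4-byte requirement but not a 16-byte one, which is why code that only guarantees 4-byte alignment can break a stricter convention.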