Register variables are a well-known way to get fast access (register int i). But why are registers on the top of hierarchy (registers, cache, main memory, secondary
Every microcontroller has a CPU as Bill mentioned, that has the basic components of ALU, some RAM as well as other forms of memory to assist with its operations. The RAM is what you are referring to as Main memory.
The ALU handles all of the arthimetic logical operations and to operate on any operands to perform these calculations, it loads the operands into registers, performs the operations on these, and then your program accesses the stored result in these registers directly or indirectly.
Since registers are closest to the heart of the CPU (a.k.a the brain of your processor), they are higher up in the chain and ofcourse operations performed directly on registers take the least amount of clock cycles.