I have read about using the C volatile keyword for memory-mapped hardware registers, ISRs, and multithreaded programs.
1) register
uint8_t volat
Is it because the compiler, by design, has no notion of an "asynchronous call" (in the case of an ISR) or of multithreading? But this can't be, right?
Yes, it is that way.
In C the compiler has no notion of concurrency, so it is allowed to reorder and cache memory accesses, as long as a single thread of execution can't observe the difference.
That's why you need volatile (which blocks this kind of optimization for one variable), memory barriers (which block it at a single point of the program for all variables), or other forms of synchronization such as locking (which typically acts as a memory barrier).
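To make the ISR case concrete, here is a minimal sketch that uses a standard C signal handler as a stand-in for an interrupt routine. The names `data_ready`, `on_signal` and `wait_for_event` are made up for illustration, and `raise()` is only used so the example runs on a hosted system; on real hardware the handler would fire asynchronously.

```c
#include <signal.h>

/* Written by the handler, read by the normal program flow.
 * volatile forces a real load on every loop iteration;
 * sig_atomic_t makes each access indivisible with respect to the handler. */
static volatile sig_atomic_t data_ready = 0;

/* Stand-in for an ISR: it only sets the flag. */
static void on_signal(int signo)
{
    (void)signo;
    data_ready = 1;
}

/* Returns 1 once the flag has been observed, 0 if the handler
 * could not be installed or the signal could not be raised. */
int wait_for_event(void)
{
    data_ready = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return 0;

    /* Simulate the "interrupt"; raise() runs the handler before returning. */
    if (raise(SIGINT) != 0)
        return 0;

    /* Without volatile, the compiler may assume nothing in this loop
     * can change data_ready, read it once, and spin forever. */
    while (!data_ready) {
        /* busy-wait */
    }
    return 1;
}
```

Calling `wait_for_event()` from `main` returns 1 here. Note that volatile only constrains accesses to this one variable; if the handler also filled a buffer, the ordering between the buffer writes and the flag would still need a barrier or another form of synchronization, as described above.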