The problem came up when I looked at the Wikipedia page on Matrix multiplication algorithm.
It says:
This algorithm has a critical path length of Θ(log² n) steps, meaning it takes that much time on an ideal machine with an infinite number of processors.
"Infinite number of processors" is perhaps a poor way of phrasing it.
When people study parallel computation from a theoretical viewpoint, they basically want to ask: "assuming I have more processors than I need, how fast can I possibly do it?"
It's a well-defined question -- just because you have a huge number of processors doesn't mean matrix multiplication is O(1).
Suppose you take any naive algorithm for matrix multiplication on a single processor. Then I tell you that you can have one processor for every single assembly instruction if you like, so the program can be "parallelized" in the sense that each processor performs only a single instruction and then shares its result with the next.
The time of that computation is not "1" cycle, because some of the processors have to wait for other processors to finish, those processors are in turn waiting on still other processors, and so on. The length of the longest such chain of dependencies is exactly the critical path in the Wikipedia quote.
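To make the waiting concrete, here is a small sketch (my own illustration, not from the post): it counts the depth of the dependency chain for naive n×n matrix multiplication under two schedules, one where each entry's n products are summed one after another, and one where they are summed in a balanced binary tree. Even with one processor per instruction, the parallel time is the depth of the longest chain, not 1.

import math

def critical_path_naive_matmul(n: int, tree_sum: bool) -> int:
    """Depth of the dependency DAG for naive n x n matrix multiplication,
    i.e. the parallel time with one processor per instruction."""
    product_depth = 1  # all n^3 products a[i][k]*b[k][j] run in parallel
    if tree_sum:
        # each C[i][j] sums its n products in a balanced binary tree;
        # additions on the same level of the tree run in parallel
        add_depth = math.ceil(math.log2(n))
    else:
        # a running sum: each addition waits for the previous one
        add_depth = n - 1
    return product_depth + add_depth

# Even with n^3 processors, neither schedule finishes in O(1) time:
for n in (4, 64, 1024):
    print(f"n={n:4d}  chained sum: {critical_path_naive_matmul(n, False):4d} steps"
          f"  tree sum: {critical_path_naive_matmul(n, True):2d} steps")

The balanced tree is what brings the summation depth down from linear to logarithmic; the divide-and-conquer algorithm in the Wikipedia quote has a longer critical path, Θ(log² n), because of its recursive structure.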
Generally speaking, nontrivial problems (problems in which none of the input bits are irrelevant) require time Ω(log n) in parallel computation; otherwise the "answer" processor at the very end doesn't even have time to depend on all of the input bits.
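To spell out the standard fan-in argument behind that bound (my own rendering, not from the original post): if each processor can combine at most two values per step, then after t steps its output can depend on at most 2^t input bits, so depending on all n inputs forces

$$2^t \ge n \quad\Longrightarrow\quad t \ge \log_2 n.$$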
Problems for which O(log n) parallel time is tight are said to be highly parallelizable. It is widely conjectured that some problems, even some in P, don't have this property. If that conjecture were false, then in terms of computational complexity theory, P would collapse to a lower class (essentially NC, the problems solvable in polylogarithmic parallel time with polynomially many processors), which it is conjectured not to do.