Multi-threaded GEMM slower than single threaded one?

半阙折子戏 2020-12-21 02:45

I wrote some naive multi-threaded GEMM code and I am wondering why it is much slower than the equivalent single-threaded GEMM code.

With a 200x200 matrix, Single Threaded: 7ms, Mu

3 answers
  •  自闭症患者
    2020-12-21 03:37

    Multi-threading always involves synchronization, context switching, and function calls. This all adds up and costs CPU cycles that you could otherwise spend on the main task itself.

    If you simply use a third nested loop, you avoid all of these steps and do the computation inline, instead of in a subroutine where you must set up a stack, call into it, switch to a different thread, return the result, and switch back to the main thread.

    Multi-threading is only useful if these costs are small compared to the main task. I would guess that you will see better results with multi-threading once the matrix is larger than just 200x200.
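
To make the trade-off concrete, here is a minimal C++ sketch. The question does not show its code, so everything here is an assumption: the names `Matrix`, `gemm_rows`, `gemm_single`, and `gemm_threaded` are hypothetical, and splitting the output rows into one contiguous block per thread is just one plausible way to parallelize. The single-threaded version runs the third nested loop inline. The threaded version pays for creating and joining each thread, which can dominate when the matrix is small.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Square n x n matrix stored row-major in a flat vector (illustrative choice).
using Matrix = std::vector<double>;

// Compute rows [row_begin, row_end) of C = A * B with a plain triple loop.
void gemm_rows(const Matrix& a, const Matrix& b, Matrix& c, std::size_t n,
               std::size_t row_begin, std::size_t row_end) {
    for (std::size_t i = row_begin; i < row_end; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}

// Single-threaded version: no thread creation, no joins, everything inline.
void gemm_single(const Matrix& a, const Matrix& b, Matrix& c, std::size_t n) {
    gemm_rows(a, b, c, n, 0, n);
}

// Threaded version: one thread per contiguous block of rows.
// Creating and joining each thread is a fixed cost paid on every call,
// no matter how little work each block contains.
void gemm_threaded(const Matrix& a, const Matrix& b, Matrix& c, std::size_t n,
                   unsigned num_threads) {
    assert(num_threads > 0);
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    // Rows per thread, rounded up so every row is covered.
    const std::size_t chunk = (n + num_threads - 1) / num_threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // more threads than rows: nothing left to do
        // Threads write disjoint rows of c, so no locking is needed here.
        workers.emplace_back([&a, &b, &c, n, begin, end] {
            gemm_rows(a, b, c, n, begin, end);
        });
    }
    for (auto& w : workers) w.join();
}
```

To compare the two, time each call with `std::chrono::steady_clock` at several sizes (for example 200, 1000, and 2000) and with `num_threads` set to `std::thread::hardware_concurrency()`. With GCC or Clang on Linux, the program typically needs the `-pthread` flag. If the original code starts a thread per row or per element rather than per block, the fixed per-thread cost is paid far more often, which would fit the slowdown described in the question.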
