Why is my GPU slower than CPU when training LSTM/RNN models?

爱一瞬间的悲伤 2020-12-29 02:52

My machine has the following spec:

CPU: Xeon E5-1620 v4

GPU: Titan X (Pascal)

Ubuntu 16.04

Nvidia driver 375.26

CUDA toolkit 8.0

4 Answers
  •  醉话见心
    2020-12-29 03:12

    I have run into similar issues here:

    Test 1

    CPU: Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz

    Ubuntu 14.04

    imdb_bidirectional_lstm.py: 155s

    Test 2

    GPU: GTX 860m

    Nvidia Driver: 369.30

    CUDA Toolkit: v8.0

    cuDNN: v6.0

    imdb_bidirectional_lstm.py: 450s

    Analysis

    When I observed the GPU load curve, I found something interesting:

    • For the LSTM, the GPU load jumps rapidly between ~80% and ~10%

    (figure: GPU load over time)
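
    One way to see this for yourself is to poll the GPU utilization while the script trains. A rough sketch (Python calling nvidia-smi, assuming it is on your PATH; stop it with Ctrl-C):

        import subprocess, time

        # Poll GPU utilization roughly once per second while training runs
        # in another terminal; this reproduces the load curve above.
        while True:
            out = subprocess.check_output(
                ['nvidia-smi', '--query-gpu=utilization.gpu',
                 '--format=csv,noheader,nounits'])
            print('GPU load: %s%%' % out.decode().strip())
            time.sleep(1)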

    This is mainly due to the sequential computation in the LSTM layer. Remember that an LSTM processes its input sequentially and computes its hidden states iteratively; in other words, you must wait for the hidden state at time t-1 before you can compute the hidden state at time t.
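
    To make that dependency concrete, here is a toy sketch (a plain RNN cell in NumPy rather than a full LSTM, with made-up sizes); the loop over time steps cannot be parallelized because each step reads the previous hidden state:

        import numpy as np

        T, input_dim, hidden_dim = 80, 128, 64        # toy sizes, for illustration only
        x = np.random.randn(T, input_dim)             # one input sequence
        W_x = np.random.randn(input_dim, hidden_dim)  # input-to-hidden weights
        W_h = np.random.randn(hidden_dim, hidden_dim) # hidden-to-hidden weights
        h = np.zeros(hidden_dim)                      # initial hidden state

        for t in range(T):
            # h at step t depends on h at step t-1, so the GPU has to wait
            h = np.tanh(x[t] @ W_x + h @ W_h)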

    That's not a good fit for GPU cores: a GPU is made of many small cores that do computations in parallel, and a sequential computation can't fully utilize their computing power. That's why we see the GPU load at around 10%-20% most of the time.

    But during the backpropagation phase, the GPU can run the derivative computations in parallel, so we see the GPU load peak at around 80%.
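
    If your setup allows it, one common workaround is the cuDNN-fused LSTM kernel, which runs the whole recurrence inside a single cuDNN call and typically keeps the GPU load much higher. The sketch below is not from the original benchmark; it assumes Keras >= 2.0.9 with the TensorFlow backend, and the layer sizes only mirror the imdb_bidirectional_lstm.py example:

        from keras.models import Sequential
        from keras.layers import Embedding, Bidirectional, CuDNNLSTM, Dense

        max_features, maxlen = 20000, 100   # vocabulary size / sequence length

        model = Sequential()
        model.add(Embedding(max_features, 128, input_length=maxlen))
        model.add(Bidirectional(CuDNNLSTM(64)))   # fused recurrence, GPU-friendly
        model.add(Dense(1, activation='sigmoid'))
        model.compile(optimizer='adam', loss='binary_crossentropy',
                      metrics=['accuracy'])

    Note that CuDNNLSTM does not support masking or recurrent dropout, so it is only a drop-in replacement when you do not rely on those features.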
