Question
I'm trying to train a model using Keras with TensorFlow (1.4) on a p3.2xlarge AWS machine (which has an NVIDIA Tesla V100 GPU). Two parts of the initialisation are very slow when using the GPU, but run in a reasonable time on the CPU.
The first part is "calling" an embedding layer during model setup:
network = embedding(input)
This embedding layer is used several times, but only the first call is slow. It appears that this is the phase in which the weights are copied to the GPU, and it takes a few minutes (~5) for a 400000 * 200 weight matrix.
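For a sense of scale, here is a rough size estimate for that weight matrix. It assumes the weights are stored as float32 (4 bytes per value), which is the usual Keras default but is not stated in the post; the variable names are only for illustration.

```python
# Rough size of a 400000 x 200 embedding matrix.
# Assumption: float32 weights (4 bytes per value); not confirmed by the post.
rows, cols = 400000, 200
bytes_per_value = 4

size_bytes = rows * cols * bytes_per_value
size_mib = size_bytes / float(2 ** 20)

print("%d bytes (about %.0f MiB)" % (size_bytes, size_mib))
```

Under that assumption the matrix is a few hundred megabytes, so a plain host-to-device copy would not normally be expected to take minutes. The time may be going somewhere else in that first call.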
The second slow part is the call to train_on_batch for the first batch (it takes about 20 minutes).
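To confirm that only the first call pays the cost, the first call can be timed separately from the following ones. This is a minimal sketch using only the standard library: `timed`, `time_first_and_later` and `fake_step` are hypothetical helpers, and `fake_step` merely simulates a step with a one-time setup cost. In the real script the step would be something like `lambda: model.train_on_batch(x, y)`.

```python
import time
from contextlib import contextmanager

timings = {}  # phase name -> elapsed wall-clock seconds of the latest run


@contextmanager
def timed(name):
    """Record the wall-clock time spent inside the with-block under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = time.perf_counter() - start


def time_first_and_later(step, n_later=3):
    """Call `step` once, then `n_later` more times.

    Returns (seconds for the first call, mean seconds for the later calls).
    """
    with timed("first"):
        step()
    later = []
    for _ in range(n_later):
        with timed("later"):
            step()
        later.append(timings["later"])
    return timings["first"], sum(later) / len(later)


# Hypothetical stand-in for a training step with one-time initialisation.
_state = {"warm": False}


def fake_step():
    if not _state["warm"]:
        time.sleep(0.05)  # simulated one-time setup cost
        _state["warm"] = True
    time.sleep(0.001)  # simulated steady-state work


first, later = time_first_and_later(fake_step)
print("first call: %.3fs, later calls: %.3fs on average" % (first, later))
```

The same `timed` block can be wrapped around the `embedding(input)` call to compare the first use of the layer with later ones.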
I'm not sure it's relevant, but according to this post, it might be related to using a custom layer.
Are there any ways to speed up these parts?
EDIT: These parts aren't slow when running the same code on a p2.xlarge AWS machine (which has a Tesla K80 GPU).
Source: https://stackoverflow.com/questions/47296197/keras-with-tensorflow-on-gpu-machine-some-parts-are-very-slow