Could not create cudnn handle: CUDNN STATUS INTERNAL ERROR

Submitted by 删除回忆录丶 on 2020-07-18 19:50:22

Question


I'm trying to do machine learning in Python 3, but when I try to run my code I get the error below on CUDA 10.0/cuDNN 7.5.0. Can someone help me with this?

RTX 2080

I'm on Keras 2.2.4 and tf-nightly-gpu 1.14.1.dev20190510.

Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

The error: tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

Here is my code:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(50, 50, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Flatten())
model.add(Dense(1, activation='softmax'))

model.summary()

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(x, y, epochs=1, batch_size=n_batch)

OOM when allocating tensor with shape[24946,32,48,48] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc


Answer 1:


There are two possible solutions.

Problem allocating GPU memory

Add the following code:

import tensorflow as tf

# Cap this process at 50% of GPU memory, and let the allocation grow on demand
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5)
config = tf.ConfigProto(gpu_options=gpu_options)
config.gpu_options.allow_growth = True
session = tf.Session(config=config)

See also this issue.
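For reference, per_process_gpu_memory_fraction is a fraction of total device memory, so the hard cap it produces depends on the card. A minimal hypothetical helper (fraction_for_cap is not part of TensorFlow) to pick the fraction for a target cap, assuming an 8 GB RTX 2080:

```python
def fraction_for_cap(cap_gb, total_gb):
    """Return the per_process_gpu_memory_fraction that caps usage at cap_gb."""
    return cap_gb / total_gb

# e.g. cap TensorFlow at 4 GB on an 8 GB RTX 2080
print(fraction_for_cap(4, 8))  # → 0.5
```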

Problem with your NVIDIA driver

As posted there, you need to upgrade your NVIDIA driver to an ODE (Optimal Driver for Enterprise) branch driver.

Please check the NVIDIA documentation for the correct driver version.




Answer 2:


Using TensorFlow 2.0, CUDA 10.0 and cuDNN 7.5, the following worked for me:

gpus = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(gpus[0], True)

Some other answers (such as the one above by venergiac) use outdated TensorFlow 1.x syntax. If you are using the latest TensorFlow you'll need to use the code I gave here.

If you get the following error:

Physical devices cannot be modified after being initialized

then the problem will be resolved by putting the gpus = tf.config ... lines directly below where you import tensorflow, i.e.

import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(gpus[0], True)
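Since the correct call depends on the TensorFlow major version, a small version check can pick between the two styles. A minimal sketch (the is_tf2 helper is hypothetical; the TensorFlow calls are shown commented out since they only apply when tensorflow is installed):

```python
def is_tf2(version):
    """Return True when a TensorFlow version string is 2.x or newer."""
    return int(version.split(".")[0]) >= 2

# Example version strings
print(is_tf2("2.0.0"))               # → True
print(is_tf2("1.14.1.dev20190510"))  # → False

# Usage (requires tensorflow):
# import tensorflow as tf
# if is_tf2(tf.__version__):
#     gpus = tf.config.experimental.list_physical_devices('GPU')
#     tf.config.experimental.set_memory_growth(gpus[0], True)
# else:
#     config = tf.ConfigProto()
#     config.gpu_options.allow_growth = True
#     tf.Session(config=config)
```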



Answer 3:


Roko's answer should work if you're using TensorFlow 2.0.

If you want to set an exact memory limit (e.g. 1024 MB or 2 GB), there's another way to restrict your GPU memory usage.

Use this code:

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  try:
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
  except RuntimeError as e:
    print(e)

This code limits your first GPU's memory usage to 1024 MB. Change the index into gpus and the memory_limit value as you like.
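Note that memory_limit is given in megabytes. A small hypothetical helper (parse_memory_limit is not part of TensorFlow) to convert a human-readable size into the value the API expects:

```python
def parse_memory_limit(size):
    """Convert a size string like '1024MB' or '2GB' to whole megabytes."""
    size = size.strip().upper()
    if size.endswith("GB"):
        return int(float(size[:-2]) * 1024)
    if size.endswith("MB"):
        return int(float(size[:-2]))
    raise ValueError(f"unrecognized size: {size}")

print(parse_memory_limit("2GB"))     # → 2048
print(parse_memory_limit("1024MB"))  # → 1024
```

The result can be passed straight to VirtualDeviceConfiguration(memory_limit=...).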




Answer 4:


Try this:

gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
# Note: a virtual-device memory_limit (as in Answer 3) cannot be combined
# with set_memory_growth on the same GPU; pick one or the other.


Source: https://stackoverflow.com/questions/56008683/could-not-create-cudnn-handle-cudnn-status-internal-error
