Could not create cudnn handle: CUDNN STATUS INTERNAL ERROR

Submitted by 删除回忆录丶 on 2020-07-18 19:50:22

Question


I'm trying to do machine learning in Python 3, but when I try to run my code I get the error below on CUDA 10.0/cuDNN 7.5.0. Can someone help me with this?

RTX 2080

I'm on Keras 2.2.4 and tf-nightly-gpu 1.14.1.dev20190510.

Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

The error: tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

Here is my code:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(50, 50, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Flatten())
model.add(Dense(1, activation='softmax'))

model.summary()

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(x, y, epochs=1, batch_size=n_batch)

OOM when allocating tensor with shape[24946,32,48,48] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc


Answer 1:


There are two possible solutions.

Problem allocating GPU memory

Add the following code:

import tensorflow as tf

# Cap this process at 50% of GPU memory, and let the allocation grow on demand
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5)
config = tf.ConfigProto(gpu_options=gpu_options)
config.gpu_options.allow_growth = True
session = tf.Session(config=config)

See also this issue.
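For reference, per_process_gpu_memory_fraction is a fraction of total device memory, so the hard cap it produces depends on the card. A minimal hypothetical helper (fraction_for_cap is not part of TensorFlow) to pick the fraction for a target cap, assuming an 8 GB RTX 2080:

```python
def fraction_for_cap(cap_gb, total_gb):
    """Return the per_process_gpu_memory_fraction that caps usage at cap_gb."""
    return cap_gb / total_gb

# e.g. cap TensorFlow at 4 GB on an 8 GB RTX 2080
print(fraction_for_cap(4, 8))  # → 0.5
```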

Problem with your NVIDIA driver

As posted there, you need to upgrade your NVIDIA driver to an ODE (Optimal Driver for Enterprise) branch driver.

Please check the NVIDIA documentation for the correct driver version.




Answer 2:


Using TensorFlow 2.0, CUDA 10.0 and cuDNN 7.5, the following worked for me:

gpus = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(gpus[0], True)

Some other answers (such as the one above by venergiac) use outdated TensorFlow 1.x syntax. If you are using the latest TensorFlow you'll need to use the code I gave here.

If you get the following error:

Physical devices cannot be modified after being initialized

then the problem will be resolved by putting the gpus = tf.config ... lines directly below where you import tensorflow, i.e.

import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(gpus[0], True)
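Since the correct call depends on the TensorFlow major version, a small version check can pick between the two styles. A minimal sketch (the is_tf2 helper is hypothetical; the TensorFlow calls are shown commented out since they only apply when tensorflow is installed):

```python
def is_tf2(version):
    """Return True when a TensorFlow version string is 2.x or newer."""
    return int(version.split(".")[0]) >= 2

# Example version strings
print(is_tf2("2.0.0"))               # → True
print(is_tf2("1.14.1.dev20190510"))  # → False

# Usage (requires tensorflow):
# import tensorflow as tf
# if is_tf2(tf.__version__):
#     gpus = tf.config.experimental.list_physical_devices('GPU')
#     tf.config.experimental.set_memory_growth(gpus[0], True)
# else:
#     config = tf.ConfigProto()
#     config.gpu_options.allow_growth = True
#     tf.Session(config=config)
```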



Answer 3:


Roko's answer should work if you're using TensorFlow 2.0.

If you want to set an exact memory limit (e.g. 1024 MB or 2 GB), there's another way to restrict your GPU memory usage.

Use this code:

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  try:
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
  except RuntimeError as e:
    print(e)

This code limits your first GPU's memory usage to 1024 MB. Change the index into gpus and the memory_limit value as you like.
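Note that memory_limit is given in megabytes. A small hypothetical helper (parse_memory_limit is not part of TensorFlow) to convert a human-readable size into the value the API expects:

```python
def parse_memory_limit(size):
    """Convert a size string like '1024MB' or '2GB' to whole megabytes."""
    size = size.strip().upper()
    if size.endswith("GB"):
        return int(float(size[:-2]) * 1024)
    if size.endswith("MB"):
        return int(float(size[:-2]))
    raise ValueError(f"unrecognized size: {size}")

print(parse_memory_limit("2GB"))     # → 2048
print(parse_memory_limit("1024MB"))  # → 1024
```

The result can be passed straight to VirtualDeviceConfiguration(memory_limit=...).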




Answer 4:


Try this:

gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
# Note: a virtual-device memory_limit (as in Answer 3) cannot be combined
# with set_memory_growth on the same GPU; pick one or the other.


Source: https://stackoverflow.com/questions/56008683/could-not-create-cudnn-handle-cudnn-status-internal-error
