Question
I am facing an issue with my Inception model during performance testing with Apache JMeter.
Error: OOM when allocating tensor with shape[800,1280,3] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: Cast = Cast[DstT=DT_FLOAT, SrcT=DT_UINT8, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
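For reference, the hint in the error can be enabled roughly like this (a minimal sketch assuming the TF1-style Session API; fetches and feed are stand-ins for your own model outputs and inputs, not names from the question):

import tensorflow as tf

# Ask TensorFlow to dump the list of live tensor allocations when an OOM occurs.
run_options = tf.RunOptions(report_tensor_allocations_upon_oom=True)

with tf.Session() as sess:
    # `fetches` and `feed` are placeholders for your own graph outputs and inputs.
    sess.run(fetches, feed_dict=feed, options=run_options)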
Answer 1:
OOM stands for Out Of Memory. It means your GPU has run out of memory, presumably because other tensors you've allocated are too large. You can fix this by making your model smaller or by reducing your batch size. By the looks of it, you're feeding in a large image (800x1280), so you may want to consider downsampling it.
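As a rough illustration of the downsampling suggestion, the image can be resized on the CPU before it ever reaches the graph (a minimal sketch using Pillow; the filenames and the 640x400 target are arbitrary examples, not values from the answer):

from PIL import Image

img = Image.open("input.jpg")                  # e.g. a 1280x800 frame, as in the error
img = img.resize((640, 400), Image.BILINEAR)   # halving each side shrinks the float Cast tensor 4x
img.save("input_small.jpg")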
Answer 2:
If you have multiple GPUs at hand, select one that is less busy than this one (most likely other processes are already running on it). Go to a terminal and type
export CUDA_VISIBLE_DEVICES=1
where 1 is the ID of another available GPU, then re-run the same code.
You can check the available GPUs using
nvidia-smi
This will show you which GPUs are available and how much memory is free on each of them.
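Alternatively, the same device selection can be done from inside the script, as long as it happens before TensorFlow is imported (a sketch; "1" assumes the second GPU is the idle one):

import os

# Must be set before TensorFlow initializes CUDA.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

import tensorflow as tf  # only GPU 1 is now visible, exposed as /device:GPU:0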
Source: https://stackoverflow.com/questions/50760543/error-oom-when-allocating-tensor-with-shape