Sometimes I run into a problem:
OOM when allocating tensor with shape
e.g.

OOM when allocating tensor with shape ...
Use the summaries provided by torchsummary (pip install torchsummary) or Keras (built in).
E.g.
from torchsummary import summary
summary(model, input_size=(3, 224, 224))  # torchsummary needs the per-sample input shape; (3, 224, 224) is just an example, use your model's real input shape
.....
.....
================================================================
Total params: 1,127,495
Trainable params: 1,127,495
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.02
Forward/backward pass size (MB): 13.93
Params size (MB): 4.30
Estimated Total Size (MB): 18.25
----------------------------------------------------------------
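If you are on Keras instead, the built-in model.summary() gives the per-layer output shapes and parameter counts; a minimal sketch, assuming a small hypothetical Sequential model:

from tensorflow import keras

# toy model purely to illustrate the built-in summary
model = keras.Sequential([
    keras.layers.Input(shape=(224, 224, 3)),
    keras.layers.Conv2D(32, 3, activation="relu"),
    keras.layers.Flatten(),
    keras.layers.Dense(10),
])
model.summary()  # prints layer output shapes plus total/trainable/non-trainable params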
Each instance you put in the batch requires its own forward/backward pass in memory; the model itself only needs to be held once. People seem to prefer batch sizes that are powers of two, probably because of automatic layout optimization on the GPU.
Don't forget to scale your learning rate linearly when you increase the batch size (see the sketch below).
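A minimal sketch of that linear scaling rule; base_lr and base_batch_size are assumed reference values from whatever configuration you originally tuned, not numbers from the summary above:

def scale_lr(base_lr, base_batch_size, new_batch_size):
    # linear scaling rule: keep lr / batch_size constant
    return base_lr * (new_batch_size / base_batch_size)

# e.g. a model tuned with lr=0.1 at batch size 256, moved to batch size 1024
print(scale_lr(0.1, 256, 1024))  # -> 0.4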
Let's assume we have a Tesla P100 at hand with 16 GB memory.
max batch size = (GPU memory - model size) / (forward/backward pass size per instance)

(16000 - 4.3) / 13.93 = 1148.29

rounded down to a power of 2, that gives a batch size of 1024.
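The same back-of-the-envelope calculation in code; a rough sketch where the MB figures are the ones torchsummary reported above, and 16000 MB is an assumption of roughly 16 GB usable memory:

import math

def estimate_max_batch_size(gpu_mem_mb, model_size_mb, per_instance_mb):
    # memory left after loading the model, divided by the per-instance activation cost
    raw = (gpu_mem_mb - model_size_mb) / per_instance_mb
    # round down to the nearest power of two
    return 2 ** int(math.log2(raw))

print(estimate_max_batch_size(16000, 4.30, 13.93))  # -> 1024

In practice this is only an upper-bound estimate: it ignores memory held by the optimizer state, CUDA context, and fragmentation, so expect the real limit to be somewhat lower.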