How to calculate optimal batch size

没有蜡笔的小新 2020-12-02 13:30

Sometimes I run into a problem:

OOM when allocating tensor with shape

e.g.

OOM when allocating tensor wi

5 Answers
  •  时光取名叫无心
    2020-12-02 14:04

    Use the summaries provided by torchsummary (pip install torchsummary) for PyTorch, or the built-in model.summary() in Keras.

    E.g.

    from torchsummary import summary
    summary(model, input_size=(3, 224, 224))  # input_size: shape of one instance; (3, 224, 224) is an assumed example
    .....
    .....
    ================================================================
    Total params: 1,127,495
    Trainable params: 1,127,495
    Non-trainable params: 0
    ----------------------------------------------------------------
    Input size (MB): 0.02
    Forward/backward pass size (MB): 13.93
    Params size (MB): 4.30
    Estimated Total Size (MB): 18.25
    ----------------------------------------------------------------
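
    For a Keras model, the built-in summary prints the layer output shapes and parameter counts (it does not report the per-instance MB estimates shown above):

    model.summary()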
    

    Each instance you put in the batch requires its own forward/backward pass in memory; the model itself is only needed once. People seem to prefer batch sizes that are powers of two, probably because of automatic layout optimization on the GPU.

    Don't forget to linearly increase your learning rate when increasing the batch size.
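
    A minimal sketch of that linear scaling rule (base_lr and base_batch_size are assumed reference values, not from the original post):

    # If you grow the batch size by a factor k, grow the learning rate by k too.
    base_lr = 0.1          # assumed: LR tuned at the reference batch size
    base_batch_size = 256  # assumed: reference batch size
    new_batch_size = 1024
    scaled_lr = base_lr * (new_batch_size / base_batch_size)  # -> 0.4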

    Let's assume we have a Tesla P100 at hand with 16 GB of memory.

    (16000 - model_size) / (forward_backward_size)
    (16000 - 4.3) / 13.93 = 1148.29
    rounded down to the nearest power of 2 gives a batch size of 1024
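
    The same arithmetic as a rough Python sketch (the MB figures come from the summary above; rounding down keeps the batch within memory):

    import math

    def max_batch_size(gpu_mem_mb, params_mb, per_instance_mb):
        """Largest batch that fits in memory, rounded down to a power of two."""
        raw = (gpu_mem_mb - params_mb) / per_instance_mb
        return 2 ** int(math.log2(raw))

    print(max_batch_size(16000, 4.3, 13.93))  # -> 1024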
    
