I know that in theory, the loss of a network over a batch is just the sum of all the individual losses. This is reflected in the Keras code for calculating total loss.
I would like to summarize the brilliant answers in this page.
In the loss history printed by model.fit, the loss value shown is a running average over the batches seen so far in the epoch. So the value we see is an estimate of the loss per datapoint (the batch loss scaled by batch_size), averaged over the batches processed so far, rather than the loss of the most recent batch alone.
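To see this for yourself, a callback can record what Keras reports after every single batch. This is only a minimal sketch with a toy model, not the original network; note that in recent tf.keras versions the logs["loss"] passed to the callback is itself already the running average shown in the progress bar.

import numpy as np
import tensorflow as tf

# Minimal sketch with a toy model (not the original network): a callback that
# records the loss value reported after every single batch. In recent tf.keras
# versions, logs["loss"] here is already the running average shown in the
# progress bar, not the raw loss of that individual batch.
class BatchLossLogger(tf.keras.callbacks.Callback):
    def on_epoch_begin(self, epoch, logs=None):
        self.batch_losses = []  # reset at the start of every epoch

    def on_train_batch_end(self, batch, logs=None):
        self.batch_losses.append(float(logs["loss"]))

model = tf.keras.Sequential([tf.keras.Input(shape=(3,)), tf.keras.layers.Dense(1)])
model.compile(optimizer="sgd", loss="mse")

x = np.random.rand(24, 3).astype("float32")  # 24 datapoints, like the 24 steps below
y = np.random.rand(24, 1).astype("float32")

logger = BatchLossLogger()
model.fit(x, y, batch_size=1, epochs=1, callbacks=[logger], verbose=1)
print(logger.batch_losses)  # one value per batch, even where the progress bar skipped steps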
Be aware that even if we set batch_size=1, the printed history may not show a new line for every batch: the progress bar refreshes on a time interval, so several batch updates can be collapsed into one printed line. In my case:
self.model.fit(x=np.array(single_day_piece), y=np.array(single_day_reward), batch_size=1)
The print is:
1/24 [>.............................] - ETA: 0s - loss: 4.1276
5/24 [=====>........................] - ETA: 0s - loss: -2.0592
9/24 [==========>...................] - ETA: 0s - loss: -2.6107
13/24 [===============>..............] - ETA: 0s - loss: -0.4840
17/24 [====================>.........] - ETA: 0s - loss: -1.8741
21/24 [=========================>....] - ETA: 0s - loss: -2.4558
24/24 [==============================] - 0s 16ms/step - loss: -2.1474
In my problem, there is no way a single datapoint's loss could reach a scale of 4.xxx, so I guess the model reports the summed loss of the first 4 datapoints. However, the batch size used for training is not 4.
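One way to check the scale of individual datapoint losses without the progress-bar averaging getting in the way is to evaluate every sample on its own batch of size 1. A rough sketch, reusing the single_day_piece and single_day_reward arrays from the fit() call above (their exact shapes are an assumption here), with model standing for the same model as self.model:

# Rough sketch: evaluate each sample separately to get its true loss,
# independent of the running average in the progress bar. Assumes the model
# was compiled with a loss only (no extra metrics), so test_on_batch returns
# a single scalar.
per_sample_losses = [
    float(model.test_on_batch(np.array([piece]), np.array([reward])))
    for piece, reward in zip(single_day_piece, single_day_reward)
]
print(per_sample_losses)  # if no single value is anywhere near 4.x, the 4.1276 shown
                          # at step 1/24 must be an aggregate over several datapoints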