Curiously, I just found out that my CPU is much faster for predictions: inference on the GPU is much slower than on the CPU.
I have a tf.keras (TF2) NN model with a si