Question
I'm working on a reinforcement learning model implemented with Keras and TensorFlow, and I have to make frequent calls to model.predict() on single inputs.
While testing inference on a simple pretrained model, I noticed that using Keras' model.predict is WAY slower than just using Numpy on the stored weights. Why is it that slow, and how can I accelerate it? Using pure Numpy is not viable for complex models.
import timeit
import numpy as np
from tensorflow.python.keras.models import Sequential
from tensorflow.python.keras.layers import Dense
w = np.array([[-1., 1., 0., 0.], [0., 0., -1., 1.]]).T
b = np.array([ 15., -15., -21., 21.])
model = Sequential()
model.add(Dense(4, input_dim=2, activation='linear'))
model.layers[0].set_weights([w.T, b])
model.compile(loss='mse', optimizer='adam')
state = np.array([-23.5, 17.8])
def predict_very_slow():
    return model.predict(state[np.newaxis])[0]

def predict_slow():
    ws = model.layers[0].get_weights()
    return np.matmul(ws[0].T, state) + ws[1]

def predict_fast():
    return np.matmul(w, state) + b

print(
    timeit.timeit(predict_very_slow, number=10000),
    timeit.timeit(predict_slow, number=10000),
    timeit.timeit(predict_fast, number=10000)
)
# 5.168972805004538 1.6963867129435828 0.021918574168087623
# 5.461319456664639 1.5491559107269515 0.021502970783442876
Answer 1:
A little late, but maybe useful for someone:
Replace model.predict(X)
with model.predict(X, batch_size=len(X))
That should do it.
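As a rough illustration, applying this suggestion to the benchmark from the question might look like the sketch below (model and state as defined there):

def predict_with_explicit_batch_size():
    x = state[np.newaxis]
    # As suggested above, pass batch_size explicitly instead of relying on the default.
    return model.predict(x, batch_size=len(x))[0]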
Answer 2:
Are you running your Keras model (with TensorFlow backend) in a loop? If so, Keras has a memory leak issue identified here: LINK
In this case you have to import the following:
import keras.backend.tensorflow_backend
import tensorflow as tf
from keras.backend import clear_session
Finally, put the following at the end of every loop iteration, after you're done with your computations:
clear_session()
if keras.backend.tensorflow_backend._SESSION:
    tf.reset_default_graph()
    keras.backend.tensorflow_backend._SESSION.close()
    keras.backend.tensorflow_backend._SESSION = None
This should free up memory at the end of every iteration and, over many iterations, keep the process from slowing down. I hope this helps.
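A minimal sketch of where that cleanup goes, assuming TF1-era standalone Keras as in the snippet above (num_steps and run_inference are placeholders for your own loop and per-iteration work):

import keras.backend.tensorflow_backend
import tensorflow as tf
from keras.backend import clear_session

for step in range(num_steps):
    run_inference(step)  # placeholder: build/use your model for this iteration
    # Cleanup at the end of every iteration, as described above:
    clear_session()
    if keras.backend.tensorflow_backend._SESSION:
        tf.reset_default_graph()
        keras.backend.tensorflow_backend._SESSION.close()
        keras.backend.tensorflow_backend._SESSION = None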
Answer 3:
The memory leak issue still seems to persist in Keras. The following lines of code mentioned in that issue did the trick for me:
import ... as K
import gc
model = ....
del model
K.clear_session()
gc.collect()
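For concreteness, a fleshed-out version of that pattern might look like the sketch below, assuming standalone Keras (so K is the backend module) and a hypothetical build_model() factory:

from keras import backend as K
import gc

model = build_model()  # hypothetical factory that constructs your Keras model
# ... run model.predict(...) as needed ...
del model          # drop the Python reference to the model
K.clear_session()  # release the graph/session Keras keeps around
gc.collect()       # collect anything still referencing the old graph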
Source: https://stackoverflow.com/questions/48796619/why-is-tf-keras-inference-way-slower-than-numpy-operations