Keras inconsistent prediction time

后端未结

关注

 2  606

盖世英雄少女心 2021-01-01 15:48

I tried to get an estimate of the prediction time of my keras model and realised something strange. Apart from being fairly fast normally, every once in a while the model ne

2条回答

无人及你 (楼主)

2021-01-01 16:38

While I can't explain the inconsistencies in execution time, I can recommend that you try to convert your model to TensorFlow Lite to speed up predictions on single data records or small batches.

I ran a benchmark on this model:

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(384, activation='elu', input_shape=(256,)),
    tf.keras.layers.Dense(384, activation='elu'),
    tf.keras.layers.Dense(256, activation='elu'),
    tf.keras.layers.Dense(128, activation='elu'),
    tf.keras.layers.Dense(32, activation='tanh')
])

The prediction times for single records were:

model.predict(input): 18ms
model(input): 1.3ms
Model converted to TensorFlow Lite: 43us

The time to convert the model was 2 seconds.

The class below shows how to convert and use the model and provides a predict method like the Keras model. Note that it would need to be modified for use with models that don’t just have a single 1-D input and a single 1-D output.

class LiteModel:

    @classmethod
    def from_file(cls, model_path):
        return LiteModel(tf.lite.Interpreter(model_path=model_path))

    @classmethod
    def from_keras_model(cls, kmodel):
        converter = tf.lite.TFLiteConverter.from_keras_model(kmodel)
        tflite_model = converter.convert()
        return LiteModel(tf.lite.Interpreter(model_content=tflite_model))

    def __init__(self, interpreter):
        self.interpreter = interpreter
        self.interpreter.allocate_tensors()
        input_det = self.interpreter.get_input_details()[0]
        output_det = self.interpreter.get_output_details()[0]
        self.input_index = input_det["index"]
        self.output_index = output_det["index"]
        self.input_shape = input_det["shape"]
        self.output_shape = output_det["shape"]
        self.input_dtype = input_det["dtype"]
        self.output_dtype = output_det["dtype"]

    def predict(self, inp):
        inp = inp.astype(self.input_dtype)
        count = inp.shape[0]
        out = np.zeros((count, self.output_shape[1]), dtype=self.output_dtype)
        for i in range(count):
            self.interpreter.set_tensor(self.input_index, inp[i:i+1])
            self.interpreter.invoke()
            out[i] = self.interpreter.get_tensor(self.output_index)[0]
        return out

    def predict_single(self, inp):
        """ Like predict(), but only for a single record. The input data can be a Python list. """
        inp = np.array([inp], dtype=self.input_dtype)
        self.interpreter.set_tensor(self.input_index, inp)
        self.interpreter.invoke()
        out = self.interpreter.get_tensor(self.output_index)
        return out[0]

The complete benchmark code and a plot can be found here: https://medium.com/@micwurm/using-tensorflow-lite-to-speed-up-predictions-a3954886eb98

0 讨论(0)

查看其它2个回答