Question
I have written a TensorFlow / Keras Super-Resolution GAN. I've converted the resulting trained .h5 model to a .tflite model using the code below, executed in Google Colab:
import tensorflow as tf
model = tf.keras.models.load_model('/content/drive/My Drive/srgan/output/srgan.h5')
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.post_training_quantize=True
tflite_model = converter.convert()
open("/content/drive/My Drive/srgan/output/converted_model_quantized.tflite", "wb").write(tflite_model)
As you can see, I use converter.post_training_quantize=True, which was supposed to help output a lighter .tflite model than my original .h5 model, which is 159MB. However, the resulting .tflite model is still 159MB.
It's so big that I can't upload it to Google Firebase Machine Learning Kit's servers in the Google Firebase Console.
How could I either:
- decrease the size of the current .tflite model, which is 159MB (for example using a tool), or
- after having deleted the current 159MB .tflite model, convert the .h5 model to a lighter .tflite model (for example using a tool)?
Related questions
How to decrease size of .tflite which I converted from keras: no answer, but a comment suggesting converter.post_training_quantize=True. However, as explained above, this solution doesn't seem to work in my case.
Answer 1:
In general, quantization means shifting from dtype float32 to uint8, so in theory the model size should shrink by a factor of about 4 (for a 159MB float32 model, roughly 40MB). This is most clearly visible in larger files.
Check whether your model has actually been quantized by using the tool https://lutzroeder.github.io/netron/. Load the model there and inspect a few layers that have weights. A quantized graph stores the weight values in uint8 format; an unquantized graph stores them in float32 format.
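If you want to check this programmatically rather than in Netron, a minimal sketch (using the .tflite path from the question) that lists the dtypes of the tensors stored in the file:
import tensorflow as tf
# Load the converted model and count how many tensors use each dtype.
interpreter = tf.lite.Interpreter(
    model_path="/content/drive/My Drive/srgan/output/converted_model_quantized.tflite")
dtypes = [detail["dtype"] for detail in interpreter.get_tensor_details()]
# A quantized model should report mostly uint8 (or int8) weight tensors
# instead of float32.
print({d.__name__: dtypes.count(d) for d in set(dtypes)})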
Only setting "converter.post_training_quantize=True" is not enough to quantize your model. The other settings include:
converter.inference_type=tf.uint8
converter.default_ranges_stats=[min_value,max_value]
converter.quantized_input_stats={"name_of_the_input_layer_for_your_model":[mean,std]}
Assuming you are dealing with images:
min_value=0, max_value=255, mean=128 (subjective) and std=128 (subjective).
name_of_the_input_layer_for_your_model is the first name shown in the graph when you load your model in the tool linked above. Alternatively, you can get the name of the input layer from the code: "model.input" will give an output like "tf.Tensor 'input_1:0' shape=(?, 224, 224, 3) dtype=float32", where input_1 is the name of the input layer. (NOTE: the model must include both the graph configuration and the weights.)
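Putting these settings together, a minimal sketch of the full conversion (it uses the TF1-style converter via tf.compat.v1, since inference_type, default_ranges_stats and quantized_input_stats belong to that API; the file paths are the ones from the question, and "input_1" is only an example input-layer name to be replaced by whatever model.input reports):
import tensorflow as tf
# TF1-style converter: loads the Keras .h5 file directly.
converter = tf.compat.v1.lite.TFLiteConverter.from_keras_model_file(
    '/content/drive/My Drive/srgan/output/srgan.h5')
converter.post_training_quantize = True
converter.inference_type = tf.uint8
# Fallback activation range, assuming image data in 0..255.
converter.default_ranges_stats = (0, 255)
# (mean, std) for the input layer; "input_1" is an example name.
converter.quantized_input_stats = {"input_1": (128, 128)}
tflite_model = converter.convert()
with open("/content/drive/My Drive/srgan/output/converted_model_quantized.tflite", "wb") as f:
    f.write(tflite_model)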
Source: https://stackoverflow.com/questions/57631313/i-use-tfliteconvert-post-training-quantize-true-but-my-model-is-still-too-big-fo