I use TFLiteConvert post_training_quantize=True but my model is still too big to be hosted on Firebase ML Kit's Custom servers

Asked by 倖福魔咒の on 2020-01-04 06:41:27

Question


I have written a TensorFlow / Keras super-resolution GAN. I converted the resulting trained .h5 model to a .tflite model, using the code below, executed in Google Colab:

import tensorflow as tf
model = tf.keras.models.load_model('/content/drive/My Drive/srgan/output/srgan.h5')
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.post_training_quantize=True
tflite_model = converter.convert()
open("/content/drive/My Drive/srgan/output/converted_model_quantized.tflite", "wb").write(tflite_model)

As you can see, I use converter.post_training_quantize=True, which was supposed to produce a lighter .tflite model than my original .h5 model, which is 159MB. The resulting .tflite model is still 159MB, however.

It's so big that I can't upload it to Google Firebase Machine Learning Kit's servers in the Google Firebase Console.

How could I either:

  • decrease the size of the current 159MB .tflite model (for example using a tool),

  • or, after deleting the current 159MB .tflite model, convert the .h5 model to a lighter .tflite model (for example using a tool)?

Related questions

How to decrease size of .tflite which I converted from keras: no answer, but a comment suggesting the use of converter.post_training_quantize=True. However, as explained above, this doesn't seem to work in my case.


Answer 1:


In general, quantization means shifting from dtype float32 to uint8, so in theory the model size should shrink by a factor of about 4 (for example, a 159MB float32 model should come out at roughly 40MB). The reduction is most noticeable in larger files.

Check whether your model has actually been quantized by using Netron (https://lutzroeder.github.io/netron/): load the model and inspect a few layers that carry weights. In a quantized graph the weight values are stored in uint8 format; in an unquantized graph they remain in float32 format.
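As an alternative to Netron, here is a minimal sketch of checking the weight dtypes programmatically with the TFLite interpreter (the model path is an assumption, taken from the question; adjust it to your own file):

import tensorflow as tf

# Load the converted model and list the dtype of every tensor in the graph.
interpreter = tf.lite.Interpreter(
    model_path='/content/drive/My Drive/srgan/output/converted_model_quantized.tflite')
interpreter.allocate_tensors()

dtypes = [t['dtype'] for t in interpreter.get_tensor_details()]
print({d.__name__: dtypes.count(d) for d in set(dtypes)})
# Mostly float32 -> the model was not really quantized;
# mostly uint8 / int8 -> quantization took effect.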

Setting "converter.post_training_quantize=True" alone is not enough to quantize your model. The other settings include:
converter.inference_type=tf.uint8
converter.default_ranges_stats=[min_value,max_value]
converter.quantized_input_stats={"name_of_the_input_layer_for_your_model":[mean,std]}

Assuming you are dealing with images:
min_value=0, max_value=255, mean=128 (subjective) and std=128 (subjective).
name_of_the_input_layer_for_your_model is the first node name shown when you load your model in the tool mentioned above, or you can get the name of the input layer in code: "model.input" will print something like "tf.Tensor 'input_1:0' shape=(?, 224, 224, 3) dtype=float32", where input_1 is the name of the input layer. (NOTE: the model must include both the graph configuration and the weights.)
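Putting the settings above together, here is a minimal sketch of a full conversion. It assumes the TF 1.x-style converter API (from_keras_model_file), where to my understanding these attributes are supported, an input layer named 'input_1', and the file paths from the question; adjust all of these to your model:

import tensorflow as tf

# Assumed TF 1.x-style converter; paths and the input-layer name are placeholders.
converter = tf.lite.TFLiteConverter.from_keras_model_file(
    '/content/drive/My Drive/srgan/output/srgan.h5')

converter.post_training_quantize = True
converter.inference_type = tf.uint8                         # run inference on uint8 tensors
converter.default_ranges_stats = [0, 255]                   # min/max for ops without recorded ranges (images)
converter.quantized_input_stats = {'input_1': (128, 128)}   # {input layer name: (mean, std)}

tflite_model = converter.convert()
with open('converted_model_quantized.tflite', 'wb') as f:
    f.write(tflite_model)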



Source: https://stackoverflow.com/questions/57631313/i-use-tfliteconvert-post-training-quantize-true-but-my-model-is-still-too-big-fo
