Reducing TFLite model size?

Submitted by 一世执手 on 2021-01-27 12:41:44

Question


I'm currently making a multi-label image classification model by following this guide (it uses inception as the base model): https://towardsdatascience.com/multi-label-image-classification-with-inception-net-cbb2ee538e30

After converting from .pb to .tflite, the model is only about 0.3 MB smaller.

Here is my conversion code:

toco \
  --graph_def_file=optimized_graph.pb \
  --output_file=output/optimized_graph.tflite \
  --output_format=TFLITE \
  --input_shape=1,299,299,3 \
  --input_array=Mul \
  --output_array=final_result \
  --inference_type=FLOAT \
  --inference_input_type=FLOAT
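
For reference, the rough Python-API equivalent of this command (TF 1.x tf.lite.TFLiteConverter; tensor names taken from the command above, so treat it as a sketch) would be:

import tensorflow as tf

# Plain float conversion: the weights stay float32, so the .tflite
# file ends up close to the size of the frozen .pb.
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    "optimized_graph.pb",
    input_arrays=["Mul"],
    output_arrays=["final_result"],
    input_shapes={"Mul": [1, 299, 299, 3]},
)
with open("output/optimized_graph.tflite", "wb") as f:
    f.write(converter.convert())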

So, I have a couple of questions:

  1. How much should I expect the size to reduce after converting a model to .tflite?
  2. Are there any ways of reducing the size while still being able to convert to a mobile-friendly model? If not, I'm guessing I'll need to convert MobileNet to work with multi-label classification.

Answer 1:


Okay, so I've found a way to do it. I use the optimized graph (unquantized) and run the following command:

tflite_convert --graph_def_file=optimized_graph.pb \
  --output_file=output/optimized_graph_quantized.tflite \
  --output_format=TFLITE \
  --input_shape=1,299,299,3 \
  --input_array=Mul \
  --output_array=final_result \
  --inference_type=QUANTIZED_UINT8 \
  --std_dev_values=128 --mean_values=128 \
  --default_ranges_min=-6 --default_ranges_max=6 \
  --quantize_weights=true

My main concern with the above is that, when I don't specify min/max ranges, I get the following message: "Array conv, which is an input to the Conv operator producing the output array conv_1, is lacking min/max data, which is necessary for quantization. Either target a non-quantized output format, or change the input graph to contain min/max information, or pass --default_ranges_min= and --default_ranges_max= if you do not care about the accuracy of results."
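
For what it's worth, the same workaround can be expressed through the TF 1.x Python converter API (a sketch; the attribute names are from tf.lite.TFLiteConverter, so double-check them against your TensorFlow version):

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_frozen_graph(
    "optimized_graph.pb",
    input_arrays=["Mul"],
    output_arrays=["final_result"],
    input_shapes={"Mul": [1, 299, 299, 3]},
)
converter.inference_type = tf.uint8  # QUANTIZED_UINT8
# (mean, std_dev) per input, matching --mean_values/--std_dev_values.
converter.quantized_input_stats = {"Mul": (128, 128)}
# Dummy (min, max) applied to any op lacking recorded range data,
# matching --default_ranges_min/--default_ranges_max.
converter.default_ranges_stats = (-6, 6)
with open("output/optimized_graph_quantized.tflite", "wb") as f:
    f.write(converter.convert())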

I've changed the tf-for-poets Android code to allow me to use the quantized .tflite graph (basically the reverse of this: https://github.com/tensorflow/tensorflow/issues/14719), and I seem to be getting results that are as good as the original, unquantized graph.

Answer 2:


I solved the same problem using @ChristopherPaterson's solution, but removing --quantize_weights=true is what worked for me. The command is:

tflite_convert --graph_def_file=optimized_graph.pb \
  --output_file=output/optimized_graph_quantized.tflite \
  --output_format=TFLITE \
  --input_shape=1,299,299,3 \
  --input_array=Mul \
  --output_array=final_result \
  --inference_type=QUANTIZED_UINT8 \
  --std_dev_values=128 --mean_values=128 \
  --default_ranges_min=-6 --default_ranges_max=6
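
As a side note, newer TensorFlow releases (1.14+; tf.compat.v1.lite in 2.x) offer post-training dynamic-range quantization, which shrinks the weights without needing dummy ranges because activations stay float. A minimal sketch, assuming the same graph and tensor names:

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_frozen_graph(
    "optimized_graph.pb",
    input_arrays=["Mul"],
    output_arrays=["final_result"],
    input_shapes={"Mul": [1, 299, 299, 3]},
)
# Weights are stored as 8-bit (roughly 4x smaller); no
# --default_ranges_min/max equivalent is required.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
with open("output/optimized_graph_quantized.tflite", "wb") as f:
    f.write(converter.convert())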


Source: https://stackoverflow.com/questions/51502539/reducing-tflite-model-size
