Thanks Sep, glad it helped you.

Jan 6, 2020

You need to write a quantization function for inference that converts the float32 image data to int8 before you feed it to the CNN. A plain cast like

`input_data = input_data.astype(np.int8)`

won’t be useful, because it only casts the values without rescaling them to the int8 range. One way to do it is to use `tf.image.convert_image_dtype`. TensorFlow's 8-bit quantization is explained in this paper.
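To make the difference concrete, here is a minimal sketch of an affine quantization function in plain NumPy. The helper name `quantize_to_int8` and the `scale` and `zero_point` values are assumptions for illustration only; in practice they should come from your quantized model (with TensorFlow Lite, they can usually be read from the `quantization` entry of the interpreter's input details).

```python
import numpy as np

def quantize_to_int8(image, scale, zero_point):
    """Hypothetical helper: q = round(x / scale) + zero_point, clipped to int8."""
    q = np.round(image / scale) + zero_point
    q = np.clip(q, -128, 127)  # keep values inside the int8 range before casting
    return q.astype(np.int8)

# Assumed parameters for float inputs in [0, 1]; use your model's own values.
scale = 1.0 / 255.0
zero_point = -128

image = np.array([0.0, 0.2, 1.0], dtype=np.float32)

quantized = quantize_to_int8(image, scale, zero_point)  # rescaled to int8
naive = image.astype(np.int8)                           # plain cast, no rescaling

print(quantized)  # [-128  -77  127]
print(naive)      # [0 0 1]
```

The plain cast collapses almost the whole [0, 1] range to 0, while the scaled version spreads it across the full int8 range.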

Written by Milind Deore