Comments (9)
The result scale and zero_point are not inferred from the input scales and zero points; that's why neither our example code nor the paper gives a formula for them. There is no formula for that.
Instead, the quantization parameters of the result must be given by the user.
In a typical quantized neural network application, as in our paper, it is the training process that will record the min-max used for each matrix, including for the result matrix. The quantization and inference process will then use that pre-recorded min-max to quantize the result matrix.
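To make that concrete, here is a minimal sketch (not gemmlowp's actual code; the function name is mine) of how a pre-recorded min-max range could be turned into a uint8 scale and zero_point under the asymmetric scheme real_value = scale * (quantized_value - zero_point) described in the paper:

```python
# Sketch: derive uint8 quantization parameters from a pre-recorded
# [rmin, rmax] range, following the asymmetric quantization scheme
#   real_value = scale * (quantized_value - zero_point).
# This is an illustration, not gemmlowp's implementation.

def quant_params_from_range(rmin, rmax, qmin=0, qmax=255):
    # Ensure 0.0 is exactly representable, as the scheme requires.
    rmin = min(rmin, 0.0)
    rmax = max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    # Nudge the zero point to an integer inside [qmin, qmax].
    zero_point = int(round(qmin - rmin / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point
```

For example, a relu6-style range [0.0, 6.0] maps to scale = 6/255 with zero_point = 0, which matches the kind of post-activation min/max a training run would record.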
from gemmlowp.
Maybe our paper gives more context.
https://arxiv.org/abs/1712.05877
@bjacob
Hi Benoit,
I read the paper you mentioned, but I still have the same question.
result_quantized_value = result_zero_point + (lhs_scale * rhs_scale / result_scale) * Sum_over_i( (lhs_quantized_value[i] - lhs_zero_point) * (rhs_quantized_value[i] - rhs_zero_point) ) (5)
The above equation is the basic scheme for calculating the quantized matrix multiplication. Since the input matrices are given, the lhs_scale * rhs_scale and Sum_over_i parts are easy to compute. But how to calculate result_scale and result_zero_point is not well described in either the paper or the gemmlowp documents.
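For concreteness, here is a rough Python sketch of equation (5) for a single output entry, assuming all three (scale, zero_point) pairs are already given; the names and structure are mine, not gemmlowp's:

```python
# Sketch of equation (5): one output entry of a quantized matmul,
# assuming lhs/rhs/result quantization parameters are all given.
# Illustration only, not gemmlowp's implementation.

def quantized_dot(lhs_q, rhs_q,
                  lhs_scale, lhs_zp,
                  rhs_scale, rhs_zp,
                  result_scale, result_zp):
    # Integer accumulation of (q - zero_point) products, as in eq. (5).
    acc = sum((l - lhs_zp) * (r - rhs_zp) for l, r in zip(lhs_q, rhs_q))
    # Rescale the int32 accumulator into the result's quantized scale.
    real_multiplier = lhs_scale * rhs_scale / result_scale
    q = result_zp + int(round(real_multiplier * acc))
    # Saturate to the uint8 range.
    return max(0, min(255, q))
```

Note that this only works once result_scale and result_zp are known; it does not answer how to choose them, which is exactly the question above.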
Assume the quantized result has 8 bits; my guess is
255 = result_quantized_value_max = result_zero_point + (lhs_scale * rhs_scale / result_scale) * Sum_over_i_max    (a)
and
0 = result_quantized_value_min = result_zero_point + (lhs_scale * rhs_scale / result_scale) * Sum_over_i_min    (b)
Subtracting (b) from (a), we get:
255 = (lhs_scale * rhs_scale / result_scale) * (Sum_over_i_max - Sum_over_i_min)    (c)
Then:
result_scale = (lhs_scale * rhs_scale / 255) * (Sum_over_i_max - Sum_over_i_min)
Since Sum_over_i_max and Sum_over_i_min can be calculated, result_scale can be obtained from the above equation. Is this correct, and is this the way you calculate result_scale and result_zero_point? Thank you so much.
@bjacob
Thanks Benoit, is there any pretrained quantized model, such as MobileNet, that contains the scales and zero points?
I think there is, explore around
https://www.tensorflow.org/mobile/tflite/
and maybe ask on the issue tracker there if it's not obvious.
@bjacob hello
From your paper https://arxiv.org/abs/1712.05877, I understand that during training with simulated quantization you only quantize the weights and activations, so we can get the corresponding scale and zero_point.
(1) Could you tell me how to get the result scale and zero_point during the training process?
Is it right to run inference on the unquantized model, collect the [a; b] ranges of the result, and handle them just like the activations are handled during training with simulated quantization?
You said that "The quantization and inference process will then use that pre-recorded min-max to quantize the result matrix."
(2) How do you ensure that the quantized model with pre-recorded min-max has generalization ability?
Thanks a lot, good luck to you @bjacob
Redirecting these questions to @skligys, who wrote Section 3 of this paper on training and is generally the training expert :-)
Same question. I have trained a quantized model with the TF Object Detection API, but when I inspect the global variables in the .ckpt, I only find the weight min/max and the min/max after relu6 (0 / 5.9997); there is no output min/max for the conv. Why?
The min/max tensor names look like this:
FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/act_quant/min:0
FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/act_quant/max:0
FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/act_quant/FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/act_quant/min/biased:0
FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/act_quant/FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/act_quant/min/local_step:0
FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/act_quant/FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/act_quant/max/biased:0
FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/act_quant/FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/act_quant/max/local_step:0
FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/weights_quant/min:0
FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/weights_quant/max:0
FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/act_quant/min:0
FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/act_quant/max:0