
dipoorlet's Introduction

Introduction

Dipoorlet is an offline quantization tool that performs offline quantization of ONNX models on a given calibration dataset:

  • Supports several activation calibration algorithms: MSE, MinMax, Hist, etc.
  • Supports weight transformations to achieve better quantization results: Bias Correction, Weight Equalization, etc.
  • Supports SOTA offline finetuning algorithms to improve quantization performance: Adaround, Brecq, Qdrop.
  • Generates the quantization parameters required by several platforms: SNPE, TensorRT, STPU, ATLAS, etc.
  • Provides detailed quantization analysis to help identify accuracy bottlenecks in model quantization.

Installation

git clone https://github.com/ModelTC/Dipoorlet.git
cd Dipoorlet
python setup.py install

Environment

CUDA

The project uses ONNXRuntime as its inference runtime and PyTorch as its training tool, so users must choose CUDA and cuDNN versions carefully so that both runtimes work.

For example:
ONNXRuntime==1.10.0 and PyTorch==1.10.0-1.13.0 can run under CUDA==11.4 and cuDNN==8.2.4.

Please refer to the ONNXRuntime and PyTorch documentation for compatible version combinations.
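
A quick way to confirm that both runtimes actually see the GPU is a short check like the following (a minimal sketch; the versions printed will depend on your installation):

# Minimal environment check: both runtimes should report the GPU.
import torch
import onnxruntime as ort

print("torch", torch.__version__, "CUDA", torch.version.cuda, "cuDNN", torch.backends.cudnn.version())
print("torch sees GPU:", torch.cuda.is_available())
print("onnxruntime", ort.__version__, "providers:", ort.get_available_providers())
# 'CUDAExecutionProvider' should appear in the provider list for GPU inference.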

Docker

ONNXRuntime has a bug when running in Docker with cpu-sets configured. Please check the related issue.

Usage

Prepare Calibration Dataset

The pre-processed calibration data must be prepared and provided in a specific directory layout. For example, if the model has two input tensors named "input_0" and "input_1", the file structure is as follows:

cali_data_dir
|
├──input_0
│     ├──0.bin
│     ├──1.bin
│     ├──...
│     └──N-1.bin
└──input_1
      ├──0.bin
      ├──1.bin
      ├──...
      └──N-1.bin
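
As a reference, the calibration samples are raw binary dumps of the pre-processed input tensors. Below is a minimal sketch of how such files could be produced with numpy; the helper name dump_calibration and the float32 assumption are illustrative, not part of Dipoorlet:

# Hypothetical helper: write pre-processed inputs as raw .bin files in the layout above.
# Assumes each sample is a numpy array that already matches the model's input shape,
# stored as float32 (adjust the dtype if your model expects something else).
import os
import numpy as np

def dump_calibration(samples_by_input, cali_data_dir="cali_data_dir"):
    # samples_by_input: {"input_0": [arr0, arr1, ...], "input_1": [...]}
    for input_name, samples in samples_by_input.items():
        out_dir = os.path.join(cali_data_dir, input_name)
        os.makedirs(out_dir, exist_ok=True)
        for i, arr in enumerate(samples):
            arr.astype(np.float32).tofile(os.path.join(out_dir, "{}.bin".format(i)))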

Running Dipoorlet in Pytorch Distributed Environment

python -m torch.distributed.launch --use_env -m dipoorlet -M MODEL_PATH -I INPUT_PATH -N PIC_NUM -A [mse, hist, minmax] -D [trt, snpe, rv, atlas, ti, stpu] [--bc] [--adaround] [--brecq] [--drop]
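
torch.distributed.launch can spawn multiple worker processes; for a multi-GPU run you can typically pass the launcher's standard --nproc_per_node argument (a PyTorch launcher flag, not a Dipoorlet option), for example:

python -m torch.distributed.launch --nproc_per_node=8 --use_env -m dipoorlet -M MODEL_PATH -I INPUT_PATH -N PIC_NUM -A minmax -D trt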

Running Dipoorlet in Cluster Environment

python -m dipoorlet -M MODEL_PATH -I INPUT_PATH -N PIC_NUM -A [mse, hist, minmax] -D [trt, snpe, rv, atlas, ti, stpu] [--bc] [--adaround] [--brecq] [--drop] [--slurm | --mpirun]

Optional

  • Use -M to specify the ONNX model path.
  • Use -A to select the activation statistics algorithm: minmax, hist, mse.
  • Use -D to select the deploy platform: trt, snpe, rv, atlas, ti, stpu.
  • Use -N to specify the number of calibration pics.
  • Use -I to specify the path of the calibration pics.
  • Use -O to specify the output path.
  • For hist and kl:
    --bins specifies the number of histogram bins.
    --threshold specifies the histogram threshold for the hist algorithm.
  • Use --bc to run the Bias Correction algorithm.
  • Use --we to run Weight Equalization.
  • Use --adaround to do offline finetuning with Adaround.
  • Use --brecq to do offline finetuning with Brecq.
  • Use --brecq --drop to do offline finetuning with Qdrop.
  • Use --skip_layers to skip quantization of certain layers.
  • Use --slurm to launch the task from slurm.
  • Other usage information is available via "python -m dipoorlet -h/--help".

Example

Quantize an ONNX model model.onnx using 100 calibration samples stored in workdir/data/, where "data" is the name of the model's input tensor. Use the "minmax" activation calibration algorithm, use QDrop to perform label-free finetuning of the weights, and finally generate the TensorRT quantization configuration:

Calibration Data Path
workdir
|
└──data
    ├──0.bin
    ├──1.bin
    ├──...
    └──99.bin

Command
python -m torch.distributed.launch --use_env -m dipoorlet -M model.onnx -I workdir/ -N 100 -A minmax -D trt --brecq --drop

dipoorlet's People

Contributors

gushiqiao, tracin, grok-phantom


dipoorlet's Issues

Quantizing mobilenet-0.25 for the RV1126 chip: the resulting quant_model.onnx fails to load with rknn-toolkit 1.7.1

Following the rv-platform quantization example, I quantized the mobilenet model and obtained the quantized ONNX model without problems, but converting it with rknn-toolkit fails.
[screenshot of the rknn-toolkit error]
Judging from the error message, the toolkit apparently does not support the quantized Gemm operator. I checked the example for loading quantized models in the official rknn-toolkit repository and found that the shufflenet model provided by Rockchip indeed has no quant/dequant ops around its final Gemm either:
https://github.com/rockchip-linux/rknn-toolkit/tree/master/examples/common_function_demos/load_quantized_model/onnx

quant_model.zip
mobilenet.zip

Does dipoorlet support dynamic inputs?

When running dipoorlet PTQ on an ONNX model exported from torch, I get the error: ValueError: cannot reshape array of size 172800 into shape (0,0,3,180,320).

  1. dynamic_axes was specified when exporting with torch.onnx.export(), as follows:
torch.onnx.export(
        model, # torch model
        dummy_input, # random dummy input
        onnx_path, # save path of onnx format model
        export_params=True, # export all params
        verbose=True, # enable debug message
        training=torch.onnx.TrainingMode.EVAL, # export the model in inference mode
        input_names=input_names, # names to assign to input nodes of computation graph
        output_names=output_names, # names to assign to output nodes of computation graph
        opset_version=16, # version of opset
        # dynamic axes setting for dynamic input/output shapes
        dynamic_axes={
            "LR_bins":{0: "batch_size", 1:"temporal_dim"},
            "HR":{0: "batch_size", 1:"temporal_dim"}
        }
)
  2. The specific error when quantizing with dipoorlet is as follows:
root@autodl-container-032d11993c-d711a821:~/autodl-tmp/Dipoorlet_Examples# sh verification_trial.sh
[2023-11-07 14:59:14 dipoorlet](__main__.py 118): INFO Do tensor calibration...
Minmax update: 0it [00:00, ?it/s]
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/miniconda3/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/autodl-tmp/Dipoorlet/dipoorlet/__main__.py", line 119, in <module>
    act_clip_val, weight_clip_val = tensor_calibration(onnx_graph, args)
  File "/root/autodl-tmp/Dipoorlet/dipoorlet/tensor_cali/tensor_cali_base.py", line 6, in tensor_calibration
    act_clip_val = tensor_cali_dispatcher(args.act_quant, onnx_graph, args)
  File "/root/autodl-tmp/Dipoorlet/dipoorlet/utils.py", line 297, in wrapper
    return dispatch(args[0])(*(args[1:]), **kw)
  File "/root/autodl-tmp/Dipoorlet/dipoorlet/tensor_cali/basic_algorithm.py", line 18, in find_clip_val_minmax
    stats_min_max = forward_get_minmax(onnx_graph, args)
  File "/root/autodl-tmp/Dipoorlet/dipoorlet/forward_net.py", line 215, in forward_get_minmax
    ort_inputs[name] = data[name][:].reshape(onnx_graph.get_tensor_shape(name))
ValueError: cannot reshape array of size 172800 into shape (0,0,3,180,320)

Question about iteratively solving for the optimal scale under the MSE criterion

Hello, in the function forward_net_octav there is the following code that iteratively solves for the optimal scale under the MSE criterion:

abs_x = np.abs(ort_inputs[i])
s_n = abs_x.sum() / abs_x[abs_x > 0].size
for _ in range(20):
    s_n_plus_1 = abs_x[abs_x > s_n].sum() / \
               (1 / (4 ** 8) / 3 / unsigned * abs_x[abs_x <= s_n].size + abs_x[abs_x > s_n].size)
    if np.abs(s_n_plus_1 - s_n) < 1e-6:
        break
    s_n = s_n_plus_1

I would like to ask about this part:

 s_n_plus_1 = abs_x[abs_x > s_n].sum() / \
               (1 / (4 ** 8) / 3 / unsigned * abs_x[abs_x <= s_n].size + abs_x[abs_x > s_n].size)

What is the physical meaning of this iterative scale-update formula, and how is it derived?
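
A possible reading (not an official answer), assuming the code follows the OCTAV fixed-point iteration from Sakr et al., "Optimal Clipping and Magnitude-aware Differentiation" (ICML 2022), which the function name forward_net_octav suggests: the update is the stationarity condition of the quantization MSE viewed as a function of the clipping threshold s,

$$
J(s) = \mathbb{E}\big[(|x|-s)^2\,\mathbf{1}_{|x|>s}\big] + \frac{4^{-B}}{3}\, s^2\, \mathbb{E}\big[\mathbf{1}_{|x|\le s}\big],
\qquad
\frac{dJ}{ds} = 0 \;\Rightarrow\;
s = \frac{\mathbb{E}\big[|x|\,\mathbf{1}_{|x|>s}\big]}{\frac{4^{-B}}{3}\,\mathbb{E}\big[\mathbf{1}_{|x|\le s}\big] + \mathbb{E}\big[\mathbf{1}_{|x|>s}\big]}.
$$

The first term is clipping noise for values beyond s; the second is uniform rounding noise for values inside the clipping range with B = 8 bits. The code iterates this fixed point, using sums and element counts in place of the expectations, with the unsigned factor rescaling the rounding-noise coefficient for unsigned tensors, and stops after 20 iterations or once |s_{n+1} - s_n| < 1e-6.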

Calibration data

Hello, what format should the calibration data be in, i.e., what is the format of 0.bin? Do images need to be converted into a specific format first?

Large accuracy loss

Hello, with the default minmax calibration, the quantized and converted model loses 10 points of object-detection mAP. Why is that?

How to resolve "CUDA out of memory"

My hardware is a single RTX 3050. Running model quantization with the command
python -m torch.distributed.launch --use_env -m dipoorlet -I dipoorlet_work_dir/ -N 1000 -D trt -M models/mobilev2_model.onnx -A mse -O dipoorlet_brecq/ --brecq
produces a "CUDA out of memory" error. I checked all the available command-line options and could not find one that adjusts the data-loading batch size. Is there any way to eliminate this error?

Running adaround fails with a toy ONNX model

When I try to quantize a model with adaround, the following error occurs:

onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. Type Error: Type 'tensor(uint8)' of input parameter (input) of operator (QuantizeLinear) in node (QuantizeLinear0) is invalid.

