ai-edge-contest-4th's Issues
Quantize the model
$ freeze_graph --help
usage: freeze_graph [-h] [--input_graph INPUT_GRAPH]
[--input_saver INPUT_SAVER]
[--input_checkpoint INPUT_CHECKPOINT]
[--checkpoint_version CHECKPOINT_VERSION]
[--output_graph OUTPUT_GRAPH]
[--input_binary [INPUT_BINARY]]
[--output_node_names OUTPUT_NODE_NAMES]
[--restore_op_name RESTORE_OP_NAME]
[--filename_tensor_name FILENAME_TENSOR_NAME]
[--clear_devices [CLEAR_DEVICES]]
[--initializer_nodes INITIALIZER_NODES]
[--variable_names_whitelist VARIABLE_NAMES_WHITELIST]
[--variable_names_blacklist VARIABLE_NAMES_BLACKLIST]
[--input_meta_graph INPUT_META_GRAPH]
[--input_saved_model_dir INPUT_SAVED_MODEL_DIR]
[--saved_model_tags SAVED_MODEL_TAGS]
optional arguments:
-h, --help show this help message and exit
--input_graph INPUT_GRAPH
TensorFlow 'GraphDef' file to load.
--input_saver INPUT_SAVER
TensorFlow saver file to load.
--input_checkpoint INPUT_CHECKPOINT
TensorFlow variables file to load.
--checkpoint_version CHECKPOINT_VERSION
Tensorflow variable file format
--output_graph OUTPUT_GRAPH
Output 'GraphDef' file name.
--input_binary [INPUT_BINARY]
Whether the input files are in binary format.
--output_node_names OUTPUT_NODE_NAMES
The name of the output nodes, comma separated.
--restore_op_name RESTORE_OP_NAME
The name of the master restore operator. Deprecated,
unused by updated loading code.
--filename_tensor_name FILENAME_TENSOR_NAME
The name of the tensor holding the save path.
Deprecated, unused by updated loading code.
--clear_devices [CLEAR_DEVICES]
Whether to remove device specifications.
--initializer_nodes INITIALIZER_NODES
Comma separated list of initializer nodes to run
before freezing.
--variable_names_whitelist VARIABLE_NAMES_WHITELIST
Comma separated list of variables to convert to
constants. If specified, only those variables will be
converted to constants.
--variable_names_blacklist VARIABLE_NAMES_BLACKLIST
Comma separated list of variables to skip converting
to constants.
--input_meta_graph INPUT_META_GRAPH
TensorFlow 'MetaGraphDef' file to load.
--input_saved_model_dir INPUT_SAVED_MODEL_DIR
Path to the dir with TensorFlow 'SavedModel' file and
variables.
--saved_model_tags SAVED_MODEL_TAGS
Group of tag(s) of the MetaGraphDef to load, in string
format, separated by ','. For tag-set contains
multiple tags, all tags must be passed in.
$ vai_q_tensorflow --help
usage:
usage: vai_q_tensorflow command [Options]
examples:
show help : vai_q_tensorflow --help
quantize a model: vai_q_tensorflow quantize --input_frozen_graph frozen_graph.pb --input_nodes xxx --output_nodes yyy --input_shapes zzz --input_fn module.calib_input
inspect a model : vai_q_tensorflow inspect --input_frozen_graph frozen_graph.pb
dump quantized model : vai_q_tensorflow dump --input_frozen_graph quantize_results/quantize_eval_model.pb --input_fn module.dump_input
Xilinx's Quantization Tools Vai_q_tensorflow v1.0.0 Build for Tensorflow
1.12.0
positional arguments:
{quantize,inspect,dump}
Specify a command for vai_q_tensorflow
optional arguments:
-h, --help show this help message and exit
--version show program's version number and exit
--input_frozen_graph INPUT_FROZEN_GRAPH
The path to input frozen graph(.pb) (default: )
--output_dir OUTPUT_DIR
The directory to save the quantization results
(default: ./quantize_results)
--weight_bit WEIGHT_BIT
The target bit width for weights/biases (default: 8)
--activation_bit ACTIVATION_BIT
The target bit width for activation (default: 8)
--method {0,1} The method for quantization, options are: 0: non-
overflow method, make sure no values are saturated
during quantization, may get bad results incase of
outliers. 1: min-diffs method, allow saturation for
large values during quantization to get smaller
quantization errors. This method is slower than method
0 but has higher endurance to outliers. (default: 1)
--calib_iter CALIB_ITER
The iterations of calibration, total number of images
for calibration = calib_iter * batch_size (default:
100)
--input_nodes INPUT_NODES
The name list of input nodes of the subgraph to be
quantized, comma separated. Used together with
output_nodes. When generating the model for deploy,
only the subgraph between input_nodes and output_nodes
will be included. Please set it to the begining of the
main body fo the model to quantize, such as the nodes
after data preprocessing and augmentation. (default: )
--input_shapes INPUT_SHAPES
The shape list of input_nodes, The shape must be a
4-dimension shape for each node, comma separated, e.g.
1,224,224,3; Unknown size for batchsize is supported,
e.g. ?,224,224,3; In case of multiple input_nodes,
please assign the shape list of each node, separated
by `:`. e.g. ?,224,224,3:?,300,300,1 (default: )
--output_nodes OUTPUT_NODES
The name list of output nodes of the subgraph to be
quantized, comma separated. Used together with
input_nodes. When generating the model for deploy,
only the subgraph between input_nodes and output_nodes
will be included. Please set it to the end of the main
body of the model to quantize, such as the nodes
before postprocessing. (default: )
--ignore_nodes IGNORE_NODES
The name list of nodes to be ignored during
quantization, comma separated. The ignored nodes will
be left unquantized during quantization even if it is
quantizable. This argument has no effect for non-
quantizable nodes. (default: )
--skip_check {0,1} Set to 1 to skip the check for float model. (default:
0)
--align_concat {0,1,2}
The strategy for alignment of the input quantize
positions for concat nodes. Set to 0 to align all
concat nodes, 1 to align the output concat nodes, 2 to
disable alignment (default: 0)
--simulate_dpu {0,1} Set to 1 to enable simulation of DPU. The behavior of
DPU for some operations are different from tensorflow.
For example, the dividing in LeakyRelu and AvgPooling
are replaced by bit-shifting, so there maybe slight
difference between DPU outputs and CPU/GPU outputs.
This quantizer will simulate the behavior for these
operations if this flag is set to 1 (default: 1)
--input_fn INPUT_FN The python importable function that provides the input
data. The format is `module_name.input_fn_name`, e.g.
'my_input_fn.input_fn'. The input_fn should take a
`int` object as input indicating the calibration step,
and should return a dict`(placeholder_node_name :
numpy.Array)` object for each call, which will be fed
into the model's placeholder nodes. (default: )
--max_dump_batches MAX_DUMP_BATCHES
The maximum batches to be dumped (default: 1)
--dump_float {0,1} Set to 1 to dump the float weights/biases and
activation tensors together with the quantized
tensors. (default: 0)
--gpu GPU The gpu id used for quantization, comma separated.
(default: 0)
--gpu_memory_fraction GPU_MEMORY_FRACTION
The gpu memory fraction used for quantization, between
0-1. (default: 0.5)
Try FastFCN
Try the segmentation models in the Vitis-AI-Library
Install the required tools on the AWS instance
For the Vitis install, I referred to this page.
For the X server, see this site.
It is already installed on the AWS instance, so there is no need to redo this step.
xserver
sudo apt install xserver-xorg
Append the following to /etc/ssh/ssh_config:
ForwardX11 yes
ForwardX11Trusted yes
Restart sshd:
sudo service sshd restart
Installing Vitis
Download the self-extracting Linux unified installer from the Xilinx download page and copy it over with scp or similar.
Installing through the GUI triggers a Java error, so do it from the CLI instead.
mkdir vitis-install
./Xilinx_Unified_2020.1_0602_1208_Lin64.bin --noexec --target ~/vitis-install
cd vitis-install
sudo ./xsetup -b AuthTokenGen  # enter your Xilinx account credentials
./xsetup -b ConfigGen  # select Vitis (option 1)
sudo ./xsetup --agree XilinxEULA,3rdPartyEULA,WebTalkTerms --batch Install --config <path to the generated config file>
source /tools/Xilinx/Vivado/2020.1/settings64.sh
vivado
If Vivado comes up through the X server, the install succeeded.
Vivado fails to start with an error
application-specific initialization failed: couldn't load file "librdi_commontasks.so": libtinfo.so.5: cannot open shared object file: No such file or directory
If this error appears, then, following this reference, run:
sudo apt update
sudo apt install libtinfo-dev
sudo ln -s /lib/x86_64-linux-gnu/libtinfo.so.6 /lib/x86_64-linux-gnu/libtinfo.so.5
which fixed it.
Try FasterSeg
Measure FasterSeg's accuracy and speed on the SIGNATE dataset.
https://github.com/VITA-Group/FasterSeg
Set up the Vitis AI environment on AWS
Create a frozen graph from an ONNX model, via a (TensorFlow) .pb file
First, produce a TensorFlow GraphDef (.pb file) from the ONNX model.
Use onnx-tensorflow from the tf-1.x branch (https://github.com/onnx/onnx-tensorflow/tree/tf-1.x).
Following https://github.com/onnx/onnx-tensorflow/blob/tf-1.x/example/onnx_to_tf.py in the onnx-tensorflow repository, run:
import onnx
from onnx_tf.backend import prepare
onnx_model = onnx.load("path/to/model.onnx") # load onnx model
tf_rep = prepare(onnx_model) # prepare tf representation
tf_rep.export_graph("path/to/output_model.pb") # export the model
Running this produces the .pb file.
Note the TensorFlow version of the environment this code runs in: if the version that created the .pb file differs from the version that later loads it, you get errors such as tensorflow/tensorflow#22994 or the following:
Traceback (most recent call last):
File "view_graph.py", line 44, in <module>
graph = load_model()
File "view_graph.py", line 39, in load_model
tf.import_graph_def(graph_def, name="")
File "/home/suzuki/.pyenv/versions/miniconda3-latest/envs/oxnn/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/home/suzuki/.pyenv/versions/miniconda3-latest/envs/oxnn/lib/python3.7/site-packages/tensorflow/python/framework/importer.py", line 430, in import_graph_def
raise ValueError(str(e))
ValueError: NodeDef mentions attr 'incompatible_shape_error' not in Op<name=Equal; signature=x:T, y:T -> z:bool; attr=T:type,allowed=[DT_BFLOAT16, DT_HALF, DT_FLOAT, DT_DOUBLE, DT_UINT8, ..., DT_QINT8, DT_QINT32, DT_STRING, DT_BOOL, DT_COMPLEX128]; is_commutative=true>; NodeDef: {{node assert_equal_1/Equal}}. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
Errors like these occur.
python jit_test.py Illegal instruction (core dumped)
# packages in environment at /opt/vitis_ai/conda/envs/vitis-ai-pytorch:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main
blas 1.0 mkl
bzip2 1.0.8 h7b6447c_0
ca-certificates 2020.10.14 0
cairo 1.14.12 h8948797_3
certifi 2020.6.20 pyhd3eb1b0_3
cffi 1.14.0 py36h2e261b9_0
cudatoolkit 10.0.130 0
cudnn 7.6.5 cuda10.0_0
expat 2.2.10 he6710b0_2
ffmpeg 4.0 hcdf2ecd_0
fontconfig 2.13.1 h2176d3f_1000 conda-forge/label/gcc7
freeglut 3.0.0 hf484d3e_5
freetype 2.10.4 h5ab3b9f_0
gettext 0.19.8.1 hd7bead4_3
gflags 2.2.2 he6710b0_0
glib 2.56.2 hd408876_0
glog 0.4.0 he6710b0_0
graphite2 1.3.14 h23475e2_0
graphviz 2.38.0 hcf1ce16_1009 conda-forge/label/gcc7
harfbuzz 1.9.0 he243708_1001 conda-forge/label/gcc7
hdf5 1.10.2 hba1933b_1
icu 58.2 he6710b0_3
intel-openmp 2020.2 254
jasper 2.0.14 h07fcdf6_1
jpeg 9c h14c3975_1001 conda-forge/label/gcc7
json-c 0.13.1 h1bed415_0
lcms2 2.11 h396b838_0
ld_impl_linux-64 2.33.1 h53a641e_7
libedit 3.1.20191231 h14c3975_1
libffi 3.2.1 hf484d3e_1007
libgcc-ng 9.1.0 hdf63c60_0
libgfortran-ng 7.3.0 hdf63c60_0
libglu 9.0.0 hf484d3e_1
libopencv 3.4.2 hb342d67_1
libopus 1.3.1 h7b6447c_0
libpng 1.6.37 hbc83047_0
libprotobuf 3.11.4 hd408876_0
libstdcxx-ng 9.1.0 hdf63c60_0
libtiff 4.1.0 h2733197_1
libtool 2.4.6 h7b6447c_1005
libuuid 2.32.1 h14c3975_1000 conda-forge/label/gcc7
libvpx 1.7.0 h439df22_0
libxcb 1.14 h7b6447c_0
libxml2 2.9.10 hb55368b_3
lz4-c 1.9.2 heb0550a_3
mkl 2020.2 256
mkl-service 2.3.0 py36he904b0f_0
mkl_fft 1.2.0 py36h23d657b_0
mkl_random 1.1.1 py36h0573a6f_0
ncurses 6.2 he6710b0_1
ninja 1.10.1 py36hfd86e86_0
numpy 1.17.2 py36haad9e8e_0
numpy-base 1.17.2 py36hde5b4d6_0
olefile 0.46 py36_0
opencv 3.4.2 py36h6fd60c2_1
openssl 1.1.1h h7b6447c_0
pango 1.40.14 hf0c64fd_1003 conda-forge/label/gcc7
pcre 8.44 he6710b0_0
pillow 8.0.1 py36he98fc37_0
pip 20.2.4 py36h06a4308_0
pixman 0.40.0 h7b6447c_0
protobuf 3.11.4 py36he6710b0_0
py-opencv 3.4.2 py36hb342d67_1
pybind11 2.5.0 py36hfd86e86_0
pycparser 2.20 py_2
python 3.6.10 hcf32534_1
pytorch 1.1.0 cuda100py36he554f03_0
pytorch_nndct 1.2.0 4_a5f1f456 file:///scratch/conda-channel
readline 8.0 h7b6447c_0
scipy 1.3.1 py36h7c811a0_0
setuptools 50.3.0 py36h06a4308_1
six 1.15.0 py_0
sqlite 3.33.0 h62c20be_0
target_factory 1.2.0 10_65a73cb6 file:///scratch/conda-channel
tk 8.6.10 hbc83047_0
torchvision 0.3.0 cuda100py36h72fc40a_0
tqdm 4.50.2 py_0
unilog 1.2.0 10_4f1575a6 file:///scratch/conda-channel
vart 1.2.0 16_a7d6128b file:///scratch/conda-channel
wheel 0.35.1 py_0
xir 1.2.0 12_69d7e69c file:///scratch/conda-channel
xorg-kbproto 1.0.7 h14c3975_1002 conda-forge/label/gcc7
xorg-libice 1.0.9 h14c3975_1004 conda-forge/label/gcc7
xorg-libsm 1.2.3 h4937e3b_1000 conda-forge/label/gcc7
xorg-libx11 1.6.6 h14c3975_1000 conda-forge/label/gcc7
xorg-libxext 1.3.3 h14c3975_1004 conda-forge/label/gcc7
xorg-libxpm 3.5.12 h14c3975_1002 conda-forge/label/gcc7
xorg-libxrender 0.9.10 h14c3975_1002 conda-forge/label/gcc7
xorg-libxt 1.1.5 h14c3975_1002 conda-forge/label/gcc7
xorg-renderproto 0.11.1 h14c3975_1002 conda-forge/label/gcc7
xorg-xextproto 7.3.0 h14c3975_1002 conda-forge/label/gcc7
xorg-xproto 7.0.31 h14c3975_1007 conda-forge/label/gcc7
xz 5.2.5 h7b6447c_0
zlib 1.2.11 h7b6447c_3
zstd 1.4.5 h9ceee32_0
Debug bisenetv2_quant.py
Life is just too much fun lol
Convert a model trained in PyTorch to TensorFlow format
"Vitis AI (on Ultra96V2) Custom Platform Tutorial" を試す
Run BiSeNet on TensorFlow
According to the Vitis-AI GitHub (https://github.com/Xilinx/Vitis-AI/tree/v1.2.1/Vitis-AI-Quantizer/vai_q_tensorflow),
Vitis-AI 1.2.1 uses TensorFlow 1.15, so prepare a model that runs on TensorFlow 1.x.
From the User Guide (https://www.xilinx.com/support/documentation/sw_manuals/vitis_ai/1_2/ug1414-vitis-ai.pdf):
p.51: The vai_q_tensorflow quantizer is based on Tensorflow 1.15. The vai_q_pytorch quantizer supports Pytorch from 1.1-1.4.
Try BiSeNet
https://github.com/CoinCheung/BiSeNet
Run its inference on the CPU of the FPGA board.
Clone it on the FPGA and use the pretrained model.
Some dependent libraries were not copied onto the FPGA
Probably because the copy did not dereference the symbolic links.
The libraries apparently need to be copied straight onto the SD card from AY's machine, so an SD card reader is needed.
Can't figure out how to use NVIDIA DALI for semantic segmentation
Official documentation:
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html
First, install it with the following command:
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist nvidia-dali-cuda100
Ideas on accuracy, how to train, etc.
On training
The images are huge, so loading them as-is makes training very slow: at width=2048, 85% of the training time is spent loading images. Downsizing the dataset in advance therefore looks worthwhile; for example, preparing width=1024 copies beforehand should cut training time to roughly a quarter.
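A minimal sketch of that pre-downsizing step, assuming a flat directory of images and Pillow; the directory names are placeholders, not paths from this repo:

```python
# Sketch: pre-downsize a dataset to width=1024 before training.
# Assumptions: a flat directory of images, Pillow installed;
# the directory names used by callers are placeholders.
import os

def target_size(w, h, target_w=1024):
    """Scale (w, h) so the width becomes target_w, preserving aspect ratio."""
    return target_w, round(h * target_w / w)

def downsize_dir(src_dir, dst_dir, target_w=1024):
    from PIL import Image  # pip install Pillow
    os.makedirs(dst_dir, exist_ok=True)
    for name in os.listdir(src_dir):
        with Image.open(os.path.join(src_dir, name)) as im:
            im.resize(target_size(*im.size, target_w), Image.BILINEAR) \
              .save(os.path.join(dst_dir, name))
```

For segmentation label maps, resize with Image.NEAREST instead of BILINEAR so class IDs are not blended across boundaries.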
Consider whether the app part can be reused from the 2nd contest's hall-of-fame entries
Share the Vitis-AI 1.2 environment
Host kernel
HostName 18.218.168.50
User ubuntu
Port 2211
Implement pruning for BiSeNet in PyTorch
Try various input resizes with BiSeNet
Based on the original BiSeNet code 久留 set up, try models with various input resize settings.
Run tools/train.py with different arguments.
Specifically, vary resolution and num_class, and train each combination with an appropriate learning rate and weight decay.
Then check whether each resolution/num_class combination meets the 0.6 IoU requirement.
First, check whether IoU exceeds 0.6 at 1024x512.
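The 0.6 check can be done with a small mIoU routine. This is a sketch, not the repository's evaluation code: per-class IoU from a confusion matrix, with classes absent from both prediction and ground truth counted as IoU 0.

```python
# Sketch: per-class IoU and mean IoU from flat label maps via a confusion
# matrix. Not the repository's evaluation code; classes absent from both
# pred and gt are counted as IoU 0 here.
import numpy as np

def miou(pred, gt, num_classes):
    pred = np.asarray(pred).ravel()
    gt = np.asarray(gt).ravel()
    # cm[i, j] = number of pixels with ground truth i predicted as j
    cm = np.bincount(gt * num_classes + pred,
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    inter = np.diag(cm).astype(float)
    union = cm.sum(axis=0) + cm.sum(axis=1) - np.diag(cm)
    ious = inter / np.maximum(union, 1)
    return ious, ious.mean()
```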
DPU timeout error
Running on the actual board as described in the Qiita article produced:
Load image : ILSVRC2012_val_00000001.JPEG
Run DPU Task for ResNet50 ...
[DPU mode]
normal
[DPU timeout limitation (in seconds)]
20
[DPU Debug Info]
Core 0 schedule : 3
Core 0 interrupt: 3
[DPU Resource]
DPU Core : 0
State : Idle
PID : 2861
TaskID : 2861
Start : 548890522784
End : 548155570688
[DPU Core 0 Register]
CTL : 0x00000001
GIE : 0x00000001
IRQ : 0x00000000
HP : 0x07070f0f
CODE : 0x0000000000060100
BASE0 : 0x0000000060300000
BASE1 : 0x0000000061c00000
BASE2 : 0x0000000000000000
BASE3 : 0x0000000000000000
BASE4 : 0x0000000000000000
BASE5 : 0x0000000000000000
BASE6 : 0x0000000000000000
BASE7 : 0x0000000000000000
CYCLE_H : 0x00000000
CYCLE_L : 0x00000000
REGVER : 0x31016b02
TIMESTAMP : 0x13b12123
GITID : 0x06aec4c8
GITTIME : 0x71ea5221
VERSION : 0x00000140
TIMER : 0x00000000
ARCH : 0x31240c0c
RAM : 0x00001333
LOAD : 0x00000102
CONV : 0x00000111
SAVE : 0x00000002
POOL : 0x00000001
ELEW : 0x00000001
DWCV : 0x00000012
MISC : 0x00000001
DPU STATUS: 0x00000000
AXI STATUS: 0x00455656
LOAD START: 1412
LOAD END : 1410
SAVE START: 174
SAVE END : 174
CONV START: 1217
CONV END : 1217
MISC START: 89
MISC END : 89
[DNNDK] DPU timeout while execute DPU Task:resnet50-0
Find the input_nodes and output_nodes to pass to vai_q_tensorflow
When converting the PyTorch model to ONNX, we used torch.onnx.export with input_names="image_array" and output_names="category".
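Those names ("image_array", "category") should surface as node names in the frozen graph. As a hedged sketch (not the repo's tooling): input candidates are nodes with no incoming edges, and output candidates are nodes nothing consumes. The helper below applies that rule to (name, inputs) pairs like those in a GraphDef's node list:

```python
# Sketch: infer input/output node candidates from a graph's edge list.
# With a real GraphDef you would build `nodes` via
#   nodes = [(n.name, list(n.input)) for n in graph_def.node]
# and additionally keep only op == "Placeholder" nodes as inputs,
# since Const nodes also have no incoming edges.
def find_io_nodes(nodes):
    # Strip tensor indices (":0") and control-edge markers ("^") from inputs.
    consumed = {inp.split(":")[0].lstrip("^")
                for _, inputs in nodes for inp in inputs}
    inputs = [name for name, ins in nodes if not ins]       # no incoming edges
    outputs = [name for name, _ in nodes if name not in consumed]  # never consumed
    return inputs, outputs
```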
Write input_fn.py, one of the input files for vai_q_tensorflow
Vitis AI User Guide
p46: For quantize calibration, calibration data without label is enough.
p47: Quantize, using a subset (200 images) of validation data for calibration. Because we are in quantize calibration process, the displayed loss and accuracy are meaningless.
p52: In the quantize calibration process, only a small set of unlabeled images are required to analyze the distribution of activations.
Given these, quantization does not seem to involve training (?). So preparing validation data and applying the evaluation-time preprocessing should be enough (?).
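A minimal input_fn sketch following the contract in the --input_fn help text (take the calibration step as an int, return a dict of placeholder node name to numpy array). The node name, shape, and random placeholder data below are assumptions; in practice load real unlabeled validation images with the evaluation-time preprocessing:

```python
# Sketch of an input_fn module for vai_q_tensorflow calibration.
# Assumptions: placeholder node "image_array", 512x1024 RGB input.
# The random data is a stand-in; feed real preprocessed validation images.
import numpy as np

BATCH_SIZE = 4

def calib_input(iter):
    # `iter` is the calibration step; total images = calib_iter * BATCH_SIZE.
    images = np.random.rand(BATCH_SIZE, 512, 1024, 3).astype(np.float32)
    return {"image_array": images}
```

Saved as e.g. my_input_fn.py, it would be passed as --input_fn my_input_fn.calib_input.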