mit-han-lab / tinyengine

[NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning; [NeurIPS 2022] MCUNetV3: On-Device Training Under 256KB Memory

Home Page: https://mcunet.mit.edu

License: MIT License

Languages: C 88.46%, Python 2.62%, Shell 0.01%, Makefile 5.63%, HTML 2.10%, C++ 0.95%, Assembly 0.21%, Starlark 0.01%, Cuda 0.01%
Topics: c, codegenerator, cpp, deep-learning, microcontroller, pytorch, tinyml, edge-computing, neural-architecture-search, quantization


TinyEngine

This is the official implementation of TinyEngine, a memory-efficient and high-performance neural network library for microcontrollers. TinyEngine is part of MCUNet, which also consists of TinyNAS. MCUNet is a system-algorithm co-design framework for tiny deep learning on microcontrollers. TinyEngine and TinyNAS are co-designed to fit the tight memory budgets.

The MCUNet and TinyNAS repo is here.

[demo GIF]

[demo_v3 GIF]

News

If you are interested in getting updates, please sign up here to get notified!

Overview

Microcontrollers are low-cost, low-power hardware. They are widely deployed across a broad range of applications, but their tight memory budget (50,000x smaller than GPUs) makes deep learning deployment difficult.

MCUNet is a system-algorithm co-design framework for tiny deep learning on microcontrollers. It consists of TinyNAS and TinyEngine. They are co-designed to fit the tight memory budgets. With system-algorithm co-design, we can significantly improve the deep learning performance on the same tiny memory budget.

[overview figure]

Specifically, TinyEngine is a memory-efficient inference library. TinyEngine adapts the memory scheduling to the overall network topology rather than optimizing layer by layer, reducing memory usage and accelerating inference. It outperforms existing inference libraries such as TF-Lite Micro from Google, CMSIS-NN from Arm, and X-CUBE-AI from STMicroelectronics.

TinyEngine adopts the following optimization techniques to accelerate inference speed and minimize memory footprint.

  • In-place depth-wise convolution: A unique data placement technique for depth-wise convolution that overwrites input data with intermediate/output data to reduce peak SRAM memory (see the sketch after the figure below).
  • Patch-based inference: A generic patch-by-patch inference scheduling, which operates only on a small spatial region of the feature map and significantly cuts down the peak memory.
  • Operator fusion: A method that improves performance by merging one operator into a different operator so that they are executed together without a roundtrip to memory.
  • SIMD (Single instruction, multiple data) programming: A computing method that performs the same operation on multiple data points simultaneously.
  • HWC to CHW weight format transformation: A weight format transformation technique that increases the cache hit ratio for in-place depth-wise convolution.
  • Image to Column (Im2col) convolution: An implementation technique that computes the convolution operation using general matrix multiplication (GEMM) operations.
  • Loop reordering: A loop transformation technique that improves execution speed by reordering/interchanging the sequence of loops.
  • Loop unrolling: A loop transformation technique that improves execution speed at the expense of binary size, an approach known as the space-time tradeoff.
  • Loop tiling: A loop transformation technique that reduces memory access latency by partitioning a loop's iteration space into smaller chunks or blocks, so that data used in a loop stays in the cache until it is reused (see the sketch after this list).
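
To make the im2col and loop-tiling items concrete, here is a minimal hand-written sketch. It is illustrative only, not TinyEngine's actual kernels; the function names and the TILE size are assumptions.

#include <stdint.h>

#define TILE 4  /* block size; an assumption, tuned to the target's SRAM/cache */

/* im2col: copy each k x k receptive field of an HWC int8 input into one row
 * of a patch matrix, turning convolution into a matrix multiplication. */
void im2col(const int8_t *in, int h, int w, int ch, int k, int8_t *cols)
{
    int row = 0;
    for (int y = 0; y + k <= h; ++y)
        for (int x = 0; x + k <= w; ++x, ++row)
            for (int dy = 0; dy < k; ++dy)
                for (int dx = 0; dx < k; ++dx)
                    for (int c = 0; c < ch; ++c)
                        cols[(row * k * k + dy * k + dx) * ch + c] =
                            in[((y + dy) * w + (x + dx)) * ch + c];
}

/* Loop-tiled GEMM, C[M][N] += A[M][K] * B[K][N]: the outer loops walk
 * TILE-sized blocks so each block stays resident in cache/SRAM while it
 * is reused. */
void gemm_tiled(const int8_t *A, const int8_t *B, int32_t *C,
                int M, int N, int K)
{
    for (int i0 = 0; i0 < M; i0 += TILE)
        for (int j0 = 0; j0 < N; j0 += TILE)
            for (int k0 = 0; k0 < K; k0 += TILE)
                for (int i = i0; i < i0 + TILE && i < M; ++i)
                    for (int j = j0; j < j0 + TILE && j < N; ++j) {
                        int32_t acc = C[i * N + j];
                        for (int kk = k0; kk < k0 + TILE && kk < K; ++kk)
                            acc += (int32_t)A[i * K + kk] * B[kk * N + j];
                        C[i * N + j] = acc;
                    }
}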

[inplace_depthwise figure]
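
A minimal sketch of the in-place idea, assuming a CHW int8 layout, a 3x3 kernel, stride 1, and zero "same" padding; requantization is reduced to a simple clamp, and this is not the actual TinyEngine kernel. Because each output channel of a depth-wise convolution depends only on the matching input channel, one channel's result can be computed into a small scratch buffer and written back over that input channel, so no second full-size activation buffer is needed.

#include <stdint.h>
#include <string.h>

void depthwise3x3_inplace(int8_t *data, int channels, int h, int w,
                          const int8_t *weights,  /* [channels][9] */
                          int8_t *chan_buf)       /* h*w scratch bytes */
{
    for (int c = 0; c < channels; ++c) {
        const int8_t *in = data + c * h * w;
        const int8_t *k = weights + c * 9;
        for (int y = 0; y < h; ++y) {
            for (int x = 0; x < w; ++x) {
                int32_t acc = 0;
                for (int dy = -1; dy <= 1; ++dy)
                    for (int dx = -1; dx <= 1; ++dx) {
                        int yy = y + dy, xx = x + dx;
                        if (yy >= 0 && yy < h && xx >= 0 && xx < w)
                            acc += (int32_t)in[yy * w + xx] *
                                   k[(dy + 1) * 3 + (dx + 1)];
                    }
                if (acc > 127) acc = 127;    /* stand-in for requantization */
                if (acc < -128) acc = -128;
                chan_buf[y * w + x] = (int8_t)acc;
            }
        }
        /* overwrite the input channel with its output: peak memory is one
         * activation tensor plus a single channel-sized buffer */
        memcpy(data + c * h * w, chan_buf, (size_t)(h * w));
    }
}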

By adopting the above optimization techniques, TinyEngine not only enhances inference speed but also reduces peak memory, as shown in the figures below.

MAC/s improvement breakdown: [mac_result figure]

Peak memory reduction: [peakmem_result figure]

To sum up, our TinyEngine inference engine could be a useful infrastructure for MCU-based AI applications. Compared to existing libraries like TF-Lite Micro, CMSIS-NN, and X-CUBE-AI, it improves inference speed by 1.1-18.6x and reduces peak memory by 1.3-3.6x.

[measured_result figure]

Save Memory with Patch-based Inference: We can dramatically reduce the inference peak memory by using patch-based inference for the memory-intensive stage of CNNs. [measured_result figure]

For MobileNetV2, using patch-based inference allows us to reduce the peak memory by 8x. [measured_result figure]

With patch-based inference, TinyEngine achieves higher accuracy at the same memory budget. [measured_result figure]
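
Conceptually, the patch-based schedule looks like the sketch below. It is illustrative only, not TinyEngine's generated code; run_initial_stage_on_patch and run_rest_of_network are hypothetical helpers standing in for the generated per-layer kernels.

#include <stdint.h>

/* Hypothetical helpers standing in for the generated kernels. */
void run_initial_stage_on_patch(const int8_t *input, int patch_y, int patch_x,
                                int8_t *stage_output);
void run_rest_of_network(int8_t *stage_output);

/* Per-patch scheduling of the memory-intensive initial stage: only a
 * patch-sized slice of the large early feature maps is alive at a time. */
void patch_based_inference(const int8_t *input, int8_t *stage_output,
                           int num_patches_y, int num_patches_x)
{
    for (int py = 0; py < num_patches_y; ++py)
        for (int px = 0; px < num_patches_x; ++px)
            /* each output patch is computed from its receptive field in
             * the input, including halo overlap with neighboring patches */
            run_initial_stage_on_patch(input, py, px, stage_output);

    /* the remaining layers see much smaller feature maps and run
     * layer-by-layer as usual */
    run_rest_of_network(stage_output);
}

The halo overlap between neighboring patches causes some recomputation, which MCUNetV2 mitigates by redistributing the receptive field across the network.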

Code Structure

code_generator contains a Python library that is used to compile neural networks into low-level source code (C/C++).

TinyEngine contains a C/C++ library that implements operators and performs inference on Microcontrollers.

examples contains the examples of transforming TFLite models into our TinyEngine models; a sample command follows below.
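
For instance, the VWW example referenced throughout the issues below compiles its TFLite model into C source roughly as follows (a sketch; flags and script options may vary by version):

python examples/vww.py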

tutorial contains the demo tutorial (of inference and training) of deploying a visual wake words (VWW) model onto microcontrollers.

assets contains misc assets.

Requirement

  • Python 3.6+
  • STM32CubeIDE 1.5+

Setup for Users

First, clone this repository:

git clone --recursive https://github.com/mit-han-lab/tinyengine.git

(Optional) Using a virtual environment with conda is recommended.

conda create -n tinyengine python=3.6 pip
conda activate tinyengine

Install dependencies:

pip install -r requirements.txt

Setup for Developers

Install pre-commit hooks to automatically format changes in your code.

pre-commit install

Deployment Example

Please see tutorial to learn how to deploy a visual wake words (VWW) model onto microcontrollers using TinyEngine. We include both the inference demo and the training demo in the tutorial, so please take a look!

Measured Results

  • All the tflite models are from the Model Zoo in the MCUNet repo. Please see the MCUNet repo to learn how to build the pre-trained int8 quantized models in TF-Lite format.
  • All the latency, peak memory (SRAM), and flash memory usage results are profiled on an STM32H743, which is limited to 512 KB peak memory and 2 MB storage.
  • Note that we measure the newer versions of the libraries in this repo, so the results here might differ from the ones in the MCUNet papers.
  • For each inference library, we use the git commit ID to indicate the version.
  • All the tflite models are compiled with the -Ofast optimization level in STM32CubeIDE.
  • OOM denotes Out Of Memory.
  • Measurement for X-CUBE-AI v7.3.0 was conducted with the default compilation setting (balanced mode).

The latency results:

| net_id | TF-Lite Micro @ 713b6ed | CMSIS-NN @ 011bf32 | X-CUBE-AI v7.3.0 | TinyEngine @ 0363956 |
| --- | --- | --- | --- | --- |
| # mcunet models (VWW) | | | | |
| mcunet-vww0 | 587ms | 53ms | 32ms | 27ms |
| mcunet-vww1 | 1120ms | 97ms | 57ms | 51ms |
| mcunet-vww2 | 5310ms | 478ms | 269ms | 234ms |
| # mcunet models (ImageNet) | | | | |
| mcunet-in0 | 586ms | 51ms | 35ms | 25ms |
| mcunet-in1 | 1227ms | 103ms | 63ms | 56ms |
| mcunet-in2 | 6463ms | 642ms | 351ms | 280ms |
| mcunet-in3 | 7821ms | 770ms | 414ms | 336ms |
| mcunet-in4 | OOM | OOM | 516ms | 463ms |
| # baseline models | | | | |
| proxyless-w0.3-r64 | 512ms | 54ms | 35ms | 23ms |
| proxyless-w0.3-r176 | 3801ms | 380ms | 205ms | 176ms |
| mbv2-w0.3-r64 | 467ms | 43ms | 29ms | 23ms |

The peak memory (SRAM) results:

| net_id | TF-Lite Micro @ 713b6ed | CMSIS-NN @ 011bf32 | X-CUBE-AI v7.3.0 | TinyEngine @ 0363956 |
| --- | --- | --- | --- | --- |
| # mcunet models (VWW) | | | | |
| mcunet-vww0 | 163kB | 163kB | 88kB | 59kB |
| mcunet-vww1 | 220kB | 220kB | 113kB | 92kB |
| mcunet-vww2 | 385kB | 390kB | 201kB | 174kB |
| # mcunet models (ImageNet) | | | | |
| mcunet-in0 | 161kB | 161kB | 69kB | 49kB |
| mcunet-in1 | 219kB | 219kB | 106kB | 96kB |
| mcunet-in2 | 460kB | 469kB | 238kB | 215kB |
| mcunet-in3 | 493kB | 493kB | 243kB | 260kB |
| mcunet-in4 | OOM | OOM | 342kB | 416kB |
| # baseline models | | | | |
| proxyless-w0.3-r64 | 128kB | 136kB | 97kB | 35kB |
| proxyless-w0.3-r176 | 453kB | 453kB | 221kB | 259kB |
| mbv2-w0.3-r64 | 173kB | 173kB | 88kB | 61kB |

The Flash memory usage results:

| net_id | TF-Lite Micro @ 713b6ed | CMSIS-NN @ 011bf32 | X-CUBE-AI v7.3.0 | TinyEngine @ 0363956 |
| --- | --- | --- | --- | --- |
| # mcunet models (VWW) | | | | |
| mcunet-vww0 | 627kB | 646kB | 463kB | 453kB |
| mcunet-vww1 | 718kB | 736kB | 534kB | 521kB |
| mcunet-vww2 | 1016kB | 1034kB | 774kB | 741kB |
| # mcunet models (ImageNet) | | | | |
| mcunet-in0 | 1072kB | 1090kB | 856kB | 842kB |
| mcunet-in1 | 937kB | 956kB | 737kB | 727kB |
| mcunet-in2 | 1084kB | 1102kB | 849kB | 830kB |
| mcunet-in3 | 1091kB | 1106kB | 867kB | 835kB |
| mcunet-in4 | OOM | OOM | 1843kB | 1825kB |
| # baseline models | | | | |
| proxyless-w0.3-r64 | 1065kB | 1084kB | 865kB | 777kB |
| proxyless-w0.3-r176 | 1065kB | 1084kB | 865kB | 779kB |
| mbv2-w0.3-r64 | 940kB | 959kB | 768kB | 690kB |

Citation

If you find the project helpful, please consider citing our paper:

@article{
  lin2020mcunet,
  title={Mcunet: Tiny deep learning on iot devices},
  author={Lin, Ji and Chen, Wei-Ming and Lin, Yujun and Gan, Chuang and Han, Song},
  journal={Advances in Neural Information Processing Systems},
  volume={33},
  year={2020}
}

@inproceedings{
  lin2021mcunetv2,
  title={MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning},
  author={Lin, Ji and Chen, Wei-Ming and Cai, Han and Gan, Chuang and Han, Song},
  booktitle={Annual Conference on Neural Information Processing Systems (NeurIPS)},
  year={2021}
}

@inproceedings{
  lin2022ondevice,
  title = {On-Device Training Under 256KB Memory},
  author = {Lin, Ji and Zhu, Ligeng and Chen, Wei-Ming and Wang, Wei-Chen and Gan, Chuang and Han, Song},
  booktitle={Annual Conference on Neural Information Processing Systems (NeurIPS)},
  year = {2022}
}

Related Projects

MCUNet: Tiny Deep Learning on IoT Devices (NeurIPS'20)

MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning (NeurIPS'21)

MCUNetV3: On-Device Training Under 256KB Memory (NeurIPS'22)

tinyengine's People

Contributors

lyken17, meenchen, nixward, raymondwang0, tonylins, zerkclown


tinyengine's Issues

STM32CubeIDE version

Hello, I tried the vww example with the stm32f746-disco board. It works well in STM32CubeIDE version 1.5.0 as written, but it does not run in 1.11.0. Is there somewhere I can find the reason and a solution?

.patch file to build it for other boards

Hey, @meenchen

For person_detection example, the "openmv_person_detection.patch" file handles everything for OpenMV H7 only.

For example, for the section:

diff --git a/src/omv/boards/OPENMV4/omv_boardconfig.h b/src/omv/boards/OPENMV4/omv_boardconfig.h
index 412de472..f7da2c03 100644
--- a/src/omv/boards/OPENMV4/omv_boardconfig.h
+++ b/src/omv/boards/OPENMV4/omv_boardconfig.h
@@ -150,16 +150,18 @@
 // The maximum available fb_alloc memory = FB_ALLOC_SIZE + FB_SIZE - (w*h*bpp).
 #define OMV_FFS_MEMORY          DTCM        // Flash filesystem cache memory
 #define OMV_MAIN_MEMORY         SRAM1       // data, bss and heap memory
+#define OMV_MAIN_MEMORY2        SRAM5       // my memory
 #define OMV_STACK_MEMORY        ITCM        // stack memory
 #define OMV_DMA_MEMORY          SRAM2       // DMA buffers memory.
 #define OMV_FB_MEMORY           AXI_SRAM    // Framebuffer, fb_alloc
 #define OMV_JPEG_MEMORY         SRAM3       // JPEG buffer memory.
 #define OMV_VOSPI_MEMORY        SRAM4       // VoSPI buffer memory.
 
-#define OMV_FB_SIZE             (400K)      // FB memory: header + VGA/GS image
-#define OMV_FB_ALLOC_SIZE       (100K)      // minimum fb alloc size
+#define OMV_FB_SIZE             (100K)      // defualt: 400 FB memory: header + VGA/GS image
+#define OMV_FB_ALLOC_SIZE       (50K)      // default: 100 minimum fb alloc size
 #define OMV_STACK_SIZE          (64K)
-#define OMV_HEAP_SIZE           (236K)
+#define OMV_HEAP_SIZE           (136K)
+// #define OMV_HEAP_SIZE           (236K)
 
 #define OMV_LINE_BUF_SIZE       (3 * 1024)  // Image line buffer round(640 * 2BPP * 2 buffers).
 #define OMV_MSC_BUF_SIZE        (2K)        // USB MSC bot data

This is the copied version of OPENMV4. That is why changing the "OPENMV4" parts to "OPENMV4P" is not enough to successfully build it. Further changes are required, since "OPENMV4P/omv_boardconfig.h" is completely different from "OPENMV4/omv_boardconfig.h".

Potential changes are tough to guess, so it would be excellent to have some information about how to modify this .patch file. Besides, if any further changes are needed, I would appreciate it if you could mention them as well.

Thanks in advance.

About SE Block in tinyengine codes.

Hi, I have some questions about TinyEngine's code.

While looking through TinyEngine's code, I found code in TfliteConvertor.py that handles the SE Block.
The code includes a comment that reads as follows:

#         -> MEAN -> MEAN -> PWCONV -> PWCONV -> | ADD -> MUL ->     |
#  DWCONV                                        |            -> MUL |
#

here are my questions:

  1. Is it correct that SE Block means Squeeze & Excitation module?

  2. If so, does "ADD -> MUL -> MUL" refer to the h-swish activation function that replaces sigmoid?

Thank you

get_kernel_buffer undefined

[error screenshot]

Thanks for your great work. When following the training tutorial, the C files 'convolve_1x1_s8_kbuf.c' and 'convolve_1x1_s8_skip_pad.c' in int_forward_op use the functions 'get_kernel_buffer'/'get_sbuffer_size' internally, and these functions are undefined. May I ask where they are defined, or have I done something wrong? I would appreciate it if you could provide some help.

Conversion of FC Layers and Conv Layers

Thanks for the great work. Unfortunately, as of now the library only works for mcunet models; support for custom models is not completely implemented. For example, when converting a model with fully connected layers, the code generates a Conv operation:
_convert_FULLY_CONNECTED, located in TfliteConvertor.py, returns the wrong operation.
def _convert_FULLY_CONNECTED(self, op): ....... op = conv2d.Conv2d(params) return op
Also, codegen only generates depthwise convolution header files with floating-point quantization, meaning that the genModel.c file might contain an operation which is not yet defined. For instance, the convolve_x_y_z_fpreq.h file is not generated.
To resolve these issues, I think fc.py needs to be implemented inside the operators folder, and code templates need to be implemented for the convolution as well as the fully connected layers. Are my observations correct? Are you planning to implement the missing files? If I decide to implement them myself, where should I start?

Up-to-date ProxylessNAS models?

We're trying to run NAS ourselves using OFA, but you have not open-sourced up-to-date ProxylessNAS models used in the mcunet search. These would be helpful for us to re-create your results and use them in our projects. Are there plans to do this?

Thanks!

patch inference

Thanks for your excellent work! It's very useful for porting AI models to low-power edge devices. I'm very interested in the patch-based inference method, but I can't find any more information about it in the codebase. Will you provide this code?

Cannot run Codegen to generate code for other models

I was trying to deploy a model with a different input shape to the STM32 board, but running this command raises NotImplementedError:

python examples/tiny_training.py -f full_bp-1x3x128x128-graph.json -D full_bp-1x3x128x128-params.pkl -QAS scale.json -m -g -d -FR

Where scale.json comes from img1 (highlighted),
and both full_bp-1x3x128x128-graph.json & full_bp-1x3x128x128-params.pkl come from img2 (highlighted).

These 3 files were generated according to the tiny_training repo's compilation/readme.md.

Any thoughts on this issue?

[img1]
[img2]

Different inference result on my own model using TinyEngine compare to python

Hi, @meenchen. Thanks for your great work. As the title says, when I implemented my own task in STM32CubeIDE and checked the network inference results, I found that the results show some bias compared to the results of running the TFLite model in Python, and this happens especially in the deeper network layers. I would like to ask whether these biases are caused by slight differences between the ops in TinyEngine and the ops in TFLite? Or have you ever encountered this problem? I would appreciate it if you could provide some help. The device I am using is the STM32F746G-DISCO, and my TensorFlow version is 2.11.0.

Is it possible to use a better resolution than QQVGA?

Hey, @meenchen

Like you have written in the person_detection_demo script,

sensor.set_framesize(sensor.LCD)  # Set frame size to QVGA 160x128

we made our inference on QQVGA (160x128):

sensor.set_framesize(sensor.QQVGA)

(since we use the frame buffer instead of an LCD). It worked very nicely. However, when I tested with higher resolutions, no detections occurred. I was wondering if it is possible to use a higher resolution with this engine. Thanks a lot in advance.

mcunet model with cmsis-nn

Hi @meenchen @RaymondWang0, I couldn't find any tutorial or demo for running mcunet models with CMSIS-NN. Can you please point me to the page, or let me know how to generate an mcunet model to deploy on CMSIS-NN functions? Or can I use the same model with similar CMSIS-NN functions?

inference tutorial error

14:35:33 **** Incremental Build of configuration Debug for project TinyEngine_vww_tutorial ****
make -j7 all
arm-none-eabi-g++ -o "TinyEngine_vww_tutorial.elf" @"objects.list" -mcpu=cortex-m7 -T"../STM32F746NGHx_FLASH.ld" --specs=nosys.specs -Wl,-Map="TinyEngine_vww_tutorial.map" -Wl,--gc-sections -static -mfpu=fpv5-sp-d16 -mfloat-abi=hard -mthumb -Wl,--start-group -lc -lm -lstdc++ -lsupc++ -Wl,--end-group
/Applications/STM32CubeIDE.app/Contents/Eclipse/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.7-2018-q2-update.macos64_1.5.0.202011040924/tools/bin/../lib/gcc/arm-none-eabi/7.3.1/../../../../arm-none-eabi/bin/ld:../STM32F746NGHx_FLASH.ld:163: warning: memory region `DTCMRAM' not declared
Src/TinyEngine/src/kernels/fp_requantize_op/convolve_1x1_s8_ch16_fpreq.o: In function `convolve_1x1_s8_ch16_fpreq':
/Users/karl/STM32CubeIDE/workspace_1.5.0/TinyEngine_vww_tutorial/Debug/../Src/TinyEngine/src/kernels/fp_requantize_op/convolve_1x1_s8_ch16_fpreq.c:61: undefined reference to `write_q15x2_ia'
/Users/karl/STM32CubeIDE/workspace_1.5.0/TinyEngine_vww_tutorial/Debug/../Src/TinyEngine/src/kernels/fp_requantize_op/convolve_1x1_s8_ch16_fpreq.c:61: undefined reference to `write_q15x2_ia'
/Users/karl/STM32CubeIDE/workspace_1.5.0/TinyEngine_vww_tutorial/Debug/../Src/TinyEngine/src/kernels/fp_requantize_op/convolve_1x1_s8_ch16_fpreq.c:62: undefined reference to `write_q15x2_ia'
/Users/karl/STM32CubeIDE/workspace_1.5.0/TinyEngine_vww_tutorial/Debug/../Src/TinyEngine/src/kernels/fp_requantize_op/convolve_1x1_s8_ch16_fpreq.c:62: undefined reference to `write_q15x2_ia'
/Users/karl/STM32CubeIDE/workspace_1.5.0/TinyEngine_vww_tutorial/Debug/../Src/TinyEngine/src/kernels/fp_requantize_op/convolve_1x1_s8_ch16_fpreq.c:88: undefined reference to `write_q15x2_ia'
Src/TinyEngine/src/kernels/fp_requantize_op/convolve_1x1_s8_ch16_fpreq.o:/Users/karl/STM32CubeIDE/workspace_1.5.0/TinyEngine_vww_tutorial/Debug/../Src/TinyEngine/src/kernels/fp_requantize_op/convolve_1x1_s8_ch16_fpreq.c:88: more undefined references to `write_q15x2_ia' follow
collect2: error: ld returned 1 exit status
make: *** [makefile:88: TinyEngine_vww_tutorial.elf] Error 1
"make -j7 all" terminated with exit code 2. Build might be incomplete.
14:35:34 Build Failed. 7 errors, 1 warnings. (took 468ms)

I followed your tutorial, using macOS, Python 3.6, and STM32CubeIDE 1.5.0, but got this error. Please help me fix it. Thank you very much!

screen too dark

Hi, thanks for your work.
I tried the tutorial without a camera, and the MCU board is the same as yours. It can detect whether there is a person in the picture, but the screen is too dark. What can I adjust to make the screen brighter?

IDE Compilation Error

I am currently following the inference tutorial and attempting to build the VWW demo (step 3 of the tutorial). STM32CubeIDE seems to be unable to fully compile the project. Any suggestions as to where the source of the error might lie?

Is it in the Makefile? Or could the error have been caused all the way back in the code generation step? Am I including the wrong version of MCUNet?

[error screenshot]

Problem when doing inference tutorial

Hi team,

This is quite an amateur question but I'm doing the Inference Tutorial up to step 2 (created a new directory and moved libraries).

So up to this point, I just want to run the empty int main(void) to make sure the libraries are loaded successfully, and use another model in the future. I:

  • have defined includePath in c_cpp_properties.json of C/C++ Intellisense extension the same as the tutorial's (Image 1)
  • gcc version 11.3.0 (Ubuntu 11.3.0-1ubuntu1~22.04)
  • ran main.cpp with C/C++: g++ build active file
  • am using VSCode with the C/C++ Intellisense extension. Is this a good choice? Or what do you recommend?

Still, in main.cpp, an error appears on the first line, #include "main.h", saying fatal error: main.h: No such file or directory, even though main.h is in Inc and included in includePath. What should I do to resolve this issue?

[Image 1]
[Image 2]

Appreciate any support you can provide :) Please ask me if you need any clarification.
Rodo

Using TinyEngine with TensorFlow Lite custom models & different/custom datasets.

Is it possible to capture/improve the performance (in terms of accuracy and peak memory usage) of a custom, already-trained tflite model (converted from an originally simple Keras model) using TinyEngine, compared to the plain TensorFlow Lite implementation of the same model? Also, do I need to add any extra functionality to the existing code base in order to evaluate the model against my own dataset (dataset form: training & validation sets as numpy arrays, a classification problem with 4 classes)?

Any suggestion/guidance on how to conduct the performance analysis described above using the TinyEngine inference library would be deeply appreciated, given that my model only uses compatible TinyEngine operators (i.e., neural net layers).
-Antonios.
p.s. Novice fan/user of TinyEngine.

KWS Model availability

Hi @meenchen, is there a KWS model and its source code available, as mentioned in Paper1? If yes, can you please provide it? If not, is there any reason why the model is not available, and could you let us know when it can be made available?

Would you mind uploading the in-place depthwise kernel files on GitHub?

Hi. Thank you for your contribution.

I generated C++ code for the MobileNet V2 (net id: mbv2-320kB) model with the patch-based option using TinyEngine.

When I ran an end-to-end test in STM32CubeIDE, it failed with the error message: undefined reference to `depthwise_kernel7x7_stride2_inplace_CHW_fpreq'.

I spent a lot of time looking for this file, but I couldn't find it.

Would you mind uploading these files on GitHub?

Error message:

11:21:03 **** Incremental Build of configuration Debug for project mbv2_patch_1226 ****
make -j8 all
arm-none-eabi-g++ -o "mbv2_patch_1226.elf" @"objects.list" -mcpu=cortex-m7 -T"../STM32F746NGHx_FLASH.ld" --specs=nosys.specs -Wl,-Map="mbv2_patch_1226.map" -Wl,--gc-sections -static -mfpu=fpv5-sp-d16 -mfloat-abi=hard -mthumb -Wl,--start-group -lc -lm -lstdc++ -lsupc++ -Wl,--end-group
c:\st\stm32cubeide_1.5.0\stm32cubeide\plugins\com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.7-2018-q2-update.win32_1.5.0.202011040924\tools\arm-none-eabi\bin\ld.exe:../STM32F746NGHx_FLASH.ld:163: warning: memory region `DTCMRAM' not declared
Src/TinyEngine/codegen/Source/genModel.o: In function `invoke':
/Users/raymondwang/STM32CubeIDE/workspace_1.5.0/TinyEngine_vww_tutorial/Debug/../Src/TinyEngine/codegen/Source/genModel.c:62: undefined reference to `depthwise_kernel7x7_stride2_inplace_CHW_fpreq'
/Users/raymondwang/STM32CubeIDE/workspace_1.5.0/TinyEngine_vww_tutorial/Debug/../Src/TinyEngine/codegen/Source/genModel.c:76: undefined reference to `depthwise_kernel5x5_stride1_inplace_CHW_fpreq'
/Users/raymondwang/STM32CubeIDE/workspace_1.5.0/TinyEngine_vww_tutorial/Debug/../Src/TinyEngine/codegen/Source/genModel.c:84: undefined reference to `depthwise_kernel7x7_stride2_inplace_CHW_fpreq'
/Users/raymondwang/STM32CubeIDE/workspace_1.5.0/TinyEngine_vww_tutorial/Debug/../Src/TinyEngine/codegen/Source/genModel.c:90: undefined reference to `depthwise_kernel7x7_stride1_inplace_CHW_fpreq'
/Users/raymondwang/STM32CubeIDE/workspace_1.5.0/TinyEngine_vww_tutorial/Debug/../Src/TinyEngine/codegen/Source/genModel.c:112: undefined reference to `depthwise_kernel5x5_stride2_inplace_CHW_fpreq'
/Users/raymondwang/STM32CubeIDE/workspace_1.5.0/TinyEngine_vww_tutorial/Debug/../Src/TinyEngine/codegen/Source/genModel.c:134: undefined reference to `depthwise_kernel7x7_stride1_inplace_CHW_fpreq'
collect2.exe: error: ld returned 1 exit status
make: *** [makefile:88: mbv2_patch_1226.elf] Error 1
"make -j8 all" terminated with exit code 2. Build might be incomplete.

11:21:05 Build Failed. 7 errors, 1 warnings. (took 2s.39ms)


Once again, Thank you for your work.

mcunet-10fps inference failing, always shows "no person"

With the help of tinyengine/tutorial I am able to run mcunet_5fps (the default) and can see that inference works fine. I then tried changing the model to mcunet_10fps in tinyengine/tutorial/examples/vww.py; the code runs, but inference fails (it always shows "no person" even though a person is present).

Is there anything I need to take care of to run mcunet_10fps?
@meenchen @RaymondWang0 please help, thanks in advance.

[mcunet_10fps_failed screenshot]

Build Guidance for non-STM32 MCUs

TinyEngine is an exciting project for people like me who want to deploy AI models on MCUs.
I think TinyEngine could enable deep learning on huge numbers of edge devices, including but not limited to STM32 chips, so please consider writing a makefile template / build guidance for people who want to build TinyEngine for non-STM32 MCUs with arm-linux toolchains.

Thanks a lot!

the code of the audio demo

Hi there,
It's fantastic to deploy a model on an MCU.
Inspired by your paper, I want to implement an intelligent door lock with an offline speaker verification system.
However, I cannot find the demo code for the KWS model mentioned in the paper. Could you please share it?

Thanks a lot

Tutorial issues

Thanks for the great work! Currently I do not use an ArduCam, and I just want to test the LCD.
After following the steps described in the tutorial, I can compile without errors, but I run into some errors:

(1) No source available for "d_expression_1() at 0x8001058"

(2) Even if I move lcdsetup() before "SystemClock_Config()" and try to display a string on the LCD, nothing shows on the LCD screen. Do we need to set something else? Or is it just related to the camera device?

(3) Is there any way to print messages in console mode to verify that we enter the main() function?

Furthermore, for the "Measured Results" table, how can we get the time measurements? Use HAL_GetTick() before and after invoke()?

Could you please comment on that? Thanks
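
A minimal sketch of the timing approach suggested above, assuming the STM32 HAL and the invoke() entry point generated in genModel.c (the exact invoke() signature may differ):

#include <stdio.h>
#include <stdint.h>
#include "stm32f7xx_hal.h"   /* assumption: F7-series HAL, as in the tutorial */

extern void invoke(void);    /* assumed signature of the generated entry point */

void measure_inference_latency(void)
{
    uint32_t t0 = HAL_GetTick();   /* 1 ms resolution system tick */
    invoke();
    uint32_t t1 = HAL_GetTick();
    printf("inference latency: %lu ms\r\n", (unsigned long)(t1 - t0));
    /* for cycle-accurate numbers, the Cortex-M DWT cycle counter can be
     * used instead of the millisecond tick */
}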

Using SDRAM of OpenMV instead of SRAM

Hey, @meenchen . it is me again :)

While looking through the OpenMV H7 Plus' features, I realized that it has SDRAM with 32MB of memory. This is way bigger than the SRAM, which has only 1MB. I was wondering if there is a way to embed TinyEngine into the SDRAM.

Just for you to remember, I was trying to handle a memory overflow problem in the firmware for higher resolutions ( #75 ).

I have no limit on inference time; there will be plenty of time for inference in my project's pipeline. So, if using SDRAM for higher resolutions is possible, the increased latency won't be a problem.

Thank you so much in advance.

Would you mind uploading some kernels of TinyEngine?

Hello again, and thanks for your work.

I found that the following kernels are missing: patchpadding_depthwise_kernel5x5_stride1_inplace_CHW.c, patchpadding_depthwise_kernel5x5_stride2_inplace_CHW.c, and the 7x7 stride 1 and stride 2 variants, and so on.

File location: tinyengine/TinyEngine/src/kernels/int_forward_op.

Would you mind uploading these kernels of TinyEngine?

If that's impossible, I'd appreciate it if you could let me know as well.


Recently, I succeeded with your patch-based code generation (with the mcunet GitHub model zoo tflite file), and it worked well with some models on my board. (I solved a lot of errors for this work, but anyway, it's working.)

I did the end-to-end test on the STM32F746G-DISCO board with a camera (the same setup as the tutorial).

But I found that there are no patch_depthwise_kernel5x5_stride1_inplace_kernel_CHW.c and patch_depthwise_kernel7x7_stride1_inplace_kernel_CHW.c files in the project folder path tinyengine/TinyEngine/src/kernels/int_forward_op.

My experiment results (these are wrong because I changed the patchpadding file from 5x5 to 3x3):

[images A, B, C]

once again, Thanks for your work.

NotImplementedError when trying to generate code for one of the already-included networks

Hi @meenchen, thanks for your great work. As the title says, I got the following error (see image):

[error screenshot]

  1. I started by following the tiny-training repo's compilation/README instructions: running mcu_ir_gen.py and using the mcunet-5fps.pkl file.

  2. Then I ran the ir2json.py file, selecting the sparse_bp-49kb-1x3x128x128.ir just generated.

  3. Then I took scale.json from the corresponding ir_zoos folder (generated with mcu_ir_gen), and graph.json and params.pkl from .model/testproj/ (generated with ir2json), and put all three in the assets folder.

  4. Finally I ran: python examples/tiny_training.py -f assets/sparse_bp-49kb-1x3x128x128-graph.json -D assets/sparse_bp-49kb-1x3x128x128-params.pkl -QAS assets/scale.json -m -g -d -FR
    making sure it was using the same 49kb sparse update scheme, and I got that NotImplementedError. Is there something I might have missed during the process?

P.S. It seems to originate from an "abs" operation that could not be handled, but since these are the provided examples in tiny-training, I think I missed something at some point. Any thoughts on it?

Demo without LCD/Camera

Thanks for the great work!

For now, the demos are all strongly tied to the camera and LCD. I think a demo without a camera/LCD, which reads an image from a header file and prints the output results, would be very helpful.

BTW: I'm working on this demo in my spare time, but I'm not familiar with stm32... I don't know how much time I need to finish this.

Recipe for target 'FIRMWARE_OBJS' failed - Error 2

Hello, @meenchen

I get "Error 2" while I am trying to build my OpenMV H7 Plus for your person detection. I could not solve this issue.

make[1]: Leaving directory '/home/senceryucel/Desktop/tinyengine/examples/openmv_person_detection/openmv/src/drivers/winc1500'
omv/ports/stm32/omv_portconfig.mk:593: recipe for target 'FIRMWARE_OBJS' failed
make: *** [FIRMWARE_OBJS] Error 2
make: Leaving directory '/home/senceryucel/Desktop/tinyengine/examples/openmv_person_detection/openmv/src'

How can I handle this? Thanks in advance.

Could you tell me the model hyper-parameters in the MCUNet paper?

Hello, I found your MCUNet paper interesting, and I am writing to ask for information on the hyper-parameters used in the implementation of the models used in the experiments. While I believe that you will eventually provide the training code for MCUNet or ProxylessNAS, I would appreciate it if you could at least let me know what hyper-parameters you used for the MobileNetV2 that was used in the MCUNet experiments. Specifically, I am interested in the following models: MobileNet w0.35-r64 in MCUNet V1 and MobileNet w0.35-r144 in MCUNet V2, as well as MobileNetV2, MobileNetV2-RD, and MobileNetV2 (Non-overlap) in Table 5. It would be great if you could provide the information in fp32.

[screenshots of the referenced tables]

I am looking forward to a prompt response.

Platform-independent operation

Hi team,
Some questions are bothering me.
When I use code generation, Arm-dependent code is automatically generated; for example, "depthwise_kernel3x3_stride1_inplace_CHW_fpreq.c" contains:
#include "arm_nnsupportfunctions.h" //TODO: remove this in the future for self-contained
This header file comes from the NN component of CMSIS, and the generated code contains these Arm dependencies.
If I'm testing a demo on Windows or Linux, this bothers me, because I need to build a simulation environment to test.
I clearly understand that using third-party libraries speeds up the computations, but I want to keep things simple.
So, is there an implementation that is platform-independent or does not require third-party libraries?

I am looking forward to your reply.

mcunet face detection model and code

Hi Team, I would like to run the MCUNet TinyEngine face detection model on an M7 board and am looking for the source code, but I see that only the VWW code is captured here. Is there a way to find the face detection code? Also, it would be helpful if you could add face detection benchmarking numbers for detection timings.

Thanks in advance.

How can I send "3" to the UART input for the stm32 MCU?

Hi,

Context: I am new to this and going through the on-device training tutorial.

Issue: I am on this instruction: Send "3" to the UART input for the MCU: Training mode
Do I need to use another board to direct the UART communication? Or is there a PC app I can use to perform this UART communication?

Thanks!

TinyEngine convolutional layer has greater latency than ARM's CMSIS-NN

Hello,

I was measuring the latency of one of TinyEngine's convolutional kernels (convolve_s8_kernel3_stride1_pad1) versus CMSIS-NN's fast convolutional kernel (arm_convolve_HWC_q7_fast). The TinyEngine kernel had a latency of approx. 200,000 cycles, while the CMSIS kernel had a latency of approx. 130,000 cycles.

  • Is the additional overhead due to the per-channel requantization of TinyEngine? Could you explain why per-channel requantization is needed in the kernel?
  • Have you tried benchmarking the latencies of the frameworks per kernel? If so, could you share the results?

Thank you in advance.

inference demo error

[error screenshot]
When I followed the build steps, these errors occurred. Please help me solve them. Thank you very much!

visual wake word (VWW) model end to end workflow?

Hi,

Where do I find the Jupyter notebook for the example visual wake words (VWW) model in the tutorial folder? I want to check its model architecture and how it is optimized for the MCU (pruning, quantization, and model conversion, i.e., how the C/C++ code is generated from the model) so I can follow the end-to-end workflow.

camera do not work

Hello, your work is great, but when I followed your tutorial to reproduce the VWW demo, with the same software, the same development board model, and the same camera model, the camera does not work and the screen is completely black.

Recipe for target 'firmware' failed

Hello,

Thanks for your job, everything is amazing. Following is what I am trying to face:

I have done everything correctly, and it worked fine on my OpenMV H7 board. However, when I try to run it on my OpenMV H7 Plus board, I get an error at the last step of building the firmware.

I changed TARGET=OPENMV4 to TARGET=OPENMV4P both while building the source and while recompiling it. In the source-building part, everything works correctly, but I get the error below at the last step, while recompiling:

 make[1]: Leaving directory '/home/senceryucel/Desktop/tinyengine/examples/openmv_person_detection/openmv/src/omv'
/usr/local/arm-none-eabi/bin/../lib/gcc/arm-none-eabi/10.2.1/../../../../arm-none-eabi/bin/ld: /home/senceryucel/Desktop/tinyengine/examples/openmv_person_detection/openmv/src/build/bin/firmware.elf section `.text' will not fit in region `FLASH_TEXT'
/usr/local/arm-none-eabi/bin/../lib/gcc/arm-none-eabi/10.2.1/../../../../arm-none-eabi/bin/ld: /home/senceryucel/Desktop/tinyengine/examples/openmv_person_detection/openmv/src/build/bin/firmware.elf section `.bss' will not fit in region `SRAM1'
/usr/local/arm-none-eabi/bin/../lib/gcc/arm-none-eabi/10.2.1/../../../../arm-none-eabi/bin/ld: section .dma_memory VMA [0000000030040000,0000000030043bff] overlaps section .bss VMA [0000000030000adc,0000000030042c8b]
/usr/local/arm-none-eabi/bin/../lib/gcc/arm-none-eabi/10.2.1/../../../../arm-none-eabi/bin/ld: section ._heap VMA [0000000030042c8c,000000003007ec8b] overlaps section .dma_memory VMA [0000000030040000,0000000030043bff]
/usr/local/arm-none-eabi/bin/../lib/gcc/arm-none-eabi/10.2.1/../../../../arm-none-eabi/bin/ld: section .d2_dma_memory VMA [0000000030043c00,0000000030047bff] overlaps section ._heap VMA [0000000030042c8c,000000003007ec8b]
/usr/local/arm-none-eabi/bin/../lib/gcc/arm-none-eabi/10.2.1/../../../../arm-none-eabi/bin/ld: region `SRAM1' overflowed by 257164 bytes
/usr/local/arm-none-eabi/bin/../lib/gcc/arm-none-eabi/10.2.1/../../../../arm-none-eabi/bin/ld: region `FLASH_TEXT' overflowed by 168120 bytes
collect2: error: ld returned 1 exit status
omv/ports/stm32/omv_portconfig.mk:649: recipe for target 'firmware' failed
make: *** [firmware] Error 1
make: Leaving directory '/home/senceryucel/Desktop/tinyengine/examples/openmv_person_detection/openmv/src'

What might be the cause of this? Thanks a lot in advance.

simple convolution outputs not matching with hand calculation and cmsis-nn simp_conv function outputs

Hi @meenchen @RaymondWang0,
Context: I am trying to run a simple convolution layer (using the convolve_s8_kernel3_stride1_pad1 function) and to compare the outputs with hand calculations and with the CMSIS-NN simple convolution outputs, as well as to compare the timing results for the layer.

Issue: The outputs of the convolve_s8_kernel3_stride1_pad1 function do not match the hand-calculated outputs; please refer to the attached code snippet below.

FYI: the hand-calculated outputs matched the CMSIS-NN simple convolution function.

My question is: am I calling the correct simple convolution function, or is there another simple convolution function for testing basic convolution with the mcunet kernels?

#include "arm_math.h"
#include "arm_nnfunctions.h"
#include "arm_nnsupportfunctions.h"
#include "img2col_element.h"
#include "tinyengine_function.h"
#include <stdio.h>

#define CONV_WT_M4 {0, -1, 1, -1, 0, -1, 0, -1, -1, 0, 1, 1, -1, 1, 1, 0, 1, -1, 1, -1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, -1, 1, 0, 0, 0, -1, -1, 1, -1, -1, -1, 1, 1, 1, 1, 1, -1, -1, -1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, -1, -1, -1, 0, 0, 0, 0, 0, 0}
#define CONV_BIAS_M4 {0, 0}

#define CONV_IN_DIM_M4 3
#define CONV_IN_CH_M4 4
#define CONV_KER_DIM_M4 3
#define CONV_PAD_M4 1
#define CONV_STRIDE_M4 1
#define CONV_OUT_CH_M4 2
#define CONV_OUT_DIM_M4 3
#define CONV_BIAS_LSHIFT_M4 0
#define CONV_OUT_RSHIFT_M4 0

const int8_t in_data[36] =
{
    1, 2, 0, 2, 1, 2, 1, 0, 1, 2, 2, 2, 2, 1, 0, 2, 2, 0, 2, 2, 0, 0, 1, 2,
    2, 0, 2, 0, 0, 1, 0, 0, 1, 0, 1, 0
};

/* expected (hand-calculated) outputs */
const int8_t out_data[18] =
{
    -2, -1, 4, -2, 4, -1, -1, 7, 1, -1, -5, -2, 3, 3, -2, -3, 1, 1
};

static const q7_t conv2_wt[CONV_IN_CH_M4 * CONV_KER_DIM_M4 * CONV_KER_DIM_M4 * CONV_OUT_CH_M4] = CONV_WT_M4;
static const q7_t conv2_bias[CONV_OUT_CH_M4] = CONV_BIAS_M4;
static const q7_t conv2_out_mul[CONV_OUT_CH_M4] = {1, 1};
static const q7_t conv2_out_shift[CONV_OUT_CH_M4] = {30, 30};

q7_t scratch_buffer_2[92160];
q15_t *buffer2 = (q15_t *) scratch_buffer_2;
q7_t output_data[CONV_OUT_DIM_M4 * CONV_OUT_DIM_M4 * CONV_OUT_CH_M4];

int success_or_not = !tinyengine_status;

int status = convolve_s8_kernel3_stride1_pad1(
    (const q7_t *) in_data, CONV_IN_DIM_M4, CONV_IN_DIM_M4, CONV_IN_CH_M4,
    conv2_wt, conv2_bias, conv2_out_shift, conv2_out_mul,
    0, 0, -128, 127, (q7_t *) output_data,
    CONV_OUT_DIM_M4, CONV_OUT_DIM_M4, CONV_OUT_CH_M4, buffer2, 0);

if (success_or_not != status)
{
    printf("Function call Failed\r\n");
}
else
{
    printf("Function call Passed\r\n");
}

Uint8 model

Thanks for the great work.
As the tutorial describes, the input and output are all of "int8" type.

Is a uint8 model supported, or should we just convert it to int8 instead?
Thanks

Torch->TFlite Converter?

Your example .tflite files in the /assets folder seem like they were generated by a custom tool. At least, their description field in the binary is "TinyNeuralNetwork Converted." instead of the standard "MLIR Converted." or "TOCO Converted." produced by TensorFlow's tf.lite.TFLiteConverter. Is this correct?

We're trying to convert our own Proxyless models but are having trouble doing so because of restricted op support in the code generator. Are there plans to open-source a torch->tflite converter?

In the original mcunet submodule (the old MCUNet repo), there's some TensorFlow 1.x code to convert a ProxylessNAS network to TFLite. Do you have updated code for this? And updated Proxyless models? Which ties in with... #5

Thanks!

Profiling method?

First of all, I love the project you are doing!
So my question is: how do you profile the memory (and maybe the storage) that the model consumes?

I really appreciate any help you can provide.
Rodo

Some APIs used in "GeneralMemoryScheduler.py"

Excuse me, I cannot trace the exact definitions of the following APIs, which are related to op:

layermem["MAC"] = op.get_macs()
layermem["activation"] = op.get_activation_size()
layermem["scale"] = op.get_scale_size()
layermem["runtime"] = op.get_sbuf_size()
layermem["kernel"] = op.get_kbuf_size()

Where can we find the definitions of get_macs()/get_activation_size()/get_scale_size()/get_sbuf_size()/get_kbuf_size()?
Could you please help comment on that?
Thanks

blank screen

Hi, I built the TinyEngine tutorial with zero errors. Then, after downloading to the board, the screen goes black immediately. No errors are mentioned, and no warnings either. Any thoughts or advice?

[image]

Code Generation Patch Based Inference Bug

Facing an issue related to PR #26:

Traceback (most recent call last):
  File "examples/vww.py", line 31, in <module>
    life_cycle_path="./lifecycle.png",
  File "/Users/amahmed/Desktop/UMass/Spring_2022/Thesis/tinyengine/code_generator/CodegenUtilTFlite.py", line 70, in GenerateSourceFilesFromTFlite
    code_generator.codeGeneration()
  File "/Users/amahmed/Desktop/UMass/Spring_2022/Thesis/tinyengine/code_generator/CodeGenerator.py", line 131, in codeGeneration
    self._genPatchInference()
  File "/Users/amahmed/Desktop/UMass/Spring_2022/Thesis/tinyengine/code_generator/CodeGenerator.py", line 182, in _genPatchInference
    last_patch_op_output_buffer_str_for_patch_inference = last_patch_op._getBufferstr(
AttributeError: 'NoneType' object has no attribute '_getBufferstr'

Deleting lines 179 to 181 in CodeGenerator.py removes the error.

No module named "cexample"

Hello, @meenchen

We were trying to integrate this beautiful engine into our H7 Plus board. We changed TARGET = OPENMV4 to OPENMV4P as needed. Everything seemed to be working fine until we tried to run the example person detection script.


import cexample
import sensor


sensor.reset()  # Reset and initialize the sensor.
sensor.set_pixformat(sensor.RGB565)  # Set pixel format to RGB565 (or GRAYSCALE)
sensor.set_framesize(sensor.HD)  # Set frame size to HD

while True:
    img = sensor.snapshot()  # Take a picture and return the image.
    ret = cexample.person_detection(img, 0.15)

Since we have no LCD screen, we modified the code a little to test it via the frame buffer. However, when we run it, it says there is no module named "cexample".

We looked around the repo for it but could not find anything that would help. Any help will be appreciated. Thanks a lot in advance!

vww.py error

Hello, I'm trying to run the example. However, I encounter this error while executing "python examples/vww.py":

(pytorch) user:~/바탕화면/tinyengine$ python examples/vww.py 
Deriving the memory schedule for 41 activation tensors.
100%|██████████████████████████████████████████████████████████████████████████| 41/41 [00:00<00:00, 185109.22it/s]
Traceback (most recent call last):
  File "/home/user/바탕화면/tinyengine/examples/vww.py", line 28, in <module>
    peakmem = GenerateSourceFilesFromTFlite(
  File "/home/user/바탕화면/tinyengine/code_generator/CodegenUtilTFlite.py", line 54, in GenerateSourceFilesFromTFlite
    memory_scheduler.allocateMemory()
  File "/home/user/바탕화면/tinyengine/code_generator/GeneralMemoryScheduler.py", line 190, in allocateMemory
    self.allocator.visualize(self.mem_visual_path)
  File "/home/user/바탕화면/tinyengine/code_generator/allocator/base_allocator.py", line 240, in visualize
    plt.savefig(path, dpi=FIGURE_CONFIG["DPI"])
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/pyplot.py", line 942, in savefig
    res = fig.savefig(*args, **kwargs)
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/figure.py", line 3272, in savefig
    self.canvas.print_figure(fname, **kwargs)
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/backend_bases.py", line 2338, in print_figure
    result = print_method(
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/backend_bases.py", line 2204, in <lambda>
    print_method = functools.wraps(meth)(lambda *args, **kwargs: meth(
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/_api/deprecation.py", line 410, in wrapper
    return func(*inner_args, **inner_kwargs)
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/backends/backend_agg.py", line 520, in print_png
    self._print_pil(filename_or_obj, "png", pil_kwargs, metadata)
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/backends/backend_agg.py", line 466, in _print_pil
    FigureCanvasAgg.draw(self)
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/backends/backend_agg.py", line 408, in draw
    self.figure.draw(self.renderer)
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/artist.py", line 74, in draw_wrapper
    result = draw(artist, renderer, *args, **kwargs)
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/artist.py", line 51, in draw_wrapper
    return draw(artist, renderer)
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/figure.py", line 3069, in draw
    mimage._draw_list_compositing_images(
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/image.py", line 131, in _draw_list_compositing_images
    a.draw(renderer)
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/artist.py", line 51, in draw_wrapper
    return draw(artist, renderer)
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/axes/_base.py", line 3099, in draw
    self.patch.draw(renderer)
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/artist.py", line 51, in draw_wrapper
    return draw(artist, renderer)
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/patches.py", line 589, in draw
    self._draw_paths_with_artist_properties(
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/patches.py", line 574, in _draw_paths_with_artist_properties
    renderer.draw_path(gc, *draw_path_args)
  File "/home/user/anaconda3/envs/pytorch/lib/python3.9/site-packages/matplotlib/backends/backend_agg.py", line 149, in draw_path
    self._renderer.draw_path(gc, path, transform, rgbFace)
TypeError: must be real number, not str
