tensorflow / tflite-micro Goto Github PK

Infrastructure to enable deployment of ML models to low-power resource-constrained embedded targets (including microcontrollers and digital signal processors).

License: Apache License 2.0

Starlark 4.15% C 3.49% C++ 74.01% Makefile 1.42% Jupyter Notebook 0.27% Shell 2.02% Python 13.44% Pascal 0.03% Mako 0.21% NASL 0.05% Dockerfile 0.01% Assembly 0.14% BitBake 0.75%

TensorFlow Lite for Microcontrollers

TensorFlow Lite for Microcontrollers is a port of TensorFlow Lite designed to run machine learning models on DSPs, microcontrollers and other devices with limited memory.

Additional Links:

Build Status

GitHub Status

Official Builds

Build Type	Status
CI (Linux)
Code Sync

Community Supported TFLM Examples

This table captures platforms that TFLM has been ported to. Please see New Platform Support for additional documentation.

Platform	Status
Arduino
Coral Dev Board Micro	TFLM + EdgeTPU Examples for Coral Dev Board Micro
Espressif Systems Dev Boards
Renesas Boards	TFLM Examples for Renesas Boards
Silicon Labs Dev Kits	TFLM Examples for Silicon Labs Dev Kits
Sparkfun Edge
Texas Instruments Dev Boards

Community Supported Kernels and Unit Tests

This is a list of targets that have optimized kernel implementations and/or run the TFLM unit tests using software emulation or instruction set simulators.

Build Type	Status
Cortex-M
Hexagon
RISC-V
Xtensa
Generate Integration Test

Contributing

See our contribution documentation.

Getting Help

A Github issue should be the primary method of getting in touch with the TensorFlow Lite Micro (TFLM) team.

The following resources may also be useful:

SIG Micro email group and monthly meetings.
SIG Micro gitter chat room.
For questions that are not specific to TFLM, please consult the broader TensorFlow project, e.g.:
- Create a topic on the TensorFlow Discourse forum
- Send an email to the TensorFlow Lite mailing list
- Create a TensorFlow issue
- Create a Model Optimization Toolkit issue

Additional Documentation

RFCs

tflite-micro's People

Contributors

Stargazers

Watchers

Forkers

advaitjain jwithers 00mjk philippvk saiprasad16 cflin-cjcu rsun-bdti sjoshi768 foss-for-synopsys-dwc-arc-processors antmicro mansnils bhanuprakashbv ddavis-2015 yorange1 jenselofsson freddan80 yanminge mike-dooley lupyuen andrewxcav marconi1964 njeffrie sungsoosmess codedcclxxvii patriklaurell shlmregev rockyrhodes driedler cad-audio nyadla-sys alanvgreen henriwoodcock yair-ehrenwald kimys1324 shamoo100 arjunk23 mcedrdiego tqrahman binarman sjs2109 wangguangyuan deqiangc xiuxiusen zephyrproject-rtos odinshen-git odincodeshen xmos ray-go mocleiri linxiaohon chuweichen linux-on-ibm-z v15a1 s10143806h dh-dao silabs-ludvigs kh-quan thefakhreddin mmajchrzycki edgeimpulse laurenmurphyx64 yuchengwang skyofwinter xing-w leerobot007 annietllnd kennykang7012 vikramdattu vjsubas jinxzznvg jixinglong placrosse elm8116 meghnanatraj huanmei9 sensiml verma-rishabh notplus hd-faria fabiangroeger96 ubinos yaakovkarsenty sz0101 yaqianliu-617 jackagoogle gpuranik xiaochutan123l pyjhzwh joeupwu lluwang lemonschwein rellikjaeger stjordanis otvaket krzim evolation sunitroy2703 balyshevartem azurecloudmonk olinux99

tflite-micro's Issues

Casting int32_t* to int* in call to tflite::PopulateConvolutionQuantizationParams

All calls to tflite::PopulateConvolutionQuantizationParams seem to cast the per_channel_output_shift param from int32_t* to int*.

For example https://github.com/tensorflow/tflite-micro/blob/main/tensorflow/lite/micro/kernels/depthwise_conv_common.cc#L116

This is not causing problems currently, but I expect that it may be a source of future bugs.

Broken project generation for ARC

System information

Host OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Any
TensorFlow installed from (source or binary): Source
Tensorflow version (commit SHA if source): Latest from main branch tflite-micro
Target platform (e.g. Arm Mbed OS, Arduino Nano 33 etc.): ARC

Hi, I've noticed some changes in micro Makefile, which caused troubles in project generation for ARC platform.
These lines: link affects the name of the folder, where the project will be generated.
Sometime ago, these lines were here: link under include of target specific makefile: link. So, this change just broke up project generation for ARC, as we used TARGET value to add the name of tcf file which was used to build embARC MLI library and will be used to run application on nSIM.

So, can I fix it moving these lines back where they were (under include of specific make file)?

Arduino library creation external to the tflite-micro repository

@tensorflow/micro

Generate the Arduino library using the Python project generation interface (https://github.com/tensorflow/tflite-micro/blob/main/tensorflow/lite/micro/tools/project_generation/create_tflm_tree.py) and move all the Arduino specific transformations out of the tflite-micro repository and into https://github.com/tensorflow/tflite-micro-arduino-examples

Internal Bug: http://b/193823889

Only run Vela if needed for Ethos-U example

Currently when running FVP target it always Vela convert the int8 model regardless when running with OPTIMIZED_KERNEL_DIR=ethos_u or not. This should only happen with OPTIMIZED_KERNEL_DIR=ethos_u.

MLI Library 2.0 integration for ARC target.

embARC MLI Library version 2.0 for VPX processor has been released for early access (2.0_EA) and we plan to support MLI 2.0 kernels in the TFLM ARC-specific backend. We plan to maintain MLI 1.1 backwards compatibility and make MLI 2.0 as an experimental feature enabled at compile-time.

What is planned:

Update ARC-specific TFLM backend to support MLI Library version 2.0.
Provide offline adaptation tool to support the MLI 2.0 data layout in TFLM.
Keep MLI Library version 1.1 as primary supported version, with option to enable version 2.0 support. Eventually we'll transition completely to MLI 2.0.
Update all ARC specific READMEs according to these updates.

This update will be partitioned into several PRs to keep all updates clear and tracked.

Add 'Error' prefix to TF_LITE_REPORT_ERROR

@tensorflow/micro

System information

Host OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
TensorFlow installed from (source or binary): N/A
Tensorflow version (commit SHA if source): 9578a394a0ebfa5f77f3e3b87f7b7fa266c97103
Target platform (e.g. Arm Mbed OS, Arduino Nano 33 etc.): Any

Describe the problem
It would be great if macro TF_LITE_REPORT_ERROR() prefixed the output with 'Error' or similar. That makes it easier for CI systems to parse for new issues. In a similar fashion, a macro TF_LITE_REPORT_WARNING() would be nice for warning. This could be a good location for the prefix:

#define TF_LITE_REPORT_ERROR(reporter, ...)                             \
  do {                                                                  \
    static_cast<tflite::ErrorReporter*>(reporter)->Report(__VA_ARGS__); \
  } while (false)
#else  // TF_LITE_STRIP_ERROR_STRINGS
#define TF_LITE_REPORT_ERROR(reporter, ...)
#endif  // TF_LITE_STRIP_ERROR_STRINGS

Let me know if there's a better way to achieve this.

Please provide the exact sequence of commands/steps when you ran into the problem
Any error print.

TFLM project generation support (version 2)

Similar to tensorflow/tensorflow#45086 and tensorflow/tensorflow#44909, but broader in scope.

Some high level goals for project generation v2 are:

Having the bulk of the project generation logic moved out of the TFLM Makefiles.
Having TFLM support a Python interface that can be used for project generation
Moving the specific logic for project generation out of the Tensorflow repository and into platform/IDE specific github repos.

This work is currently at the early prototyping stage and this github issue will be used to track progress.

Xtensa optimized kernels for AveragePool and MaxPool

#139 is adding Xtensa optimized implementations for AveragePool and MaxPool.

Prior to being able to merge the Xtensa implementations, we will need to refactor the reference implementation (and the CMSIS optimized implementation) to better share code and reduce the maintenance overhead.

This issu will track the work needed for such a refactoring as well as getting the Xtensa kernels merged.

Get Bent Gaming

@GetBentGaming

Some problems when I am using the vela compiler (Input(s) and Output tensors must not be dynamic)

@tensorflow/micro

System information

Host OS Platform and Distribution (e.g., Linux Ubuntu 16.04):Windows 10
TensorFlow installed from (source or binary):anaconda installation
Tensorflow version (commit SHA if source):2.3.0
Target platform (e.g. Arm Mbed OS, Arduino Nano 33 etc.):ethos-U55(I expect to use this)

Describe the problem

I want to use ethos-U55, so I need to use the vela compiler.

But the following warning occurred when I used

Warning: PACK 'sequential/reshape/Reshape/shape' is not supported on the NPU. Placing on CPU instead
 - Input(s) and Output tensors must not be dynamic
   Op has dynamic tensor(s): sequential/reshape/strided_slice2
Warning: STRIDED_SLICE 'sequential/reshape/strided_slice2' is not supported on the NPU. Placing on CPU instead
 - Input(s) and Output tensors must not be dynamic
   Op has dynamic tensor(s): sequential/reshape/strided_slice2

Please provide the exact sequence of commands/steps when you ran into the problem

First of all, I'm not sure if it should be posted in this category.

If there is an error, please forgive me.

This is the warning I encountered when using vela compiler

The OP in PACK and STRIDED_SLICE has the warning "Input(s) and Output tensors must not be dynamic"

I'm not sure if this is a coding problem, or if this op will generate these kinds of tensors.

If it is a coding problem, how should I change it?

tflite model:
https://drive.google.com/drive/folders/1x7wA4G2qr4m1wmAxnguCh2uQSGYtl-KL?usp=sharing

The following is the code of my model:
Refer to this website https://www.tensorflow.org/lite/performance/post_training_integer_quant

model = tf.keras.Sequential([
        tf.keras.layers.InputLayer(input_shape=(28, 28)),
        tf.keras.layers.Reshape(target_shape=(28,28,1),input_shape=(28,28,1)),
        tf.keras.layers.Conv2D(filters=12, kernel_size=(3,3),activation='relu'),
        tf.keras.layers.MaxPool2D(pool_size=(2, 2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10)
])

shufflenet channel shuffle error

I use shufflenetv2, the channel_shuffle unit is:

_, height, width, channels = inputs.shape.as_list()  
channels_per_group = channels  // groups
# reshape  
x = layers.Reshape((height, width], groups, channels_per_group))(inputs)  
# transpose
x = layers.Permute((1,2,4,3))(x)
# flatten
x = layers.Reshape((height, width, channels ))(x)

But this operation:

_, height, width, channels = inputs.shape.as_list()

in tflite model is regard as Shape and StridedSlice operation, StridedSlice only support FP32/INT8/UINT8 output datatype, so this node will failed when invoke.

How can i fix it?

bazel asan build showing linker warning

This command:

CC=clang bazel build tensorflow/lite/micro/benchmarks:keyword_benchmark --config=asan

Gives the following warnings:

INFO: Analyzed target //tensorflow/lite/micro/benchmarks:keyword_benchmark (27 packages loaded, 336 targets configured).
INFO: Found 1 target...
INFO: From Linking tensorflow/lite/micro/benchmarks/keyword_benchmark:
/usr/bin/ld.gold: warning: Cannot export local symbol '__asan_extra_spill_area'
/usr/bin/ld.gold: warning: Cannot export local symbol '__lsan_current_stage'

Corresponding internal bug: http://b/193183334

Remove uint8 support in CMSIS-NN kernels

Uint8 support is being removed in TFLM. CMSIS-NN kernels still have some uint8 support and it should be removed.

RNN support for Tensorflow Lite Micro

System information

TensorFlow version: v1.12.1-23779-g96c5c8a 2.2.0-dev20200202
Are you willing to contribute it: No

Describe the feature and the current behavior/state.
Support of RNNs is currently missing in Tensorflow Lite Micro. I've been testing with an RNN with GRU cells. Simple code (from here):

import tensorflow as tf

model = tf.keras.Sequential()

model.add(tf.keras.layers.Input(shape=(1, 1,)))

cell = tf.keras.layers.GRUCell(10)

model.add(tf.keras.layers.RNN(cell, unroll=True))

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.experimental_new_converter = True

tflite_model = converter.convert()

# for testing if operations are implemented by Tensorflow Lite
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
interpreter.invoke()

The current situation is (missing ops):

experimental_new_converter=True, unroll=True: builtin ops SHAPE, TRANSPOSE, FILL, SPLIT_V, SUB, TANH
experimental_new_converter=True, unroll=False: 3 subgraphs (only 1 subgraph supported)
experimental_new_converter=False, unroll=True: builtin ops SPLIT_V, SUB, TANH
experimental_new_converter=False, unroll=False: custom ops TensorListFromTensor, TensorListReserve, TensorListStack, While

My questions are:

Are there any tangible plans to implement the missing ops for RNNs for Tensorflow Lite Micro and if so, which one of the four variants will be the way to go?
Is it currently possible to somehow make the Toco converter for the non-unrolled case (i.e. experimental_new_converter=False, unroll=False) to convert the RNN just as a single (unsupported) placeholder RNN op rather than splitting it up into the four (unsupported) operators TensorListFromTensor, TensorListReserve, TensorListStack, While?

Thank you.

Will this change the current api? How?
No

Who will benefit with this feature?
Everybody who needs RNNs with Tensorflow Lite Micro.

Any Other info.
None

Segmentation fault when running micro_features_generator_test

@tensorflow/micro

System information

Host OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
TensorFlow installed from (source or binary): source
Tensorflow version (commit SHA if source): b4a116f
Target platform (e.g. Arm Mbed OS, Arduino Nano 33 etc.): x86

Describe the problem
Running make -f tensorflow/lite/micro/tools/make/Makefile test results in segmentation fault in micro_features_generator_test.

Testing TestMicroFeaturesGeneratorYes
tensorflow/lite/micro/examples/micro_speech/Makefile.inc:259: recipe for target 'test_micro_features_generator_test' failed
make: *** [test_micro_features_generator_test] Segmentation fault (core dumped)

Please provide the exact sequence of commands/steps when you ran into the problem

$ make -f tensorflow/lite/micro/tools/make/Makefile BUILD_TYPE=debug test_micro_features_generator_test

Running the binary in gdb produces the following stacktrace:

$ gdb --quiet tensorflow/lite/micro/tools/make/gen/linux_x86_64_debug/bin/micro_features_generator_test 
Reading symbols from tensorflow/lite/micro/tools/make/gen/linux_x86_64_debug/bin/micro_features_generator_test...done.
(gdb) r
Starting program: tensorflow/lite/micro/tools/make/gen/linux_x86_64_debug/bin/micro_features_generator_test 
Testing TestMicroFeaturesGeneratorYes

Program received signal SIGSEGV, Segmentation fault.
0x0000555555556da4 in CalculateCenterFrequencies (center_frequencies=0x5555557605e6 <fixed_pool+9382>, upper_frequency_limit=7500, lower_frequency_limit=<optimized out>, num_channels=41) at tensorflow/lite/experimental/microfrontend/lib/filterbank_util.c:49
49          center_frequencies[i] = mel_low + (mel_spacing * (i + 1));
(gdb) backtrace
#0  0x0000555555556da4 in CalculateCenterFrequencies (center_frequencies=0x5555557605e6 <fixed_pool+9382>, upper_frequency_limit=7500, lower_frequency_limit=<optimized out>, num_channels=41) at tensorflow/lite/experimental/microfrontend/lib/filterbank_util.c:49
#1  FilterbankPopulateState (config=config@entry=0x7fffffffd3f0, state=state@entry=0x55555575e0a8 <(anonymous namespace)::g_micro_features_state+104>, sample_rate=sample_rate@entry=16000, spectrum_size=257) at tensorflow/lite/experimental/microfrontend/lib/filterbank_util.c:97
#2  0x00005555555578b8 in FrontendPopulateState (config=config@entry=0x7fffffffd3e0, state=state@entry=0x55555575e040 <(anonymous namespace)::g_micro_features_state>, sample_rate=sample_rate@entry=16000) at tensorflow/lite/experimental/microfrontend/lib/frontend_util.c:45
#3  0x0000555555554f9b in InitializeMicroFeatures (error_reporter=0x7fffffffd450) at tensorflow/lite/micro/examples/micro_speech/micro_features/micro_features_generator.cc:53
#4  0x00005555555549b2 in main (argc=<optimized out>, argv=<optimized out>) at tensorflow/lite/micro/examples/micro_speech/micro_features/micro_features_generator_test.cc:34

The same problem also occurs for test_feature_provider_mock_test and test_feature_provider_test

micro_features_generator_test failing for the Xtensa target

The following command:

make -f tensorflow/lite/micro/tools/make/Makefile TARGET=xtensa TARGET_ARCH=fusion_f1 OPTIMIZED_KERNEL_DIR=xtensa XTENSA_CORE=F1_190305_swupgrade test_micro_features_generator_test -j8

Fails with:

Testing TestMicroFeaturesGeneratorYes
*WARNING* Unhandled user exception: LoadStoreAlignmentCause (0x000130da)( Xtensa ISS ) *WARNING* run exited with status 'virtual breakpoint'

The immediate workaround to get the Xtensa build to pass again will be to disable this failing test.

Sigmoid,tanh: TFL int16 reference code compared to float implementation is off by 3-4 bits

@advaitjain , @njeffrie, @petewarden, @nyadla-sys

Describe the problem

TFL int16 reference code compared to float implementation is off by 3-4 bits (int8 is off by 1 bit).
TFLM does not have int16 support yet.
Sigmoid shows maxdiff of 6 (3 bit).
Tanh shows maxdiff of 12 (4 bit).
Cadence’s hardware implementation is 1 bit off and that’s how it was discovered that TFLM implementation was off
A bug with test vectors will help the TFLM team figure out what needs to be done here

Source code / logs

input_file_sigmoid_Q12.txt
input_file_tanh_Q12.txt
Attached couple of files with input data which shows maximum difference. This input list is for Q12 (1 sign bit, 3 integer bits, 12 fractional bits). We also observed 3-4 bit difference for other Q format as well (Q11 and Q15).

Update ethosu driver entrypoint

@advaitjain This issue tracks work on updating the ethosu driver entrypoint

Project generation fail, missing gemmlowp

System information
Ubuntu 20.04

I want to use Project generation in TFLM to output a tree containing only the sources and headers needed to use TFLM for a specific configuration. And my steps are as follows
STEP1:
git clone https://github.com/tensorflow/tensorflow.git
STEP2:
cd tensorflow
STEP3:
python3 tensorflow/lite/micro/tools/project_generation/create_tflm_tree.py --makefile_options="OPTIMIZED_KERNEL_DIR=cmsis_nn TARGET_ARCH=cortex-m4" "project"

The execution of STEP3 fails, and the output is as follows:

Traceback (most recent call last):
  File "tensorflow/lite/micro/tools/project_generation/create_tflm_tree.py", line 193, in <module>
    _copy(src_files, dest_files)
  File "tensorflow/lite/micro/tools/project_generation/create_tflm_tree.py", line 105, in _copy
    shutil.copy(src, dst)
  File "/usr/lib/python3.8/shutil.py", line 415, in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/usr/lib/python3.8/shutil.py", line 261, in copyfile
    with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
FileNotFoundError: [Errno 2] No such file or directory: 'tensorflow/lite/micro/tools/make/downloads/gemmlowp/fixedpoint/fixedpoint.h'

And I found that this problem is caused by not download third_party code，the list_third_party_* rules will output the path of some undownloaded third_party files, causing error in project_generation copy function.

I think we can download third_party code before executing the list_third_party_* rules to avoid this problem.

Remove unit8 support from kernels prior to adding new optimized implementations

We would like TFLM to drop support for asymmetric quantization and uint8 support (see https://github.com/tensorflow/tensorflow/issues/44912).

As a result, for new optimized implementations (such as tensorflow/tensorflow#46500) we need to ifdef the uint8 test cases in a way that is not scalable:
https://github.com/tensorflow/tensorflow/pull/46500/files?file-filters%5B%5D=.cc#diff-8fb35117d2592a6f74def12c8b12e4d19cec0e9a05f2cd64fcceb84c22d93faaR336-R343

Instead of such ifdefs, we should remove uint8 support from the existing kernel implementations prior to adding in new optimized kernels.

More generally, as we start adding int16 support, it is possible that while the reference kernel has support for int16, some of the optimized implementations will not. In order to keep the tests passing, we will need a more scalable way to filter out specific test cases.

A (potentially sufficient) solution would be to ifdef based on supported types (e.g. DISABLE_INT16). However, this can be a slippery path since there are other 'features' that the optimized kernels may not support, so we will want to limit such ifdefs if we do go down this path.

Already, it is the case that the xtensa kernels only support int8 and not float. As we have other such examples (potentially with the work that the folks at CEVA are currently doing), we should figure out what we want to do here.

Port Int16 support to MEAN kernel

This issue tracks my work on porting Int16 support to the MEAN kernel in micro. @advaitjain

It will be delivered as one PR with two commits (one for porting kernel, one for tests).

Question about image conversion in person detection example

Hi,

I want to use my own image to test person detection so I need to convert my images to a (96 * 96) C array. I follow the instructions below, but the byte array I get is too long.

tflite-micro/tensorflow/lite/micro/examples/person_detection/person_image_data.h

Lines 17 to 20 in f583f92

 // Convert original image to simpler format: 

 // convert -resize 96x96\! person.PNG person.bmp3 

 // Skip the 54 byte bmp3 header and add the reset of the bytes to a C array: 

 // xxd -s 54 -i /tmp/person.bmp3 > /tmp/person.cc

For instance, I test it with my Github avatar:

Here are the instructions I use to generate C array file

convert -colorspace Gray avatar.png avatar_g.png (to convert it to grayscale img)
convert -resize 96x96! avatar_g.png avatar.bmp3
xxd -s 54 -i avatar.bmp3 > avatar.cc

The result C array is of length 27648 (96 * 96 * 3) I do not figure out how to convert it to single-channel since the expected input is (96 * 96). I tried but failed. Could you help take a look at this and update the instructions? Thanks in advance!

keil projects build unexpected big executable

By some investigation, keil projects build will link about 83KB unwanted C++ library objects, including the largest locale.cpp.o (~55KB).
I've already found the inclusion of below files will cause keil (ARM AC6) to link them:

Which are included or indirectly included by kernel_utils.cc.

It seems is not required, but is required for succesful build.

Allow pull requests created from forks of the tflite-micro repo to also change labels

The first attempt to have a github workflow remove the ci:run label on completion of the CI ran into permission errors when the PR was created from a fork of the repo.

https://github.com/tensorflow/tflite-micro/runs/2385859520?check_suite_focus=true

Some notes on this topic:
The security context for the GITHUB_TOKEN changes depending if the request is coming from a forked repo. Basically it loses all write permissions. https://docs.github.com/en/actions/reference/authentication-in-a-workflow#permissions-for-the-github_token

Xtensa optimized kernels for add and mul

As noted in #115 (review) we will need some refactoring prior to adding in optimized implementations for add and mul for Xtensa.

This will include refactoring the reference and cmsis_nn kernels and then adding the Xtensa kernels with optimized calls for Hifi5 and reference fallback for all other target architectures.

Broken links to images in the RFC for pre-allocated tensors

The RFC for pre-allocated tensors (https://github.com/tensorflow/tflite-micro/blob/main/tensorflow/lite/micro/docs/rfc/001_preallocated_tensors.md) have broken images links and they need to be mended.

Corstone300 (FVP) build is not completing correctly

Looks like some recent changes have resulted in the FVP run hanging and then getting timed out after 6 hours:

https://github.com/tensorflow/tflite-micro/runs/2845494703?check_suite_focus=true

micro: port op BATCH_MATMUL from lite

@tensorflow/micro

This issue tracks my work porting operator BATCH_MATMUL from lite to micro.

The port will be submitted in a number of PRs. Here's a rough flight plan per @advaitjain and @petewarden:

PR 1: Extract the code for parsing the op from a flatbuffer out of ParseOpDataTfLite in tensorflow/lite/core/api/flatbuffer_conversions.cc into a standalone function that can be called from micro's op resolver
PR 2: Extract the reference implementation out of tensorflow/lite/kernels/internal/reference/reference_ops.h into its own header which can be included without dragging in reference_ops.h's dependences
PR 3: Copy operator from lite to micro making minimal changes and not including in the build
PR 4: Delete extra code from the micro copy of the operator
PR 5: Port micro copy of operator as necessary and add a corresponding test

recorded_allocation parameters for quantized tensors are incorrect for BE machines

System information

Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
TensorFlow installed from (source or binary): source
TensorFlow version (use command below): 2.3.1
Python version: 3.6.9
Bazel version (if compiling from source): 3.4.1
GCC/Compiler version (if compiling from source): Ubuntu 7.5.0-3ubuntu1~18.04
CUDA/cuDNN version: N/A
GPU model and memory: N/A

Describe the current behavior
The quantized tensor count and bytes are not recorded properly on s390x.
While running //tensorflow/lite/micro:recording_micro_allocator_test on s390x, test case fails with following error:

tensorflow/lite/micro/recording_micro_allocator_test_binary: FAIL - '~~~ALL TESTS PASSED~~~' not found in logs.
Testing TestRecordsTfLiteTensorArrayData
Testing TestRecordsTensorArrayQuantizationData
recorded_allocation.count == quantized_tensor_count * 2 failed at tensorflow/lite/micro/recording_micro_allocator_test.cc:110 (51 vs 24)
recorded_allocation.requested_bytes == expected_requested_bytes failed at tensorflow/lite/micro/recording_micro_allocator_test.cc:112 (1436 vs 752)
Testing TestRecordsNodeAndRegistrationArrayData
2/3 tests passed

On debugging, it seems that RecordingSimpleMemoryAllocator::AllocateFromTail function gets called twice the times it's being called on x86. AllocateFromTail function updates used_bytes_, requested_bytes_ and alloc_count_ which are used to update recorded_allocation parameters in RecordingMicroAllocator::RecordAllocationUsage function.
For x86, AllocateFromTail is only called when extracting tensor data from serialized tensor buffers but on s390x, FlatBufferVectorToTfLiteTypeArray also calls AllocateFromTail to convert endianness of flatbuffer array in case of BE systems. This causes the data in requested_bytes_ and alloc_count_ private variables of class RecordingSimpleMemoryAllocator to be incorrect.

One way to fix this could to be take a global variable, we can set it when calling AllocatefromTail to convert endianness. If this variable is set, then we do not update recorded_allocation parameters. Can you suggest a better way to approach this problem?

Describe the expected behavior
The RecordingSimpleMemoryAllocator should record recorded_allocation parameters correctly and the test case should pass.

Standalone code to reproduce the issue

bazel --host_jvm_args="-Xms1024m" --host_jvm_args="-Xmx2048m" test --host_javabase="@local_jdk//:jdk" --test_tag_filters=-gpu,-benchmark-test,-v1only,-no_oss,-oss_serial -k --test_timeout 300,450,1200,3600 --build_tests_only --test_output=errors --verbose_failures  -- //tensorflow/lite/micro:recording_micro_allocator_test

Other info / logs
Code in AllocateFromTail that is recording the count:

uint8_t* RecordingSimpleMemoryAllocator::AllocateFromTail(size_t size,
                                                          size_t alignment) {
  const uint8_t* previous_tail = GetTail();
  uint8_t* result = SimpleMemoryAllocator::AllocateFromTail(size, alignment);
  if (result != nullptr) {
    used_bytes_ += previous_tail - GetTail();
    requested_bytes_ += size;
    alloc_count_++;
  }
  return result;
}

The code that is calling AllocateFromTail more than it should is in FlatBufferVectorToTfLiteTypeArray which being called from FlatBufferVectorToTfLiteTypeArray:

 if (FLATBUFFERS_LITTLEENDIAN) {
    // On little-endian machines, TfLite*Array happens to have the same memory
    // layout as flatbuffers:Vector<kFlatBufferVectorType>, so we can
    // reinterpret_cast the flatbuffer vector and avoid a copy and malloc.
    *result = const_cast<kTfLiteArrayType*>(
        reinterpret_cast<const kTfLiteArrayType*>(flatbuffer_array));
  } else {
    // Big-endian architecture can not use the same memory layout as
    // flatbuffers::Vector<kFlatBufferVectorType>. Allocate from the tail and
    // copy values from the flatbuffer into the newly allocated chunk.
    kTfLiteArrayType* array =
        reinterpret_cast<kTfLiteArrayType*>(allocator->AllocateFromTail(
            TfLiteIntArrayGetSizeInBytes(flatbuffer_array->Length()),
            alignof(kTfLiteArrayType)));

Building microlite fails due to missing include fixedpoint.h

@tensorflow/micro

System information

Host OS Platform and Distribution: Ubuntu 18.04
TensorFlow installed from (source or binary): source
Tensorflow version (commit SHA if source): aaa88b7
Target platform (e.g. Arm Mbed OS, Arduino Nano 33 etc.): x86

Describe the problem
Running
$ make -f tensorflow/lite/micro/tools/make/Makefile microlite

results in the error:

In file included from tensorflow/lite/micro/kernels/activations.cc:18:0:
./tensorflow/lite/kernels/internal/common.h:26:10: fatal error: fixedpoint/fixedpoint.h: No such file or directory
 #include "fixedpoint/fixedpoint.h"

The issue seems to be that gemmlowp, which contains fixedpoint.h, isn't being downloaded:

$ ls tensorflow/lite/micro/tools/make/downloads/
flatbuffers/
person_model_int8/
pigweed/

A workaround to this issue is to run this command before trying to build microlite:

$ make -f tensorflow/lite/micro/tools/make/Makefile third_party_downloads
$ ls tensorflow/lite/micro/tools/make/downloads/
flatbuffers/
gemmlowp/
kissfft/
person_model_grayscale/
person_model_int8/
pigweed/
ruy/

Please provide the exact sequence of commands/steps when you ran into the problem

$ git clone https://github.com/tensorflow/tflite-micro.git
$ cd tflite-micro/
$ make -f tensorflow/lite/micro/tools/make/Makefile microlite

Label_image demo always gives same outputs

Hi all !

While testing our models with basic label_image example, we faced with strange results.

We tried label_image with 2 different YOLO models with their corresponding label.txt. However regardless the input images and labels, the test output is always in the same order as [2 3 0 1]. Finally we also gave the Grace Hopper image from official source and still we faced with same issue. You can see the outputs in the image below.

Normally our models are working smoothly on PC. But, it is important for us to test them in our iMX6 device with this basic sample. Do you have any idea or solution about this problem ?
Thank you in advance.

Build TFLite Micro for riscv32_mcu

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 20.04
TensorFlow installed from (source or binary): N/A
TensorFlow version: Tensorflow 2.4.2
Python version: 3.8.5
Installed using virtualenv? pip? conda?: N/A
Bazel version (if compiling from source): N/A
GCC/Compiler version (if compiling from source): gcc 9.3.0, riscv64-unknown-elf-gcc (GCC) 10.2.0
CUDA/cuDNN version: N/A
GPU model and memory: N/A
Target platform (e.g. Arm Mbed OS, Arduino Nano 33 etc.): RISC-V

Describe the problem
Cannot build TFLite Micro for RISC-V target (TARGET=riscv32_mcu). The error message is shown at the bottom.
The command below doesn't work since the file riscv32_mcu_makefile.inc doesn't exist.
make -f tensorflow/lite/micro/tools/make/Makefile TARGET=riscv32_mcu test_hello_world_test

Does anyone successfully build the hello_world example for RISCV (riscv32_mcu)?
What are the exact commands and steps?

Provide the exact sequence of commands / steps that you executed before running into the problem

git clone https://github.com/tensorflow/tflite-micro.git
cd tflite-micro/
vi tensorflow/lite/micro/tools/make/targets/mcu_riscv_makefile.inc
change ifeq ($(TARGET), riscv32_mcu) to ifeq ($(TARGET), mcu_riscv)
make -f tensorflow/lite/micro/tools/make/Makefile TARGET=riscv32_mcu hello_world

Any other info / logs
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

$ make -f tensorflow/lite/micro/tools/make/Makefile TARGET=mcu_riscv TARGET_ARCH=riscv32_mcu test_hello_world_test
tensorflow/lite/micro/tools/make/downloads/flatbuffers already exists, skipping the download.
tensorflow/lite/micro/tools/make/downloads/pigweed already exists, skipping the download.
tensorflow/lite/micro/tools/make/downloads/person_model_int8 already exists, skipping the download.
riscv64-unknown-elf-g++ -std=c++11 -fno-rtti -fno-exceptions -fno-threadsafe-statics -Werror -fno-unwind-tables -ffunction-sections -fdata-sections -fmessage-length=0 -DTF_LITE_STATIC_MEMORY -DTF_LITE_DISABLE_X86_NEON -Wsign-compare -Wdouble-promotion -Wshadow -Wunused-variable -Wmissing-field-initializers -Wunused-function -Wswitch -Wvla -Wall -Wextra -Wstrict-aliasing -Wno-unused-parameter -DMCU_RISCV -march=rv32imac -mabi=ilp32 -mcmodel=medany -mexplicit-relocs -fno-builtin-printf -fno-exceptions -DTF_LITE_MCU_DEBUG_LOG -DTF_LITE_USE_GLOBAL_CMATH_FUNCTIONS -fno-unwind-tables -ffunction-sections -fdata-sections -funsigned-char -Wvla -Wall -Wextra -Wsign-compare -Wdouble-promotion -Wshadow -Wunused-variable -Wmissing-field-initializers -Wno-unused-parameter -Wno-write-strings -Wunused-function -fno-delete-null-pointer-checks -fno-threadsafe-statics -fomit-frame-pointer -fno-use-cxa-atexit -Os -fpermissive -fno-rtti --std=gnu++11 -Os -I. -Itensorflow/lite/micro/tools/make/downloads/gemmlowp -Itensorflow/lite/micro/tools/make/downloads/flatbuffers/include -Itensorflow/lite/micro/tools/make/downloads/ruy -Itensorflow/lite/micro/tools/make/downloads/sifive_fe310_lib/bsp/include -Itensorflow/lite/micro/tools/make/downloads/sifive_fe310_lib/bsp/drivers/ -Itensorflow/lite/micro/tools/make/downloads/sifive_fe310_lib/bsp/env -Itensorflow/lite/micro/tools/make/downloads/sifive_fe310_lib/bsp/env/freedom-e300-hifive1 -Itensorflow/lite/micro/tools/make/downloads/kissfft -c tensorflow/lite/micro/riscv32_mcu/debug_log.cc -o tensorflow/lite/micro/tools/make/gen/mcu_riscv_riscv32_mcu_default/obj/core/tensorflow/lite/micro/riscv32_mcu/debug_log.o
riscv64-unknown-elf-g++ -std=c++11 -fno-rtti -fno-exceptions -fno-threadsafe-statics -Werror -fno-unwind-tables -ffunction-sections -fdata-sections -fmessage-length=0 -DTF_LITE_STATIC_MEMORY -DTF_LITE_DISABLE_X86_NEON -Wsign-compare -Wdouble-promotion -Wshadow -Wunused-variable -Wmissing-field-initializers -Wunused-function -Wswitch -Wvla -Wall -Wextra -Wstrict-aliasing -Wno-unused-parameter -DMCU_RISCV -march=rv32imac -mabi=ilp32 -mcmodel=medany -mexplicit-relocs -fno-builtin-printf -fno-exceptions -DTF_LITE_MCU_DEBUG_LOG -DTF_LITE_USE_GLOBAL_CMATH_FUNCTIONS -fno-unwind-tables -ffunction-sections -fdata-sections -funsigned-char -Wvla -Wall -Wextra -Wsign-compare -Wdouble-promotion -Wshadow -Wunused-variable -Wmissing-field-initializers -Wno-unused-parameter -Wno-write-strings -Wunused-function -fno-delete-null-pointer-checks -fno-threadsafe-statics -fomit-frame-pointer -fno-use-cxa-atexit -Os -fpermissive -fno-rtti --std=gnu++11 -Os -I. -Itensorflow/lite/micro/tools/make/downloads/gemmlowp -Itensorflow/lite/micro/tools/make/downloads/flatbuffers/include -Itensorflow/lite/micro/tools/make/downloads/ruy -Itensorflow/lite/micro/tools/make/downloads/sifive_fe310_lib/bsp/include -Itensorflow/lite/micro/tools/make/downloads/sifive_fe310_lib/bsp/drivers/ -Itensorflow/lite/micro/tools/make/downloads/sifive_fe310_lib/bsp/env -Itensorflow/lite/micro/tools/make/downloads/sifive_fe310_lib/bsp/env/freedom-e300-hifive1 -Itensorflow/lite/micro/tools/make/downloads/kissfft -c tensorflow/lite/micro/kernels/activations.cc -o tensorflow/lite/micro/tools/make/gen/mcu_riscv_riscv32_mcu_default/obj/core/tensorflow/lite/micro/kernels/activations.o
In file included from tensorflow/lite/micro/kernels/activations.cc:18:
./tensorflow/lite/kernels/internal/common.h:26:10: fatal error: fixedpoint/fixedpoint.h: No such file or directory
26 | #include "fixedpoint/fixedpoint.h"
| ^~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make: *** [tensorflow/lite/micro/tools/make/Makefile:649: tensorflow/lite/micro/tools/make/gen/mcu_riscv_riscv32_mcu_default/obj/core/tensorflow/lite/micro/kernels/activations.o] Error 1

ValueError: Failed to parse the model: pybind11::init(): factory function returned nullptr.

System information

Host OS Platform and Distribution (e.g., Linux Ubuntu 16.04): windows 10
Tensorflow version (commit SHA if source): Tensorflow 2.5.0

Describe the problem :

After a long search in other forums for my problem, any solution fund won’t work for me. I hope that you can help me to overcome this problem.
the problem is while doing post-training integer quantization of a GRU model, it gives me the following error :
ValueError: Failed to parse the model: pybind11::init(): factory function returned nullptr.
I tried TensorFlow 2.4.1, tf-nightly, but no one works.
I think the problem is with my representative_dataset , but it works with CNN 1D.
My code:

`converter = tf.lite.TFLiteConverter.from_saved_model(GRUMODEL_TF)

converter.optimizations = [tf.lite.Optimize.DEFAULT]

def representative_dataset_gen():
for sample in XX_data:
sample = np.expand_dims(sample.astype(np.float32), axis=0)
yield [sample]

converter.optimizations = [tf.lite.Optimize.DEFAULT]

converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]

converter.inference_input_type = tf.int8

converter.inference_output_type = tf.int8

converter.representative_dataset = representative_dataset_gen2

model_tflite = converter.convert()

open(LSTMODEL_TFLITE, "wb").write(model_tflite)
`

Xtensa workflow does not appear to be catching errors from the PR

Here is a PR that resulted in the Xtensa build being broken:
#169

While the checks from that PR all passed, we can run the Xtensa workflow on main and see that it is failing:
https://github.com/tensorflow/tflite-micro/actions/runs/943815585

We are using the pull_request_target trigger:

tflite-micro/.github/workflows/xtensa.yml

Lines 10 to 14 in 36c507e

 on: 

 pull_request_target: 

 types: [labeled] 

 branches: 

 - main

And using the checkout action with the default ref param:

tflite-micro/.github/workflows/xtensa.yml

Line 34 in 36c507e

- uses: actions/checkout@v2

Port operator SPACE_TO_DEPTH from lite

This issue tracks the porting of operator SPACE_TO_DEPTH from lite.

This port was started in the tensorflow/tensorflow repository as issue #45824, and is migrating here in a half-completed state.

Remove support for asymmetric uint8 quantization in Tensorflow Lite Micro

@tensorflow/micro

As described in the TensorFLow Lite 8 bit quantization specification:
"Note: In the past our quantization tooling used per-tensor, asymmetric, uint8 quantization. New tooling, reference kernels, and optimized kernels for 8-bit quantization will use this spec."

More detailed analysis can be read in this whitepaper.

TfLite Micro will be removing support for the asymmetric uint8 quantization. This will enable TFLM to focus on the fully supported quantization tooling, avoid sending users down the uint8 path and also have the potential for reduced binary size.

Note that this change is only for TfLite Micro and not TfLite.

micro: port op L2_POOL_2D from lite

@tensorflow/micro

This issue tracks my work porting operator L2_POOL_2D from lite to micro.

The port will be submitted in a number of PRs. Here's a rough flight plan per @advaitjain and @petewarden:

PR 1 (step 1): Extract the code for parsing the op from a flatbuffer out of ParseOpDataTfLite in tensorflow/lite/core/api/flatbuffer_conversions.cc into a standalone function that can be called from micro's op resolver
PR 2 (step 2): Extract the reference implementation out of tensorflow/lite/kernels/internal/reference/reference_ops.h into its own header which can be included without dragging in reference_ops.h's dependences

The next 3 steps are combined into a single PR3 with separate commits:

(step 3): Copy operator from lite to micro making minimal changes and not including in the build
(step 4): Delete extra code from the micro copy of the operator
(step 5): Port micro copy of operator as necessary and add a corresponding test

CMSIS-NN: Add support for dilation

@tensorflow/micro

System information

Host OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
TensorFlow installed from (source or binary):source
Tensorflow version (commit SHA if source):712fd5cfe4a5de420a4226c874664423eba4cb1a
Target platform (e.g. Arm Mbed OS, Arduino Nano 33 etc.): All platforms

Describe the problem
For non-unity dilation values, the reference kernel is used in both convolution and depthwise convolution in the case of CMSIS-NN usage. this is to be removed when support for dilation is added to CMSIS kernels.

Please provide the exact sequence of commands/steps when you ran into the problem
Visual code inspection of cmsis-nn/conv.cc and cmsis-nn/depthwise_conv.cc to see that there is a protection for unity dilation case.

Trouble using Make to build examples.

I'm seeing build errors with the different TFLM build attempts I've been making. I have not tried a Bazel build, if there are build instructions on how to do a Bazel build for this repo on Windows I would try that, currently I am trying to build with a Make project. I am successfully building Tensorflow Lite (from the main Tensorflow repo) with CMake, but with that build I don't have clarity on what is just needed for TFLM.

Build attempt #1:

Following the first Micro Speech example "Deploy to ARC EM SDP" found here:
https://github.com/tensorflow/tflite-micro/tree/main/tensorflow/lite/micro/examples/micro_speech

Command line commands from an empty folder:
git clone https://github.com/tensorflow/tflite-micro tflite-micro
cd tflite-micro
make -f tensorflow/lite/micro/tools/make/Makefile TARGET=arc_emsdp ARC_TAGS=reduce_codesize OPTIMIZED_KERNEL_DIR=arc_mli generate_micro_speech_mock_make_project

Result: Several packages are downloaded, but the end result is "make: *** No rule to make target 'generate_micro_speech_mock_make_project'. Stop."

Build attempt #2:

Following the Micro Speech example "Deploy to ESP32" found here:
https://github.com/tensorflow/tflite-micro/tree/main/tensorflow/lite/micro/examples/micro_speech

ESP IDF should be properly installed.

Result:
There are a couple "FIND: Parameter format not correct" messages and "File not found - *.cc", "File not found - *.h". It appears that pigweed is downloaded, along with some other packages. Ultimately, the build fails with "make: *** No rule to make target 'generate_micro_speech_esp_project'. Stop.".

Build attempt #3:

Following the Micro Speech example "Deploy to Sparkfun Edge" found here:
https://github.com/tensorflow/tflite-micro/tree/main/tensorflow/lite/micro/examples/micro_speech

Command line commands from an empty folder:
git clone https://github.com/tensorflow/tflite-micro tflite-micro
cd tflite-micro
make -f tensorflow/lite/micro/tools/make/Makefile TARGET=sparkfun_edge OPTIMIZED_KERNEL_DIR=cmsis_nn micro_speech_bin

Results: Essentially the same results as build attempt #2 above.

Build attempt #4:

Following the Micro Speech example "Deploy to STM32F746" found here:
https://github.com/tensorflow/tflite-micro/tree/main/tensorflow/lite/micro/examples/micro_speech

Command line commands from an empty folder:
git clone https://github.com/tensorflow/tflite-micro tflite-micro
cd tflite-micro
make -f tensorflow/lite/micro/tools/make/Makefile TARGET=disco_f746ng OPTIMIZED_KERNEL_DIR=cmsis_nn generate_micro_speech_mbed_project

Same results as above.

Build attempt #5:

Following the Hello World example "Deploy to SparkFun Edge" found here:
https://github.com/tensorflow/tflite-micro/tree/main/tensorflow/lite/micro/examples/hello_world

Same results as above.

Some tests are excluded for cortex_m_corstone_300_makefile.inc

@tensorflow/micro

System information

Host OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
TensorFlow installed from (source or binary):
Tensorflow version (commit SHA if source):
Target platform (e.g. Arm Mbed OS, Arduino Nano 33 etc.):

Describe the problem
I hit this issue when working with: tensorflow/tensorflow#46830
Some tests are excluded because they are not working.

Please provide the exact sequence of commands/steps when you ran into the problem
Try to run some of the excluded test for the cortex_m_corstone_300_makefile.inc target.

Group Convolution Support

Any plan on supporting group convolutions in this repo? I would be interested in working on this also. Just wanted to verify it wasn't already in progress before I attempted it.

Port operator ABS for int8 and int16 to TFLu

@advaitjain This tracks porting of int8 and int16 support in ABS kernel.

It will be delivered as one PR.

Build TFLite Micro for riscv32_mcu

This issue is different from #202

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 20.04
TensorFlow installed from (source or binary): N/A
TensorFlow version: Tensorflow 2.4.2
Python version: 3.8.5
Installed using virtualenv? pip? conda?: N/A
Bazel version (if compiling from source): N/A
GCC/Compiler version (if compiling from source): gcc 9.3.0, riscv64-unknown-elf-gcc (GCC) 10.2.0
CUDA/cuDNN version: N/A
GPU model and memory: N/A
Target platform (e.g. Arm Mbed OS, Arduino Nano 33 etc.): RISC-V

Describe the problem
Cannot build TFLite Micro for RISC-V target (TARGET=riscv32_mcu). The error message is shown at the bottom.
The command below doesn't work since the file riscv32_mcu_makefile.inc doesn't exist.
make -f tensorflow/lite/micro/tools/make/Makefile TARGET=riscv32_mcu hello_world

Does anyone successfully build the hello_world example for RISCV (riscv32_mcu)?
What are the exact commands and steps?

Provide the exact sequence of commands / steps that you executed before running into the problem

git clone https://github.com/tensorflow/tflite-micro.git
cd tflite-micro/
make -f tensorflow/lite/micro/tools/make/Makefile third_party_downloads
vi tensorflow/lite/micro/tools/make/targets/mcu_riscv_makefile.inc
change ifeq ($(TARGET), riscv32_mcu) to ifeq ($(TARGET), mcu_riscv)
Move two flags: -fno-threadsafe-statics -fno-use-cxa-atexit from PLATFORM_FLAGS to CXXFLAGS
make -f tensorflow/lite/micro/tools/make/Makefile PARSE_THIRD_PARTY=true TARGET=mcu_riscv TARGET_ARCH=riscv32_mcu generate_hello_world_make_project
make -f tensorflow/lite/micro/tools/make/Makefile TARGET=mcu_riscv TARGET_ARCH=riscv32_mcu hello_world

$ make -f tensorflow/lite/micro/tools/make/Makefile TARGET=mcu_riscv TARGET_ARCH=riscv32_mcu hello_world
tensorflow/lite/micro/tools/make/downloads/flatbuffers already exists, skipping the download.
tensorflow/lite/micro/tools/make/downloads/pigweed already exists, skipping the download.
tensorflow/lite/micro/tools/make/downloads/person_model_int8 already exists, skipping the download.
riscv64-unknown-elf-g++ -std=c++11 -fno-rtti -fno-exceptions -fno-threadsafe-statics -Werror -fno-unwind-tables -ffunction-sections -fdata-sections -fmessage-length=0 -DTF_LITE_STATIC_MEMORY -DTF_LITE_DISABLE_X86_NEON -Wsign-compare -Wdouble-promotion -Wshadow -Wunused-variable -Wmissing-field-initializers -Wunused-function -Wswitch -Wvla -Wall -Wextra -Wstrict-aliasing -Wno-unused-parameter -DMCU_RISCV -march=rv32imac -mabi=ilp32 -mcmodel=medany -mexplicit-relocs -fno-builtin-printf -fno-exceptions -DTF_LITE_MCU_DEBUG_LOG -DTF_LITE_USE_GLOBAL_CMATH_FUNCTIONS -fno-unwind-tables -ffunction-sections -fdata-sections -funsigned-char -Wvla -Wall -Wextra -Wsign-compare -Wdouble-promotion -Wshadow -Wunused-variable -Wmissing-field-initializers -Wno-unused-parameter -Wno-write-strings -Wunused-function -fno-delete-null-pointer-checks -fomit-frame-pointer -Os -fpermissive -fno-rtti -fno-threadsafe-statics -fno-use-cxa-atexit --std=gnu++11 -I. -Itensorflow/lite/micro/tools/make/downloads/gemmlowp -Itensorflow/lite/micro/tools/make/downloads/flatbuffers/include -Itensorflow/lite/micro/tools/make/downloads/ruy -Itensorflow/lite/micro/tools/make/downloads/sifive_fe310_lib/bsp/include -Itensorflow/lite/micro/tools/make/downloads/sifive_fe310_lib/bsp/drivers/ -Itensorflow/lite/micro/tools/make/downloads/sifive_fe310_lib/bsp/env -Itensorflow/lite/micro/tools/make/downloads/sifive_fe310_lib/bsp/env/freedom-e300-hifive1 -Itensorflow/lite/micro/tools/make/downloads/kissfft -o tensorflow/lite/micro/tools/make/gen/mcu_riscv_riscv32_mcu_default/bin/hello_world tensorflow/lite/micro/tools/make/gen/mcu_riscv_riscv32_mcu_default/obj/core/tensorflow/lite/micro/examples/hello_world/main.o tensorflow/lite/micro/tools/make/gen/mcu_riscv_riscv32_mcu_default/obj/core/tensorflow/lite/micro/examples/hello_world/main_functions.o tensorflow/lite/micro/tools/make/gen/mcu_riscv_riscv32_mcu_default/obj/core/tensorflow/lite/micro/examples/hello_world/model.o tensorflow/lite/micro/tools/make/gen/mcu_riscv_riscv32_mcu_default/obj/core/tensorflow/lite/micro/examples/hello_world/output_handler.o tensorflow/lite/micro/tools/make/gen/mcu_riscv_riscv32_mcu_default/obj/core/tensorflow/lite/micro/examples/hello_world/constants.o tensorflow/lite/micro/tools/make/gen/mcu_riscv_riscv32_mcu_default/lib/libtensorflow-microlite.a -Wl,--fatal-warnings -Wl,--gc-sections -Ttensorflow/lite/micro/tools/make/downloads/sifive_fe310_lib/bsp/env/freedom-e300-hifive1/flash.lds -nostartfiles -Ltensorflow/lite/micro/tools/make/downloads/sifive_fe310_lib/bsp/env --specs=nano.specs -lm
/opt/riscv/lib/gcc/riscv64-unknown-elf/10.2.0/../../../../riscv64-unknown-elf/bin/ld: warning: cannot find entry symbol _start; not setting start address
collect2: error: ld returned 1 exit status
make: *** [tensorflow/lite/micro/examples/hello_world/Makefile.inc:44: tensorflow/lite/micro/tools/make/gen/mcu_riscv_riscv32_mcu_default/bin/hello_world] Error 1

Clean up and share more code between reference and optimized svdf implementations

This came up during the review for tensorflow/tensorflow#47914 but is not unique to that particular PR.

The underlying issue appears to be that the reference implementation of svdf can likely be refactored into smaller function which can then be shared between the optimized and reference implementations.

SVDF is a very specific operator only really used for one model, so the refactor is not high priority at this time.

MicroProfiler difficult to subclass

I would like to make a subclass of tflite::MicroProfiler which formats data differently. In particular, I would like CSV output to make it easy to use performance data in a spreadsheet.

However, since the data members of MicroProfiler are all private, and there are no accessors, it is difficult to build this as a subclass.

Here are several alternatives:

Make the data members protected instead of private.
Provide accessor methods so that the data collected by the profiler can be used by subclasses (and possibly by other classes too)
Split Split MicroProfiler into an interface and a separate implementation.
Make a custom version of MicroProfiler.

What is the best option here? I'm happy to send a PR if that would be helpful.

The kernel sources are compiled into both MICROLITE_LIB_OBJS and MICROLITE_KERNELS_OBJS

System information

Host OS Platform and Distribution: Ubuntu 18.04
TensorFlow installed from: source
Tensorflow version (commit SHA if source): e172d86

Describe the problem
We've noticed that the kernel sources seems to be compiled into both the MICROLITE_LIB_OBJS and MICROLITE_KERNEL_OBJS. This leads to the kernels sources being compiled with both CORE_OPTIMIZATION_LEVEL (-Os) and KERNEL_OPTIMIZATION_LEVEL (-O2) depending on which object file they're found in.

This might lead to the final binary using kernel source being built with -Os, instead of -O2 which means that the execution of the kernels will be slower. armclang will also display a warning saying L6876W: Minor variants of archive member '<member>' include multiple base variants

Removing line 612 from tensorflow/lite/micro/tools/make/Makefile seems to solve the issue at first glance.

Please provide the exact sequence of commands/steps when you ran into the problem
$ make -f tensorflow/lite/micro/tools/make/Makefile TARGET=cortex_m_corstone_300 TARGET_ARCH=cortex-m55 BUILD_TYPE=release_with_logs TOOLCHAIN=armclang test_hello_world_test

Add Arm Compiler 6 support for Corstone-300

Today it's possible to build the Corstone-300 target using GCC. We'd like to add support for Arm Compiler 6 as well. In order to be able to build the Corstone-300 target using the Arm Compiler 6 toolchain, a few modification has to be made.

The bin recipe will utilize the fromelf command instead of objcopy for generating the binary
I'll add and armclang toolchain switch in the Corstone-300 makefile. That'll contain all the linker and compiler flag needed.

We'd like to be able to compile the Corstone-300 target by calling e.g.

make -j -f tensorflow/lite/micro/tools/make/Makefile TARGET=cortex_m_corstone_300 TARGET_ARCH=cortex-m55 TOOLCHAIN=armclang hello_world_bin

or execute a test on your host with:

make -j -f tensorflow/lite/micro/tools/make/Makefile TARGET=cortex_m_corstone_300 TARGET_ARCH=cortex-m55 TOOLCHAIN=armclang test_hello_world_test

Problems to register a custom op in TF Lite

System information

OS Platform and Distribution: Linux Ubuntu 20.10
TensorFlow version: 2.4.1
Python version : 3.8 (installed via pip)

Describe the problem

I'd like to create a model which uses a custom op and two builtin ops.

For the implementation of the custom op I added a .cpp file and a .h file in tensorflow/lite/micro/kernels.

The header file for the custom op is like this:

    #ifndef TENSORFLOW_LITE_KERNELS_CONV1D_H_
    #define TENSORFLOW_LITE_KERNELS_CONV1D_H_

    #include "tensorflow/lite/kernels/internal/types.h"
    #include "tensorflow/lite/kernels/kernel_util.h"

    namespace tflite {
    namespace ops {
    namespace custom {

    TfLiteRegistration* Register_CONV_1D();

     }  // namespace custom
     }  // namespace ops
     }  // namespace tflite

    #endif  // TENSORFLOW_LITE_KERNELS_CONV1D_H_

I'm not sure if I'm missing some other steps to register the custom op.

Update docker image used as part of github actions to be hosted on github packages.

#10 makes use of the docker image jpwithers/tflite-micro-tests (here).

Creating the current issue to track the work needed to have the docker image be hosted on GitHub packages.

Community Question: Documentation of Benchmark Model Architectures

Hi all, I hope it is okay to use your issues channel to ask a question about the benchmark models! I am looking at the benchmark models in tflite-micro/tensorflow/lite/micro/benchmarks, namely the person-detection model and the keyword spotting model. I'm curious to learn more about these model architectures, i.e. what are the hidden sizes, kernel sizes, etc. behind each of these architectures.

I'm able to deduce from training_a_model.md that the person detection model is a mobilenet_v1 architecture, presumably with grayscale 96x96 input images. Assuming that the architecture is just a standard mobilenet_v1 with the given input image size (96x96x1) and 2 output classes ('person' and 'not a person'), I should be able to fill in all of the details about the architecture. Could you confirm if this is indeed the correct architecture?

It is harder to fill in the architectural details for the keyword-spotting architecture. It appears from keyword_benchmark.cc that the architecture is
FC -> Quantize -> Softmax -> SVDF (probably using this SVDF layer)
However, in this code, the low-latency-svdf architecture seems to quite different from the above. The create_low_latency_svdf_model function provides enough detail that I can figure out the architectural details, if this is indeed the code that is being used to define the KWS benchmark model. I'd appreciate if someone could clarify which of the two architectures is actually being tested by the kws benchmark model.

Thank you!

	// Convert original image to simpler format:
	// convert -resize 96x96\! person.PNG person.bmp3
	// Skip the 54 byte bmp3 header and add the reset of the bytes to a C array:
	// xxd -s 54 -i /tmp/person.bmp3 > /tmp/person.cc

tensorflow / tflite-micro Goto Github PK

tflite-micro's Introduction

TensorFlow Lite for Microcontrollers

Build Status

Official Builds

Community Supported TFLM Examples

Community Supported Kernels and Unit Tests

Contributing

Getting Help

Additional Documentation

RFCs

tflite-micro's People

Contributors

Stargazers

Watchers

Forkers

tflite-micro's Issues

Describe the problem

Source code / logs

Recommend Projects

Recommend Topics

Recommend Org