
tf-approximate's Introduction

TensorFlow Approximate Layers

Overview

This library extends TensorFlow with Approximate Convolutional (ApproxConv) layers, i.e. layers with reduced precision (typically 8 bits) implemented using approximate circuits (multipliers). A parameter of the layer selects which approximate multiplier is used (e.g., a multiplier from the EvoApproxLib). To maximize throughput, the layer expects a model of the approximate multiplier in the form of a truth table.
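Such a truth-table model can be produced by precomputing every operand pair. A minimal sketch, assuming an 8-bit multiplier and a row-major uint16 binary layout (the exact file format the library expects is described in its README; `example_mult` is a placeholder):

```python
import numpy as np

def example_mult(a, b):
    # Placeholder for an approximate multiplier model (here: the exact product).
    return a * b

# Precompute one 16-bit product per 8-bit operand pair (a, b).
table = np.empty((256, 256), dtype=np.uint16)
for a in range(256):
    for b in range(256):
        table[a, b] = example_mult(a, b)

table.tofile("mult8u_table.bin")  # 256 * 256 * 2 bytes = 128 KiB
```

Because the table is tiny (128 KiB), it can be kept resident in fast GPU memory, which is what makes the table-lookup emulation approach fast.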

Using approximate components in convolutional neural networks reduces the complexity of CNNs/DNNs when implemented in hardware. This problem is discussed, e.g., in 10.1145/2966986.2967021 or arXiv:2002.09481.

Accelerated version (TensorFlow 2, GPU, optionally CPU)

This library extends TensorFlow with an ApproxConv2DWithMinMaxVars layer that implements Conv2D with an approximate multiplier. The layer is intended to be used together with FakeQuantWithMinMaxVars layers (on its inputs) to experiment with approximate/quantized convolutions in FP32 CNNs. The code can be executed on a GPU or CPU, but a GPU is recommended to maximize throughput.
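To see what FakeQuantWithMinMaxVars contributes, it helps to model the quantization numerically. A minimal numpy sketch of fake quantization (TensorFlow's real implementation additionally nudges the zero point, which is omitted here):

```python
import numpy as np

def fake_quant(x, qmin, qmax, num_bits=8):
    # Map floats in [qmin, qmax] onto 2**num_bits uniform levels and back,
    # mimicking FakeQuantWithMinMaxVars (zero-point nudging omitted).
    levels = 2 ** num_bits - 1
    scale = (qmax - qmin) / levels
    q = np.round((np.clip(x, qmin, qmax) - qmin) / scale)
    return q * scale + qmin

x = np.array([-0.3, 0.0, 0.12345, 0.9])
print(fake_quant(x, -1.0, 1.0))
```

The integer index `q` is exactly the 8-bit operand that the approximate multiplier's truth table is indexed by, which is why the FakeQuant layers must sit on the convolution's inputs.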

Application overview

This is the most recent version of the approximate layers for TensorFlow. This implementation provides an approximately 200x speedup over the previous CPU-based version. We publish the source code as well as Docker and Singularity containers with a pre-built TensorFlow and the libraries for NVIDIA GPUs. The source code and application examples are located in the tf2 folder. For more details, please see the paper arXiv:2002.09481.

Performance of the accelerated version

Speed comparison. Note that the evaluation was performed on an Intel Xeon E5-2620 CPU equipped with an NVIDIA GTX 1080 GPU, TensorFlow 1.x, and NVIDIA CUDA Toolkit 10.1.

F. Vaverka, V. Mrazek, Z. Vasicek and L. Sekanina. "TFApprox: Towards a Fast Emulation of DNN Approximate Hardware Accelerators on GPU". 2020 Design, Automation and Test in Europe Conference (DATE), Grenoble, FR, 2020.

@INPROCEEDINGS{vaverka2020tfapprox,
    author={F. {Vaverka} and V. {Mrazek} and Z. {Vasicek} and L. {Sekanina} and M. A. {Hanif} and M. {Shafique}},
    booktitle={2020 Design, Automation and Test in Europe Conference (DATE)},
    title={TFApprox: Towards a Fast Emulation of DNN Approximate Hardware Accelerators on GPU},
    year={2020},
    volume={},
    number={},
    pages={4},
}

Basic implementation (TensorFlow 1.14, CPU only)

This repository provides two versions of the approximate layers. The first is based on a simple CPU implementation from the TensorFlow library and is located in the tf1 folder. In this version, an AxConv2D layer is implemented that extends the QuantizedConv2D layer with an approximate multiplier. The basic usage is shown in the README file.

For more details, see the paper 10.1109/ICCAD45719.2019.8942068 or arXiv:1907.07229. If you use this library in your work, please use the following reference:

V. Mrazek, Z. Vasicek, L. Sekanina, M. A. Hanif and M. Shafique, "ALWANN: Automatic Layer-Wise Approximation of Deep Neural Network Accelerators without Retraining," 2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), Westminster, CO, USA, 2019, pp. 1-8.

@INPROCEEDINGS{8942068,
    author={V. {Mrazek} and Z. {Vasicek} and L. {Sekanina} and M. A. {Hanif} and M. {Shafique}},
    booktitle={2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)},
    title={ALWANN: Automatic Layer-Wise Approximation of Deep Neural Network Accelerators without Retraining},
    year={2019},
    volume={},
    number={},
    pages={1-8},
    keywords={approximate computing;deep neural networks;computational path;ResNet;CIFAR-10},
    doi={10.1109/ICCAD45719.2019.8942068},
    ISSN={1933-7760},
    month={Nov},
}

tf-approximate's People

Contributors

mrazekv, zvasicek


tf-approximate's Issues

Where is libApproxGPUOpsTF.so when building from source?

Hi,
I ran the following commands:

$ mkdir build
$ cd build
$ cmake -DTFAPPROX_ALLOW_GPU_CONV=OFF ..
$ make

The build eventually succeeds, just with some warnings (during make):

......
[100%] Linking CXX shared library libApproxGPUOpsTF.dylib
[100%] Built target ApproxGPUOpsTF

But I cannot find libApproxGPUOpsTF.so anywhere. The build folder contains:
CMakeCache.txt, Makefile, containers, src, CMakeFiles, cmake_install.cmake, libApproxGPUOpsTF.dylib.

Singularity container download timing out

Hi,

I am trying to download the Singularity container for the GPU implementation. Unfortunately, after 30 minutes, the download stops with the following message:

FATAL: While pulling from image from http(s): context deadline exceeded (Client.Timeout or context cancellation while reading body)

I tried looking at my singularity.conf file but cannot find a timeout configuration for the client.

tf1 compiling error

Hi, my name is Michele.
Thanks for your open-source code. I was looking for a solution for a project, and the tf1 library seems to be a great fit. Unfortunately, I encountered some problems compiling the library:

  • I had to change std=c++11 to std=c++14 in these two lines in the Makefile:
axmult.o: axmult.cc
	g++ -std=c++14 $^ -c -o $@

axqconv.so: axqconv.cc axmult.cc axmult.h approximate_selector.h
	g++ -std=c++14 -Wno-ignored-attributes -Wno-deprecated-declarations  -shared $(filter-out %.h, $^) -o $@ -fPIC ${TF_CFLAGS} ${TF_LFLAGS} -O2

Could you help me? :)

TFapprox build with tensorflow 2.3

Hi! I want to build TFApprox with TensorFlow 2.3, but cmake gives me an error. I was wondering what the other requirements for building the environment with TensorFlow 2.3 are, i.e., which CUDA, cuDNN, CMake, GCC, etc. versions?

Looking forward to your response.

Gradient registration

In file python/keras/layers/fake_approx_convolutional.py:

When registering the gradient for the approximate convolution layer, you should replace:

@ops.RegisterGradient("FakeApproxConv2D")

with

@tf.RegisterGradient("ApproxConv2DWithMinMaxVars")

The RegisterGradient function takes the name of the TF operator, not the layer.

Greetings
Ratko
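The distinction raised in this issue (gradients are registered per raw op name, not per Python layer) can be illustrated with a small graph-mode sketch; the override name `DoubledIdentityGrad` and the doubling factor are invented for the demo:

```python
import tensorflow as tf

# tf.RegisterGradient is keyed by the raw *operator* name (as registered on
# the C++ side), not by the Python layer class. Demo: register a custom
# gradient under a new name and attach it to the Identity op via
# gradient_override_map (TF1-style graph mode).
@tf.RegisterGradient("DoubledIdentityGrad")
def _doubled_identity_grad(op, grad):
    return 2.0 * grad  # deliberately scaled so the override is observable

g = tf.Graph()
with g.as_default():
    x = tf.constant(3.0)
    with g.gradient_override_map({"Identity": "DoubledIdentityGrad"}):
        y = tf.identity(x)
    dy_dx = tf.gradients(y, x)[0]

with tf.compat.v1.Session(graph=g) as sess:
    result = sess.run(dy_dx)
print(result)
```

If the registration used the Keras layer's class name instead, gradient lookup would simply never find it, because the graph only contains op type strings.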

Changes in approximate convolution

Hi, I was wondering whether approx_nn_conv_ops_ref.cpp needs to be edited, and libApproxGPUOpsTF.so rebuilt, if we want to make any changes to the convolution. It seems that on line 300 of this file, prodValueQ stores the approximate multiplication result in the convolution.

I tried to make some changes to the convolution there and ran the cmake and make commands again, but strangely, no change affects the accuracy.

Training

Hello,

Thanks for the earlier reply to my issue. I tried my own addition unit and observed the results: it causes high error rates for the network, so I would like to train the network with the approximate unit in the loop.

Is that possible in this framework? Have you done such a thing?

Thanks in advance,

Dead container link

Hello,

while trying to install the tf2 version of tf-approximate I ran into a dead link:

singularity pull https://ehw.fit.vutbr.cz/tf-approximate/tf-approximate-gpu.sif
# or wget command can be used
wget https://ehw.fit.vutbr.cz/tf-approximate/tf-approximate-gpu.sif
# or just download the file using web browser

won't work, as the link to the .sif file is broken and does not work anymore.
Additionally, Singularity (which now seems to be called Apptainer) has its own broken installation story and is not available in the Debian/Ubuntu apt repositories.

I'm kindly asking for a re-upload.

Furthermore, installing from source is not fully explained; for example, the make command currently gives me errors about missing or changed parameters of the functions used (I assume the API has changed over time and is thus only compatible with certain TensorFlow versions). That could be circumvented if the container were available.

Hence, it is currently a bit difficult to get tf-approximate running. I would welcome any hints on what I can try to get it running.

Building from source

Greetings,

I tried to build FakeApproxConv2D from the source files in Singularity and Docker containers based on the tensorflow/tensorflow:2.1.0-gpu-py3 image. In both cases the build passes, but I get low accuracy when classifying the MNIST dataset using the script examples/examples/fake_approx_eval.py. On the other hand, if I use the prebuilt Singularity container, everything works well. Unfortunately, cmake is not installed in the published Singularity container.

GPU evaluation does not work?

Hi, my name is Moh.
Thank you for open-sourcing this. I followed the steps you gave to build the environment, but the result is incorrect when using the GPU for evaluation, while it is correct when using CPU evaluation.


Using one approximate multiplier

hello, I want to express my gratitude for sharing your open source code. I came across your project while searching for solutions related to approximate multipliers, and I'm eager to incorporate your code into my work.

Given my specific requirements, I prefer not to rely on libraries that include pre-built multipliers. Could you kindly provide guidance on how I can effectively integrate your code into my project?

[tf1] fatal error: 'tensorflow/core/framework/op_kernel.h' file not found

Hi, my name is John Zhou. I really appreciate you open-sourcing this. When I ran the Makefile in the tf1 folder, there was an error as follows:

(tensorflow) zhouhang@zhouhangdeMacBook-Pro axqconv % make                     
(echo "//automatically generated by makefile" ; cat axmult/mul8u_125K.c axmult/mul8u_12N4.c axmult/mul8u_13QR.c axmult/mul8u_1446.c axmult/mul8u_14VP.c axmult/mul8u_150Q.c axmult/mul8u_17C8.c axmult/mul8u_17KS.c axmult/mul8u_17QU.c axmult/mul8u_185Q.c axmult/mul8u_18DU.c axmult/mul8u_199Z.c axmult/mul8u_19DB.c axmult/mul8u_1AGV.c axmult/mul8u_1JFF.c axmult/mul8u_2AC.c axmult/mul8u_2HH.c axmult/mul8u_2P7.c axmult/mul8u_7C1.c axmult/mul8u_96D.c axmult/mul8u_CK5.c axmult/mul8u_DM1.c axmult/mul8u_EXZ.c axmult/mul8u_FTA.c axmult/mul8u_GS2.c axmult/mul8u_JQQ.c axmult/mul8u_JV3.c axmult/mul8u_KEM.c axmult/mul8u_L40.c axmult/mul8u_NGR.c axmult/mul8u_PKY.c axmult/mul8u_QJD.c axmult/mul8u_QKX.c axmult/mul8u_Y48.c axmult/mul8u_YX7.c axmult/mul8u_ZFB.c) > axmult.cc
python generate_header.py axmult.cc > axmult.h
g++ -std=c++11 -Wno-ignored-attributes -shared axqconv.cc axmult.cc -o axqconv.so -fPIC   -O2
axqconv.cc:25:10: fatal error: 'tensorflow/core/framework/op_kernel.h' file not found
#include "tensorflow/core/framework/op_kernel.h"
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
make: *** [axqconv.so] Error 1

I think that the TensorFlow version or the install environment may be the cause of the error. My environment is Anaconda + Python 3.7 + TensorFlow 2.1. And I checked the path /Users/zhouhang/Anaconda3/anaconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow_core/core/framework; op_kernel.h could not be found there.
Could you tell me how you build your environment, including the Python version, TensorFlow version, and so on? Thank you so much.

Problem in using the singularity container

Hi! I tried to use the Singularity container with the following command on Ubuntu 20.04 (installed inside a Windows machine):
singularity shell -H $(pwd) --nv --bind . ../../tf-approximate-gpu.sif
However, it generates the following warnings and error:

WARNING: failed to set O_CLOEXEC flags on image
WARNING: failed to set O_CLOEXEC flags on image
ERROR : Failed to set securebits: Invalid argument

It would be great if you could help in resolving this issue.

Changes for floating point multiplier

Dear authors, thank you for making your idea open source. I looked through the issues raised earlier and could not find similar points.
Going through the repository, I saw that the results are tested for 8-bit multipliers. However, for my work, I require 32-bit floating-point multipliers. As per my understanding, the two changes below will be needed. Could you please let me know what other changes might be required? Thanks in advance!

  1. Removing quantization-related functions in FakeApproxConv2D (present in tf2/python/keras/layers/fake_approx_convolutional.py) to allow floating point multipliers.
  2. Creating the binary file for the multiplier by changing the range from 256 (2^8) to the floating-point range (2^32), as shown below.
FILE * f = fopen("output.bin", "wb");

for (unsigned long long a = 0; a < (1ULL << 32); a++)      // 2^32; note that "(2^32)" in C is XOR, not exponentiation
    for (unsigned long long b = 0; b < (1ULL << 32); b++) {
      long val = approximate_mult(a, b); // replace by your own function call
      fwrite(&val, sizeof(uint16_t), 1, f);
    }

fclose(f);
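One constraint worth noting for step 2 above: a full truth table grows as 2^(2n) entries, so the tabular approach that works for 8-bit operands cannot scale to 32-bit ones. A quick size estimate:

```python
# Size of a full n-bit x n-bit multiplier truth table storing 2n-bit products.
def table_bytes(n_bits):
    entries = (1 << n_bits) ** 2             # one entry per (a, b) operand pair
    bytes_per_entry = (2 * n_bits + 7) // 8  # product width rounded up to bytes
    return entries * bytes_per_entry

print(table_bytes(8))   # 131072 bytes = 128 KiB (feasible)
print(table_bytes(16))  # 17179869184 bytes = 16 GiB (already impractical)
print(table_bytes(32))  # about 1.5e20 bytes (infeasible)
```

A 32-bit approximate multiplier would therefore need a functional (simulated) model rather than a lookup table.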

Could a signed 8×8 multiplier work?

Hi, my name is John.
Thank you for open-sourcing this. In your code, the quantization of weights and activations is UINT8 [0, 255], so unsigned 8×8 multipliers are used. If I would like to quantize the weights and activations to [-127, 127], using signed 8×8 multipliers, how should the code be adjusted to achieve this? Thanks again.

Gradient Implementation

Great work! Just some questions. For the gradient part, I did not see an actual implementation of the gradient operation, only a registration using the official implementation. If I missed it, please correct me. In this case, are you using the approximation only in the forward pass? Thanks.

Any chance to change addition?

Hello,
First of all, thanks for your work. This is an important framework for those who work on approximate units. However, is there any chance or guidance to change only the addition to an approximate addition? I checked your code; the multiplication is done in axqconv.cc, line 242. If I replace the accumulation

total += axm -
         static_cast<int32>(input_source_value) * filter_offset -
         static_cast<int32>(filter_source_value) * input_offset +
         input_offset * filter_offset;

with an approximate addition, is this a proper change?
Thanks in advance.

[tf2] Does kernel_size=(1, 1) work?

Hi, my name is hai-hai.
Thank you very much for open-sourcing this.
I tried to work on ResNet54 but found that the approximate result is greatly affected when the kernel is (1, 1), e.g. FakeApproxConv2D(kernel_size=(1, 1)). I want to know whether the convolution behavior is defined for kernel_size=(1, 1).
Looking forward to your answer, and best wishes.
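Independent of this library, a kernel_size=(1, 1) convolution is mathematically well defined: it reduces to the same channel-mixing matrix multiply applied at every pixel, as a short numpy check illustrates (shapes chosen arbitrarily for the demo):

```python
import numpy as np

# A 1x1 convolution is a per-pixel matrix multiply over channels:
# out[h, w, :] = in[h, w, :] @ W, with W of shape (C_in, C_out).
rng = np.random.default_rng(0)
x = rng.standard_normal((5, 5, 3))   # H x W x C_in feature map
w = rng.standard_normal((3, 4))      # C_in x C_out "1x1 kernel"

conv_1x1 = np.einsum("hwc,cd->hwd", x, w)
pixelwise = (x.reshape(-1, 3) @ w).reshape(5, 5, 4)

assert np.allclose(conv_1x1, pixelwise)
```

So any accuracy difference observed in the issue above would stem from the quantization/approximation path, not from the 1x1 convolution itself being ill-defined.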
