Comments (6)
Hi,
Could you tell us which CUDA version and GPU you used?
You could also try building the whole container yourself, assuming you are able to get the project configured on the host machine (CUDA/GPU is not required for this). However, the host machine needs to have docker and singularity installed, and you have to be able to use "sudo".
To do this, follow the same steps as when building from source directly on the host machine, up to the "cmake .." step. Then, with the project configured, you should be able to run "make tf-approximate-gpu-container" instead of the usual "make".
This should pull the "tensorflow/tensorflow:latest-gpu-py3" docker image, install the dependencies in it and build FakeApproxConv2D. Singularity is then used to pull a clean "tensorflow/tensorflow:latest-gpu-py3" image once more, and only the resulting binaries are added to it for the release (this is exactly the same process we use to build the container).
from tf-approximate.
Hi,
I built my docker container from the docker image tensorflow/tensorflow:2.1.0-gpu-py3. After that, I installed CMake and pillow, copied the folders python/ and test/ to /opt/tf-approximate-gpu/, and set the environment variables LD_LIBRARY_PATH and PYTHON_PATH as in the singularity def file. Then I built the library libApproxGPUOpsTF.so and copied it to the /opt/tf-approximate-gpu/ folder. The build process gave me a lot of warnings; you can find them in the attached file output.txt. Finally, I tried to execute the python scripts in the example folder. The training goes well (train_out.txt), while the evaluation gives me a low classification accuracy (eval_out.txt).
CUDA version = Cuda compilation tools, release 10.1, V10.1.243
GPU = GeForce GTX 1080 Ti computeCapability: 6.1
I tried to build the container using cmake, but I failed. Although I have installed TF (CPU version), cmake does not detect it on the system.
Thanks for the help
Ratko
cuda.txt
eval_out.txt
train_out.txt
output.txt
from tf-approximate.
I tried to replicate your workflow and it seems to be working fine for me (albeit only with a GTX 950M). I would suggest running "test_table_approx_conv_2d.py" from "test" with "--device cpu:0" and with "--device gpu:0" (this also requires "libApproxGPUOpsTF.so" to be in "LD_LIBRARY_PATH"). This is perhaps the simplest test of the convolutional layer, so we eliminate as many variables as we can.
Beyond that, I will probably have to get hold of a GTX 1080 Ti and try to isolate the issue.
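For anyone repeating the two test runs suggested above, here is a minimal Python sketch that assembles the command line and environment for each device. The helper name `build_test_run` is hypothetical, and the default library directory (/opt/tf-approximate-gpu, taken from the earlier comment) and the relative script path are assumptions about the local layout; adjust them to wherever your files actually live.

```python
import os


def build_test_run(device, lib_dir="/opt/tf-approximate-gpu",
                   script="test/test_table_approx_conv_2d.py"):
    """Return (argv, env) for one run of the convolution test on `device`.

    lib_dir and script are assumed locations, not fixed by the project.
    """
    env = dict(os.environ)
    # Prepend lib_dir so the dynamic loader can find libApproxGPUOpsTF.so.
    previous = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = lib_dir + (os.pathsep + previous if previous else "")
    argv = ["python3", script, "--device", device]
    return argv, env


# Print the two command lines; nothing is executed here.
for device in ("cpu:0", "gpu:0"):
    argv, _env = build_test_run(device)
    print(" ".join(argv))
```

To actually launch a run, pass the pair to `subprocess.run(argv, env=env, check=True)` from the repository root.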
from tf-approximate.
I rebuilt the container, repeated all the steps and ran the test_table_approx_conv_2d.py script. I got this output:
gpu:0: Linf Error: 0.9866220355033875
With the CPU option I get around:
cpu:0: Linf Error: 2.411454147477343e-07
Greetings
Ratko
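To put the two numbers above in context: an L-infinity (Linf) error is normally the largest absolute elementwise difference between a result and its reference. The sketch below assumes the test script uses that usual definition (this thread does not confirm it); the function name `linf_error` and the sample values are made up for illustration.

```python
def linf_error(reference, result):
    """Largest absolute elementwise difference between two equal-length sequences."""
    if len(reference) != len(result):
        raise ValueError("inputs must have the same length")
    return max((abs(r - x) for r, x in zip(reference, result)), default=0.0)


# Illustrative values only, not taken from the test script.
reference = [0.10, 0.50, 0.90]
rounding_noise = [0.10, 0.50, 0.90000024]   # differs only by float rounding
wrong_output = [0.10, 0.50, 0.0]            # one element is completely off

print(linf_error(reference, rounding_noise))
print(linf_error(reference, wrong_output))
```

Under that definition, an error around 1e-7 is consistent with single-precision rounding, while an error close to 1 means at least one output element is far from its reference.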
from tf-approximate.
I believe I found the cause of the issue. It seems that one gets such high error values when the CUDA kernels are not compiled for the CUDA compute capability of the given GPU. I hadn't thought of that before, as I would have expected a hard crash (we will have to look into this).
Either way, I think you should be able to fix the issue by compiling the kernels for CUDA compute capability 6.1 (GTX 1080 Ti). With our build setup you can do that by passing -DTFAPPROX_CUDA_ARCHS="61" (the "." is omitted on purpose) to cmake, or by modifying the default value of the variable directly in "src/cuda/CMakeLists.txt". When compiling for multiple GPUs, the values in TFAPPROX_CUDA_ARCHS should be separated by semicolons.
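As a small illustration of the format described above (dot removed, multiple values joined by semicolons), here is a Python sketch that turns compute capability strings into the cmake argument. The helper name `cuda_archs_flag` is hypothetical and not part of the project; only the variable name TFAPPROX_CUDA_ARCHS comes from this thread.

```python
def cuda_archs_flag(capabilities):
    """Build the -DTFAPPROX_CUDA_ARCHS argument from strings such as "6.1"."""
    archs = []
    for capability in capabilities:
        major, minor = capability.split(".")
        arch = f"{int(major)}{int(minor)}"   # "6.1" -> "61"
        if arch not in archs:                # skip duplicates, keep order
            archs.append(arch)
    return "-DTFAPPROX_CUDA_ARCHS=" + ";".join(archs)


print(cuda_archs_flag(["6.1"]))          # -DTFAPPROX_CUDA_ARCHS=61
print(cuda_archs_flag(["5.0", "6.1"]))   # -DTFAPPROX_CUDA_ARCHS=50;61
```

When typing the flag in a shell, keep the value quoted as in the comment above, since an unquoted semicolon would end the command.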
from tf-approximate.
It worked.
Thanks for the help.
from tf-approximate.