dockerdl's Introduction

Hi! My name is Atif. I am a graduate student in Electrical Engineering at Bilkent University. Most of my current research is driven by trying to find robust training methods for analog neural networks, specifically for applications in wireless communications.

Skills

  • Product Management
  • Docker, Kubernetes, Terraform, Ansible
  • Python, MATLAB
  • PyTorch, TensorFlow
  • Currently learning: Go, JavaScript

๐Ÿค Please feel free to get in touch!

  • Twitter: Muhammad Atif Ali
  • LinkedIn: Muhammad Atif Ali
  • Instagram: Atif
  • Website: matifali.dev

dockerdl's People

Contributors

darigovresearch, dependabot[bot], matifali

dockerdl's Issues

FROM matifali/dockerdl-base:11.7.1 still picks image with CUDA 12.1.1

Describe the bug

The TF image still uses CUDA 12.1.1. It looks like this line did not have the intended effect:

FROM matifali/dockerdl-base:11.7.1

I also tried 11.8.0, following your earlier commit. Could this condition in docker-publish-base.yml be forcing 12.1.1?

matrix.CUDA_VER == '12.1.1'

To Reproduce
Pull the latest TF image and open a bash shell in it. The startup banner still reports CUDA 12.1.1:

==========
== CUDA ==
==========

CUDA Version 12.1.1

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Expected behavior
CUDA Version 11.7.1 is used
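
A quick way to check this automatically is to capture the container's startup output and compare the banner's version line with the tag that was requested. The following is only a sketch: the helper names are made up for illustration, and it assumes the banner keeps the "CUDA Version X.Y.Z" format shown above.

```python
import re

# Matches the version line of the startup banner, e.g. "CUDA Version 12.1.1".
_BANNER_RE = re.compile(r"^CUDA Version (\d+\.\d+\.\d+)\s*$", re.MULTILINE)


def banner_cuda_version(container_output):
    """Return the CUDA version printed in the banner, or None if there is no banner."""
    match = _BANNER_RE.search(container_output)
    return match.group(1) if match else None


def check_cuda_version(container_output, expected):
    """Describe whether the banner version matches the expected version."""
    found = banner_cuda_version(container_output)
    if found is None:
        return "no CUDA banner found"
    if found == expected:
        return f"ok: CUDA {found}"
    return f"mismatch: expected CUDA {expected}, image reports {found}"


# Banner text copied from the report above.
sample = """==========
== CUDA ==
==========

CUDA Version 12.1.1
"""
print(check_cuda_version(sample, "11.7.1"))
# prints: mismatch: expected CUDA 11.7.1, image reports 12.1.1
```

In practice the text passed to check_cuda_version would be the captured output of a docker run against the image under test.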

TF 2.13 in Docker images doesn't work with GPU

Describe the bug

Hello,

Thanks for putting these together; they make it easy to get started. One issue I found: TF doesn't use the GPU even when one is available. I think this is due to CUDA 12.1.1. When I switched to CUDA 11.7.1, TF did use the GPU.

ARG CUDA_VER=11.7.1
ARG UBUNTU_VER=22.04
ARG PYTHON_VER=3.10

Here is how I tested:

sudo docker run --rm --runtime=nvidia --gpus all matifali/dockerdl python3 -c 'import tensorflow as tf; devices = [d.device_type for d in tf.config.list_physical_devices()]; print("Available devices: %s", devices)'

Output:

CUDA Version 12.1.1

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

2023-07-13 15:38:30.515088: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-07-13 15:38:30.564382: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-07-13 15:38:30.565062: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-07-13 15:38:31.662344: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2023-07-13 15:38:35.089274: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-07-13 15:38:35.122639: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1960] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
Available devices: %s ['CPU']

Output of the same command when using an image built with CUDA 11.7.1:

sudo docker run --rm --runtime=nvidia --gpus all  my-tf-image python3 -c 'import tensorflow as tf; devices = [d.device_type for d in tf.config.list_physical_devices()]; print("Available devices: %s", devices)'

==========
== CUDA ==
==========

CUDA Version 11.7.1

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

2023-07-13 15:50:44.533017: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-07-13 15:50:45.743829: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2023-07-13 15:50:49.414871: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-07-13 15:50:49.461339: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-07-13 15:50:49.463117: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:995] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
Available devices: %s ['CPU', 'GPU']
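
The literal "%s" in both outputs comes from the test command itself, not from TensorFlow: print() does not interpolate format specifiers, it just prints its arguments separated by a space. A minimal sketch of the same message with explicit formatting (the function name is only illustrative):

```python
def format_devices(device_types):
    """Build the status line with an f-string so the list is substituted into the text."""
    return f"Available devices: {device_types}"


print(format_devices(["CPU", "GPU"]))
# prints: Available devices: ['CPU', 'GPU']
```

Applied to the one-liner, the last statement would become print(f"Available devices: {devices}"). The double quotes still fit inside the single-quoted argument to python3 -c.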

To Reproduce

Run the following command on a VM configured with GPU access:
sudo docker run --rm --runtime=nvidia --gpus all matifali/dockerdl python3 -c 'import tensorflow as tf; devices = [d.device_type for d in tf.config.list_physical_devices()]; print("Available devices: %s", devices)'

Expected behavior

GPU is in the list of supported devices
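
One plausible explanation for the "Cannot dlopen some GPU libraries" warning is that the TensorFlow wheel was built against a different CUDA major version than the one the image ships, so the shared libraries it looks for are not present. TensorFlow's published tested configurations list CUDA 11.8 for 2.13. The sketch below compares the two versions using tf.sysconfig.get_build_info() and tf.config.list_physical_devices(). The helper names and the stand-in module are hypothetical and exist only so the example runs without TensorFlow installed.

```python
from types import SimpleNamespace


def cuda_major(version):
    """Return the major component of a version string such as '11.8' or '12.1.1'."""
    return int(version.split(".")[0])


def report(tf_module, image_cuda):
    """Compare the CUDA version TensorFlow was built for with the image's CUDA version."""
    build = tf_module.sysconfig.get_build_info()
    built_for = build.get("cuda_version")  # key may be missing on CPU-only builds
    gpus = tf_module.config.list_physical_devices("GPU")
    if built_for is None:
        return "TensorFlow build reports no CUDA version"
    if cuda_major(built_for) != cuda_major(image_cuda):
        return (f"mismatch: TensorFlow built for CUDA {built_for}, "
                f"image ships CUDA {image_cuda}, GPUs visible: {len(gpus)}")
    return (f"ok: CUDA {built_for} build on CUDA {image_cuda} image, "
            f"GPUs visible: {len(gpus)}")


# Hypothetical stand-in for the tensorflow module (illustration only).
# Inside the container, pass the real module instead: import tensorflow as tf
fake_tf = SimpleNamespace(
    sysconfig=SimpleNamespace(get_build_info=lambda: {"cuda_version": "11.8"}),
    config=SimpleNamespace(list_physical_devices=lambda device_type=None: []),
)
print(report(fake_tf, "12.1.1"))
# prints: mismatch: TensorFlow built for CUDA 11.8, image ships CUDA 12.1.1, GPUs visible: 0
```

Inside a running container the second argument could be read from the CUDA_VERSION environment variable, assuming the base image exports it the way the upstream nvidia/cuda images do.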
