v2's People

Contributors

bchen18, bmwshop, brentbiseda, darraghdog, favrecr, jsrinivasa, prabsa, rbraddes, rdejana

v2's Issues

HW5 - Move to TF2

I'd like to move HW5 to TensorFlow 2 instead of the version we use now.
Thoughts?

Install CUDA instructions in HW3 can be updated

New CUDA drivers are required for the nvidia-smi step at the bottom of HW3 to work, so the Install CUDA section can be updated to v10.1 for now:

wget https://developer.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1604-10-1-local-10.1.105-418.39_1.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604-10-1-local-10.1.105-418.39_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-1-local-10.1.105-418.39/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

HW4: Possible Typo

Is the following a typo and a possible fix for it?

Possible Typo
Name all the layers in parameters in the network

Possible Fix
Name all the layers and parameters in the network

w251/digits:tx2-4.2_b158 Error on Jetson Nano

(I could not find your container in GitHub, so I am posting here)
Thank you for this container image. I was able to start the DIGITS server on my NVIDIA Jetson Nano. Image load works fine, but attempting to create an Image Classification model leads to these errors. Looks like a CUDA issue. TIA.

2019-05-25 20:36:31 [20190525-203625-5f68] [INFO ] Create DB (train) task started.
2019-05-25 20:36:31 [20190525-203625-5f68] [INFO ] Task subprocess args: "/usr/bin/python2 /DIGITS/digits/tools/create_db.py /DIGITS/digits/jobs/20190525-203625-5f68/train.txt /DIGITS/digits/jobs/20190525-203625-5f68/train_db 768 1272 --backend=lmdb --channels=3 --resize_mode=crop --mean_file=/DIGITS/digits/jobs/20190525-203625-5f68/mean.binaryproto --mean_file=/DIGITS/digits/jobs/20190525-203625-5f68/mean.jpg --shuffle --encoding=jpg"
2019-05-25 20:36:31 [20190525-203625-5f68] [INFO ] Create DB (val) task started.
2019-05-25 20:36:31 [20190525-203625-5f68] [INFO ] Task subprocess args: "/usr/bin/python2 /DIGITS/digits/tools/create_db.py /DIGITS/digits/jobs/20190525-203625-5f68/val.txt /DIGITS/digits/jobs/20190525-203625-5f68/val_db 768 1272 --backend=lmdb --channels=3 --resize_mode=crop --shuffle --encoding=jpg"
2019-05-25 20:36:34 [20190525-203625-5f68] [WARNING] Create DB (train) unrecognized output: cudaRuntimeGetVersion() failed with error #38
2019-05-25 20:36:34 [20190525-203625-5f68] [WARNING] Create DB (train) unrecognized output: Tensorflow support disabled.
2019-05-25 20:36:34 [20190525-203625-5f68] [WARNING] Create DB (val) unrecognized output: cudaRuntimeGetVersion() failed with error #38
2019-05-25 20:36:34 [20190525-203625-5f68] [WARNING] Create DB (val) unrecognized output: Tensorflow support disabled.
2019-05-25 20:36:39 [20190525-203625-5f68] [DEBUG] 81 images written to database
2019-05-25 20:36:39 [20190525-203625-5f68] [INFO ] Create DB (val) task completed.
2019-05-25 20:36:46 [20190525-203625-5f68] [DEBUG] 246 images written to database
2019-05-25 20:37:06 [20190525-203625-5f68] [INFO ] Create DB (train) task completed.
2019-05-25 20:37:06 [20190525-203625-5f68] [INFO ] Job complete.
2019-05-25 20:38:57 [20190525-203823-f70d] [DEBUG] Network sanity check - train
2019-05-25 20:38:57 [20190525-203823-f70d] [DEBUG] Network sanity check - val
2019-05-25 20:38:57 [20190525-203823-f70d] [DEBUG] Network sanity check - deploy
2019-05-25 20:38:57 [20190525-203823-f70d] [INFO ] Train Caffe Model task started.
2019-05-25 20:38:57 [20190525-203823-f70d] [INFO ] Task subprocess args: "/caffe/build/tools/caffe train --solver=/DIGITS/digits/jobs/20190525-203823-f70d/solver.prototxt"
2019-05-25 20:38:58 [20190525-203823-f70d] [ERROR] Train Caffe Model: Cannot create Cublas handle. Cublas won't be available.
2019-05-25 20:38:58 [20190525-203823-f70d] [ERROR] Train Caffe Model: Cannot create Curand generator. Curand won't be available.
2019-05-25 20:38:58 [20190525-203823-f70d] [ERROR] Train Caffe Model: Cannot create cuDNN handle. cuDNN won't be available.
2019-05-25 20:38:58 [20190525-203823-f70d] [ERROR] Train Caffe Model: Check failed: error == cudaSuccess (38 vs. 0) no CUDA-capable device is detected
2019-05-25 20:38:59 [20190525-203823-f70d] [ERROR] Train Caffe Model task failed with error code -6
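
To pull only the warnings and errors out of a log like the one above, a small filter can help. This is a minimal sketch in Python that assumes the line layout shown above (timestamp, job id, level, message). The names `LOG_LINE` and `problems` are illustrative and are not part of DIGITS.

```python
import re

# Matches lines such as:
# 2019-05-25 20:38:58 [20190525-203823-f70d] [ERROR] Train Caffe Model: ...
# The level field may be padded with a space, as in "[INFO ]".
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<job>[^\]]+)\] "
    r"\[(?P<level>[A-Z]+)\s*\] "
    r"(?P<msg>.*)$"
)


def problems(log_text):
    """Return (level, message) pairs for WARNING and ERROR lines."""
    found = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") in ("WARNING", "ERROR"):
            found.append((match.group("level"), match.group("msg")))
    return found


# Three lines copied from the log above.
sample = """\
2019-05-25 20:36:31 [20190525-203625-5f68] [INFO ] Create DB (train) task started.
2019-05-25 20:36:34 [20190525-203625-5f68] [WARNING] Create DB (train) unrecognized output: cudaRuntimeGetVersion() failed with error #38
2019-05-25 20:38:58 [20190525-203823-f70d] [ERROR] Train Caffe Model: Check failed: error == cudaSuccess (38 vs. 0) no CUDA-capable device is detected
"""

for level, msg in problems(sample):
    print(level, msg)
```

In this log both the warnings and the final error carry code 38, which the Caffe message spells out as "no CUDA-capable device is detected".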

VMware instead of VirtualBox

In HW1, VirtualBox was really hard to work with. VMware was much better and much faster, and most of us eventually used it. It would be helpful if the instructions were updated to use VMware instead for future students.

Cannot run TF on CPU

The version of TF in the container is GPU-specific; it dumps core when run without --runtime=nvidia.

wk02 lab2 COS endpoints

Wondering if we should add a comment that, when running in IBM Cloud, it is best to use the COS private endpoints, as they avoid going over the public network.
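
The choice between the two kinds of endpoint could be sketched as below. This is an illustration only: the helper name `cos_endpoint` is hypothetical, and the hostname pattern is an assumption based on IBM's regional endpoint naming, so check it against the current IBM Cloud Object Storage endpoint list before using it in the lab.

```python
def cos_endpoint(region, private=False):
    """Build a regional COS endpoint URL.

    Assumed hostname pattern (verify against the IBM endpoint list):
      public:  s3.<region>.cloud-object-storage.appdomain.cloud
      private: s3.private.<region>.cloud-object-storage.appdomain.cloud
    """
    scope = "private." if private else ""
    return "https://s3.{}{}.cloud-object-storage.appdomain.cloud".format(scope, region)


# Inside IBM Cloud, prefer the private endpoint to stay off the public network.
print(cos_endpoint("us-south", private=True))
# From a laptop or another cloud, only the public endpoint is reachable.
print(cos_endpoint("us-south"))
```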

Remove or explain commented-out code in Dockerfile

The example Dockerfile in week02/labs/README.md contains commented-out lines:

# FROM nvidia/cuda
# FROM nvidia/cuda:8.0-cudnn6-devel
# FROM nvidia/cuda:8.0-cudnn5-devel
--snip--
#  CMD jupyter notebook --no-browser --ip=0.0.0.0 --NotebookApp.token= --allow-root

These lines are unexplained, appear to be unused, and can be confusing for students.

I suggest removing the commented-out lines or explaining their raison d'être.

How to format external drive

We need to talk about the correct formatting of the external drive used by Docker. For example, Windows-formatted drives may mount but can't be used correctly.
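
One way to spot such a drive before pointing Docker at it is to look up its filesystem type. The sketch below, in Python, parses text in the format of /proc/mounts on Linux. The helper names, the sample mount points, and the set of filesystem types treated as Windows-style are assumptions for illustration.

```python
# Filesystem types commonly seen for Windows-formatted drives on Linux.
# Assumption: "fuseblk" is included because NTFS mounted through ntfs-3g
# shows up under that name, although other FUSE filesystems use it too.
WINDOWS_FS = {"vfat", "exfat", "ntfs", "ntfs3", "fuseblk"}


def fs_type(mount_point, mounts_text):
    """Return the filesystem type mounted at mount_point, or None."""
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, ...
        if len(fields) >= 3 and fields[1] == mount_point:
            result = fields[2]  # a later mount on the same path wins
    return result


def looks_windows_formatted(mount_point, mounts_text):
    """True if the drive at mount_point uses a Windows-style filesystem."""
    return fs_type(mount_point, mounts_text) in WINDOWS_FS


# Hypothetical sample; on a real system read open("/proc/mounts").read().
sample_mounts = """\
/dev/sda1 /data ext4 rw,relatime 0 0
/dev/sdb1 /media/usb fuseblk rw,nosuid,nodev 0 0
"""

print(fs_type("/data", sample_mounts))
print(looks_windows_formatted("/media/usb", sample_mounts))
```

Mount points that contain spaces appear escaped (as \040) in /proc/mounts, so this simple split is only suitable for plain paths.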
