mnist-cudnn's Introduction

cuda-for-deep-learning

Transparent cuDNN / cuBLAS usage for deep learning training on the MNIST dataset.

How to use

$ git clone https://github.com/haanjack/cudnn-mnist-training
$ cd cudnn-mnist-training
$ bash download-mnist-dataset.sh
$ make
$ ./train

Expected output

== MNIST training with CUDNN ==
[TRAIN]
loading ./dataset/train-images-idx3-ubyte
loaded 60000 items..
.. model Configuration ..
CUDA: conv1
CUDA: pool
CUDA: conv2
CUDA: pool
CUDA: dense1
CUDA: relu
CUDA: dense2
CUDA: softmax
.. initialized conv1 layer ..
.. initialized conv2 layer ..
.. initialized dense1 layer ..
.. initialized dense2 layer ..
step:  200, loss: 0.561, accuracy: 75.762%
step:  400, loss: 2.754, accuracy: 96.574%
step:  600, loss: 0.157, accuracy: 97.004%
step:  800, loss: 0.005, accuracy: 97.006%
step: 1000, loss: 0.178, accuracy: 97.016%
step: 1200, loss: 0.014, accuracy: 96.998%
step: 1400, loss: 0.854, accuracy: 96.998%
step: 1600, loss: 0.165, accuracy: 96.984%
step: 1800, loss: 0.051, accuracy: 97.006%
step: 2000, loss: 0.284, accuracy: 97.025%
step: 2200, loss: 0.002, accuracy: 96.996%
step: 2400, loss: 0.013, accuracy: 96.990%
[INFERENCE]
loading ./dataset/t10k-images-idx3-ubyte
loaded 10000 items..
loss: 3.165, accuracy: 85.500%
Done.

Features

  • Parameter saving and loading
  • Network modification
  • Learning rate modification
  • Dataset shuffling
  • Testing
  • Add more layers

All these features require re-compilation.

mnist-cudnn's People

Contributors

aethocesora, haanjack, linlll


mnist-cudnn's Issues

Compiler error

The obj directory is not in the repository. The Makefile defines

SRC = src
OBJ_DIR = obj

and then creates the directory with $(shell mkdir -p $(OBJ_DIR)), which runs on every make invocation and is slow. It would be better to ship an obj directory alongside src; a simple fix is to run mkdir obj once ^^
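A sketch of a cleaner fix, assuming the Makefile really does use an OBJ_DIR variable as quoted above: an order-only prerequisite makes the directory exactly once when a target needs it, instead of running `mkdir -p` through $(shell ...) on every invocation.

```make
# Hypothetical fragment, not the project's actual Makefile.
OBJ_DIR = obj

# Order-only prerequisite (after the |): obj/ must exist before the
# object file is built, but its timestamp never forces a rebuild.
$(OBJ_DIR)/%.o: src/%.cpp | $(OBJ_DIR)
	$(NVCC) $(NVCC_FLAGS) -c $< -o $@

$(OBJ_DIR):
	mkdir -p $(OBJ_DIR)
```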

How to run inference from saved parameters without the training procedure

Hello, I want to run inference from saved parameters, without training. When I comment out the training loop below, the inference result seems incorrect. I'm not sure how to run inference only.

while (step < num_steps_train)
{
    ....
}

The inference result is shown below, with load_pretrain set to true.

== MNIST training with CUDNN ==
[TRAIN]
loading ./dataset/train-images-idx3-ubyte
loaded 60000 items..
.. model Configuration ..
CUDA: conv1
CUDA: pool
CUDA: conv2
CUDA: pool
CUDA: dense1
CUDA: relu
CUDA: dense2
CUDA: softmax
[INFERENCE]
loading ./dataset/t10k-images-idx3-ubyte
loaded 10000 items..
loss:    0, accuracy: 8.5%
Done.

How to build a ResNet network to train with CUDA

Although deep learning frameworks (e.g., TensorFlow, PyTorch) are commonly used to train models, I still want to try building a ResNet model using the cuDNN/cuBLAS libraries for deep learning training.
Could you give me some suggestions on how to create a classic deep learning network (e.g., ResNet) with CUDA?
Thank you very much!
Actually, for ResNet-18, only the Pad/AddV2/Reshape/Mean ops need to be implemented.

Did not use all the data (images) during training or inference

Hi,
Thanks very much for your project; I learned a lot from it. But it seems that it does not use all the data (images) during training or inference. I modified the code a little to keep track of the indexes of all the pictures used:

  1. I defined a public member in class MNIST: public: std::vector<int> idx_store;

  2. I recorded the index of every picture used (in void MNIST::get_batch()):

    for (int i = 0; i < batch_size_; i++) {
        std::copy(data_pool_[data_idx + i].data(),
            &data_pool_[data_idx + i].data()[data_size],
            &data_->ptr()[data_size * i]);
        idx_store.push_back(data_idx + i); // added
    }
  3. After the training, I searched for an index greater than 500, but found none:

    while (step < num_steps_train) {
        /* training... */
    }
    
    auto idx = train_data_loader.idx_store;
    for (int i = 0; i < idx.size(); i++) {
        if (idx[i] > 500) {
            std::cout << idx[i] << std::endl; // debug here
        }
    }

Then I figured out what the problem was: the following code in void MNIST::get_batch()

int data_idx = (step_ * batch_size_) % num_steps_;

limits the range of data_idx to [0, num_steps_), but it should cover [0, 60000) for training (or [0, 10000) for test), so it only needs to be changed to

int data_idx = step_ % num_steps_ * batch_size_;

After this modification, here is the result running on my machine:

[INFERENCE]
loading ./dataset/t10k-images.idx3-ubyte
loaded 10000 items..
conv1: Available Algorithm Count [FWD]: 10
conv1: Available Algorithm Count [BWD-filter]: 9
conv1: Available Algorithm Count [BWD-data]: 8
conv2: Available Algorithm Count [FWD]: 10
conv2: Available Algorithm Count [BWD-filter]: 9
conv2: Available Algorithm Count [BWD-data]: 8
loss: 0.145, accuracy: 90.050%
Done.

What are these identifiers? The compiler says they're undefined.

Hi.

Thanks for your code first. I can run cudnn/cuda deep learning example code successfully with yours.
By the way, I have a question. When I build convolution using the Makefile, the following errors occur.
What can I do? What do these identifiers mean?

/usr/local/cuda/bin/nvcc -ccbin g++ -m64 -g -std=c++11 -G --resource-usage -Xcompiler -rdynamic -Xcompiler -fopenmp -rdc=true -lnvToolsExt -I/usr/local/cuda/samples/common/inc -I/usr/local/cuda/include -L/usr/local/cuda/lib64 -lcublas -lcudnn -lgomp -lcurand -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -o convolution convolution.cu
convolution.cu(82): error: identifier "CUDNN_CONVOLUTION_FWD_PREFER_FASTEST" is undefined
convolution.cu(82): error: identifier "cudnnGetConvolutionForwardAlgorithm" is undefined
convolution.cu(87): error: identifier "CUDNN_CONVOLUTION_BWD_FILTER_PREFER_FASTEST" is undefined
convolution.cu(87): error: identifier "cudnnGetConvolutionBackwardFilterAlgorithm" is undefined
convolution.cu(92): error: identifier "CUDNN_CONVOLUTION_BWD_DATA_PREFER_FASTEST" is undefined
convolution.cu(92): error: identifier "cudnnGetConvolutionBackwardDataAlgorithm" is undefined
6 errors detected in the compilation of "convolution.cu".

Compile error on cuDNN v8

When compiling with cuDNN v8.7, the following compilation errors occur.

convolution.cu(82): error: identifier "CUDNN_CONVOLUTION_FWD_PREFER_FASTEST" is undefined
convolution.cu(82): error: identifier "cudnnGetConvolutionForwardAlgorithm" is undefined
convolution.cu(87): error: identifier "CUDNN_CONVOLUTION_BWD_FILTER_PREFER_FASTEST" is undefined
convolution.cu(87): error: identifier "cudnnGetConvolutionBackwardFilterAlgorithm" is undefined
convolution.cu(92): error: identifier "CUDNN_CONVOLUTION_BWD_DATA_PREFER_FASTEST" is undefined
convolution.cu(92): error: identifier "cudnnGetConvolutionBackwardDataAlgorithm" is undefined

6 errors detected in the compilation of "convolution.cu".
make: *** [Makefile:43: convolution] Error 1
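These identifiers existed in cuDNN 7 but were removed in cuDNN 8, which is why the build fails on newer toolkits. A hedged sketch of one replacement path (descriptor setup is assumed to already exist, and this is not the project's code): the `_v7` query returns candidate algorithms ranked by expected performance instead of taking a preference enum. The same pattern applies to the backward-filter and backward-data calls. This fragment requires a CUDA/cuDNN installation to compile.

```cpp
#include <cudnn.h>

// Pick a forward convolution algorithm under cuDNN 8, replacing the
// removed cudnnGetConvolutionForwardAlgorithm +
// CUDNN_CONVOLUTION_FWD_PREFER_FASTEST combination.
cudnnConvolutionFwdAlgo_t pick_fwd_algo(
        cudnnHandle_t handle,
        cudnnTensorDescriptor_t xDesc,
        cudnnFilterDescriptor_t wDesc,
        cudnnConvolutionDescriptor_t convDesc,
        cudnnTensorDescriptor_t yDesc) {
    int returned = 0;
    cudnnConvolutionFwdAlgoPerf_t perf[CUDNN_CONVOLUTION_FWD_ALGO_COUNT];
    cudnnGetConvolutionForwardAlgorithm_v7(
        handle, xDesc, wDesc, convDesc, yDesc,
        CUDNN_CONVOLUTION_FWD_ALGO_COUNT, &returned, perf);
    return perf[0].algo;  // results are sorted, fastest candidate first
}
```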

Facing errors related to the g++ version

The versions that I'm currently running are as follows:
CUDA - 10.0
OS - Linux-x86_64
cudnn-10.2-linux-x64-v7.6.5.32.tgz
gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)

I tried to download and run the code as described in the README file, but I got the following errors:

/usr/local/cuda/bin/nvcc -ccbin g++ -I/usr/local/cuda/samples/common/inc -I/usr/local/cuda/include -m64 -g -std=c++11 -G --resource-usage -Xcompiler -rdynamic -Xcompiler -fopenmp -rdc=true -lnvToolsExt -I/usr/local/cuda/samples/common/inc -I/usr/local/cuda/include -L/usr/local/cuda/lib -lcublas -lcudnn -lgomp -lcurand -gencode arch=compute_50,code=sm_50 -c train.cpp -o obj/train.o
nvcc warning : Resource usage is not shown as the final resource allocation is not done.
nvcc warning : The -c++11 flag is not supported with the configured host compiler. Flag will be ignored.
In file included from /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/array:35,
from src/mnist.h:6,
from train.cpp:1:
/usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/c++0x_warning.h:31:2: error: #error This file requires compiler and library support for the upcoming ISO C++ standard, C++0x. This support is currently experimental, and must be enabled with the -std=c++0x or -std=gnu++0x compiler options.
In file included from src/mnist.h:13,
from train.cpp:1:
src/blob.h:45: error: expected ‘)’ before ‘<’ token
src/blob.h:97: error: ‘std::array’ has not been declared
src/blob.h:97: error: expected ‘,’ or ‘...’ before ‘<’ token
src/blob.h:103: error: ISO C++ forbids declaration of ‘array’ with no type
src/blob.h:103: error: invalid use of ‘::’
src/blob.h:103: error: expected ‘;’ before ‘<’ token
train.cpp:149: error: expected ‘;’ at end of input
train.cpp:149: error: expected ‘}’ at end of input
In file included from src/mnist.h:13,
from train.cpp:1:
src/blob.h: In destructor ‘cudl::Blob::~Blob()’:
src/blob.h:60: error: ‘is_tensor_’ was not declared in this scope
src/blob.h:61: error: ‘tensor_desc_’ was not declared in this scope
src/blob.h: In member function ‘void cudl::Blob::reset(int, int, int, int)’:
src/blob.h:87: error: ‘cudl::cuda’ cannot be used as a function
src/blob.h:90: error: ‘is_tensor_’ was not declared in this scope
src/blob.h:92: error: ‘tensor_desc_’ was not declared in this scope
src/blob.h: In member function ‘void cudl::Blob::reset(int)’:
src/blob.h:99: error: ‘size’ was not declared in this scope
src/blob.h: At global scope:
src/blob.h:100: error: expected unqualified-id at end of input
src/blob.h:100: error: expected ‘}’ at end of input
make: *** [obj/train.o] Error 1
