caffe* parallel

Overview

caffe* parallel is a faster framework for deep learning, forked from BVLC/caffe (https://github.com/BVLC/caffe; for more details, visit http://caffe.berkeleyvision.org). The main contribution of this project is data-parallel training via MPI.

The source for this version of caffe* parallel can be downloaded from: https://github.com/sailorsb/caffe-parallel

This version does not support MATLAB.

Author

Shen, Bo (Inspur); Wang, Yajuan (Inspur)

Changelog:

Ver 0.2 (20150109):
Added LMDB support (tested with mnist).
Fixed some bugs.

Ver 0.1 (20141231):

Created the project (forked from BVLC/caffe as of 20141223). Data-parallel on LevelDB (tested with cifar10 only). This is only a simple first version; we will improve it as soon as possible. Happy new year!

TODO List:

1. Support LMDB (done in Ver 0.2)

2. Performance optimization

3. Large-scale testing

4. Support Intel® Xeon Phi™ coprocessors

How to run it

1. Prerequisites

Caffe depends on several software packages.

CUDA library version 6.5, 6.0, 5.5, or 5.0 (we used 6.0), with the latest driver version for CUDA 6, or 319.* for CUDA 5 (and NOT 331.*)  
BLAS, provided via ATLAS, MKL, or OpenBLAS (we used MKL 14.0.2.144 / OpenBLAS r0.2.12)  
OpenCV (we used 2.4.9; needs cmake >= 2.8)  
Boost >= 1.55, although only 1.55 and 1.56 are tested (we used 1.55)  
glog (we used 0.3.3)  
gflags (we used 2.1.1)  
protobuf (we used 2.5.0)  
protobuf-c  
leveldb (we used 1.15.0)  
snappy (we used 1.1.2)  
hdf5 (we used 1.8.10)  
lmdb   
autoconf (>= 2.4)  
Compiler:  
    g++ (we used 4.4.7)  
MPI compiler and runtime:  
    Intel MPI (we used 14.0.2.144) / MPICH3 (we used 3.1, configured with CC=gcc, CXX=g++, --enable-threads=multiple)  
For the Python wrapper:  
    Python 2.7, numpy (>= 1.7), boost-provided boost.python  

cuDNN Caffe: for fastest operation Caffe is accelerated by drop-in integration of NVIDIA cuDNN. To speed up your Caffe models, install cuDNN then uncomment the USE_CUDNN := 1 flag in Makefile.config when installing Caffe. Acceleration is automatic.

CPU-only Caffe: for cold-brewed CPU-only Caffe uncomment the CPU_ONLY := 1 flag in Makefile.config to configure and build Caffe without CUDA. This is helpful for cloud or cluster deployment.

2. Compile

a. Copy Makefile.config.example and rename the copy to Makefile.config.
b. Edit Makefile.config:
i. To compile with NVIDIA cuDNN acceleration, uncomment the USE_CUDNN := 1 flag in Makefile.config.

ii. If there is no GPU in your machine, switch to CPU-only Caffe by uncommenting the CPU_ONLY := 1 flag in Makefile.config.

iii. Uncomment the CUSTOM_CXX flag and set it to your MPI C++ compiler wrapper, for example CUSTOM_CXX := mpigxx. Use mpigxx for Intel MPI and mpicxx for MPICH3; for any other MPI implementation, set the matching wrapper in Makefile.config. (Intel MPI defaults to the Intel compiler, but CUDA builds should use the GNU C++ compiler.)
iv. Set BLAS: atlas for ATLAS, mkl for MKL, or open for OpenBLAS.
v. Set CUDA_DIR, BLAS_INCLUDE, BLAS_LIB, PYTHON_INCLUDE, PYTHON_LIB, INCLUDE_DIRS, and LIBRARY_DIRS if needed.
c. Modify Makefile:
i. Add the -DMPICH_IGNORE_CXX_SEEK flag to COMMON_FLAGS in the "# Debugging" section:
COMMON_FLAGS += -DNDEBUG -O2 -DMPICH_IGNORE_CXX_SEEK
ii. Add the -mt_mpi flag to CXXFLAGS in the "# Complete build flags." section (for Intel MPI).
iii. Add the -mt_mpi flag to LINKFLAGS in the "# Complete build flags." section (for Intel MPI).
d. Run make.
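
The Makefile.config edits in step b can also be scripted. The following is a minimal sketch, assuming the file contains commented-out lines of the form `# CPU_ONLY := 1` and `# CUSTOM_CXX := g++` plus a `BLAS := atlas` line; the sample text and the helper names `set_flag` and `configure` are illustrative and are not part of this project.

```python
import re

# Assumed excerpt of Makefile.config.example (the exact layout is an assumption).
SAMPLE_CONFIG = """\
# USE_CUDNN := 1
# CPU_ONLY := 1
# CUSTOM_CXX := g++
BLAS := atlas
"""


def set_flag(text, name, value):
    """Uncomment (if needed) the first line defining `name` and set it to `value`."""
    pattern = re.compile(
        r"^#?[ \t]*" + re.escape(name) + r"[ \t]*:=.*$", re.MULTILINE
    )
    new_text, count = pattern.subn(lambda m: f"{name} := {value}", text, count=1)
    if count == 0:
        raise KeyError(f"{name} not found in config text")
    return new_text


def configure(text, mpi_cxx="mpigxx", blas="mkl", cpu_only=False, use_cudnn=False):
    """Apply the step-b edits: MPI compiler wrapper, BLAS choice, optional flags."""
    text = set_flag(text, "CUSTOM_CXX", mpi_cxx)
    text = set_flag(text, "BLAS", blas)
    if cpu_only:
        text = set_flag(text, "CPU_ONLY", "1")
    if use_cudnn:
        text = set_flag(text, "USE_CUDNN", "1")
    return text


if __name__ == "__main__":
    # Example: MPICH3 wrapper, OpenBLAS, CPU-only build.
    print(configure(SAMPLE_CONFIG, mpi_cxx="mpicxx", blas="open", cpu_only=True))
```

To use it on a real file, read Makefile.config into a string, pass it to `configure`, and write the result back; check the output by hand before running make.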

3. Run and Test

This program needs at least 2 processes to run.

cifar10

  1. Run data/cifar10/get_cifar10.sh to get cifar10 data.
  2. Run examples/cifar10/create_cifar10.sh to convert the raw data to LevelDB format.
  3. Run examples/cifar10/mpi_train_quick.sh to train the net. In that script, change
    "-n 16" to set the number of parallel processes (16 in this example;
    if you use GPUs, use m+1 processes, where m is the number of GPUs),
    and change "-host node11" to set the node name.
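
The process-count rule above (m+1 processes for m GPUs) can be sketched as a small helper that builds the launcher arguments. This is only an illustration: the function name `mpi_launch_args` is hypothetical, and only the `-n` and `-host` options mentioned above are taken from the script; the rest of the command line in mpi_train_quick.sh is not reproduced here.

```python
def mpi_launch_args(host, num_gpus=None, num_procs=None):
    """Return the "-n" / "-host" arguments for the MPI launcher.

    With GPUs, use m + 1 processes for m GPUs (the rule stated above).
    Without GPUs, pass num_procs directly; at least 2 processes are required.
    """
    if num_gpus is not None:
        if num_gpus < 1:
            raise ValueError("num_gpus must be at least 1")
        n = num_gpus + 1
    elif num_procs is not None:
        n = num_procs
    else:
        raise ValueError("give either num_gpus or num_procs")
    if n < 2:
        raise ValueError("the program needs at least 2 processes")
    return ["-n", str(n), "-host", host]


if __name__ == "__main__":
    # 4 GPUs on node11 -> 5 processes
    print(" ".join(mpi_launch_args("node11", num_gpus=4)))
```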

mnist

  1. Run data/mnist/get_mnist.sh to get mnist data.
  2. Run examples/mnist/create_mnist.sh to convert the raw data to LMDB format.
  3. Run examples/mnist/mpi_train_lenet.sh to train the net. In that script, change "-n 16" to set the number of processes, and change "-host node11" to set the node name.
    (if you use GPUs, use m+1 processes, where m is the number of GPUs)

Changes from BVLC/caffe

  1. Framework

a. MPI is used for data parallelism.
b. Each MPI process runs one solver.
c. The training code is mostly untouched.
d. A parameter server (PS) runs as a thread: every solver computes its parameter updates and uploads them to the PS, and the PS computes the new parameters and sends them back to the solvers.

  2. Classes / files

a. Solver/SGDSolver
b. data_layer/base_data_layer (parallel data reading or distribution)
c. net (some interface changes and parameter update optimization)
d. Other (header files, some interfaces, etc.)
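
The parameter-server scheme described under "framework" can be illustrated with a small single-process simulation. This is a sketch under stated assumptions, not the project's implementation: it uses plain Python instead of MPI, a toy one-parameter linear model, synchronous updates, and gradient averaging on the server. The source does not say whether updates are synchronous or how the server combines them. All names (`ParameterServer`, `worker_gradient`, `train`) are hypothetical.

```python
class ParameterServer:
    """Holds the shared parameter and applies averaged worker gradients."""

    def __init__(self, w=0.0, lr=0.1):
        self.w = w
        self.lr = lr

    def update(self, grads):
        # Assumption: the server averages the gradients it receives.
        self.w -= self.lr * sum(grads) / len(grads)
        return self.w  # new parameter handed back to every worker


def worker_gradient(w, shard):
    """Gradient of the mean squared error for y = w * x on one data shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def train(data, num_workers=4, steps=200, lr=0.1):
    # Data parallelism: each simulated worker gets its own slice of the data.
    shards = [data[i::num_workers] for i in range(num_workers)]
    ps = ParameterServer(lr=lr)
    w = ps.w
    for _ in range(steps):
        grads = [worker_gradient(w, s) for s in shards]  # one gradient per worker
        w = ps.update(grads)  # upload gradients, receive the new parameter
    return w


if __name__ == "__main__":
    data = [(x / 10.0, 3.0 * x / 10.0) for x in range(1, 21)]  # samples of y = 3x
    print(round(train(data), 3))  # should be close to 3.0
```

In a real MPI setting each list element in `grads` would come from a separate process, and the call to `update` would be replaced by a send to and a receive from the server.
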
Acknowledgements

The Caffe* parallel developers would like to thank
QiHoo (Zhang, Gang; Dr. Hu, Jinhui)
Nvidia (Dr. Simon See; Jessy Huan)
for algorithm support, and Inspur for guidance during Caffe* parallel development.
