
BinaryNet's People

Contributors

bryant1410 avatar itayhubara avatar ken012git avatar


BinaryNet's Issues

Comparison with standard model

Any suggestions on how to compare training time with a non-binary MLP of the same complexity? For example, is there a set of options I can turn off to convert Main_BinaryNet_MNIST.lua into a standard MLP model?
Also, I'm getting a test accuracy of only ~95% after ~10 epochs on MNIST. What maximum accuracy can I expect with full training, and how many epochs will it need?
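
For readers comparing the two setups: the binarized layer performs the same-shaped matmul as a float layer, with a binarization step inserted before it, so dropping that step recovers a standard MLP of identical complexity. A minimal numpy sketch of that contrast (illustrative only, not the repo's Torch code):

```python
import numpy as np

def binarize(x):
    # Deterministic sign binarization to {-1, +1}; sign(0) is mapped to +1.
    return np.where(x >= 0, 1.0, -1.0)

def layer(x, W, binary=True):
    # Same shapes and FLOP count either way -- the only difference between
    # the binary and the "standard MLP" variant is the binarize step.
    return (binarize(x) @ binarize(W)) if binary else (x @ W)

rng = np.random.default_rng(0)
x = rng.standard_normal((100, 784))          # one MNIST-sized batch
W = rng.standard_normal((784, 2048)) * 0.01  # first hidden layer of the model

assert layer(x, W, binary=True).shape == layer(x, W, binary=False).shape
```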

Why can't we use GPU 0?

'-devid 1' runs normally.

rzai@rzai00:/prj/BinaryNet$ th Main_BinaryNet_Cifar10.lua -network BinaryNet_Cifar10_Model -devid 0 -batchSize 150 -epoch 500 2>&1 | tee yknote---log
sh: 1: mk1ir: not found
0
14033566
14033566
14033566
[program started on Wed Oct 26 16:26:15 2016]
[command line arguments]
stcWeights false
LR 0.015625
modelsFolder ./Models/
batchSize 150
optimization adam
preProcDir /home/rzai/prj/BinaryNet/PreProcData/Cifar10
network ./Models/BinaryNet_Cifar10_Model
stcNeurons true
constBatchSize false
LRDecay 0
whiten true
augment false
load
nGPU 1
dp_prepro false
format rgb
save /home/rzai/prj/BinaryNet/Results/WedOct2616:26:082016
dataset Cifar10
normalization simple
devid 0
visualize 1
type cuda
threads 8
SBN true
momentum 0
weightDecay 0
runningVal false
epoch 500
[----------------------]
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-6130/cutorch/init.c line=719 error=10 : invalid device ordinal
/home/rzai/torch/install/bin/luajit: Main_BinaryNet_Cifar10.lua:115: cuda runtime error (10) : invalid device ordinal at /tmp/luarocks_cutorch-scm-1-6130/cutorch/init.c:719
stack traceback:
[C]: in function 'setDevice'
Main_BinaryNet_Cifar10.lua:115: in main chunk
[C]: in function 'dofile'
...rzai/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670
rzai@rzai00:/prj/BinaryNet$

Read error

envy@ub1404:/os_pri/github/BinaryNet$ th Main_BinaryNet_Cifar10.lua -network BinaryNet_Cifar10_Model
sh: 1: mk1ir: not found
0
14033566
14033566
14033566
/home/envy/torch/install/bin/luajit: /home/envy/torch/install/share/lua/5.1/trepl/init.lua:384: /home/envy/torch/install/share/lua/5.1/torch/File.lua:351: read error: read 263456 blocks instead of 30720000 at /tmp/luarocks_torch-scm-1-2722/torch7/lib/TH/THDiskFile.c:320
stack traceback:
[C]: in function 'error'
/home/envy/torch/install/share/lua/5.1/trepl/init.lua:384: in function 'require'
Main_BinaryNet_Cifar10.lua:84: in main chunk
[C]: in function 'dofile'
...envy/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670
envy@ub1404:/os_pri/github/BinaryNet$

Trouble running your code

Hi Itay,

I ran your example code (after some dataset debugging), and I get:

/home/ehoffer/torch/install/bin/luajit: Main_BinaryNet_Cifar10.lua:216: Invalid index in scatter at /tmp/luarocks_torch-scm-1-1732/torch7/lib/TH/generic/THTensorMath.c:370
stack traceback:
    [C]: in function 'scatter'
    Main_BinaryNet_Cifar10.lua:216: in function 'Train'
    Main_BinaryNet_Cifar10.lua:270: in main chunk
    [C]: in function 'dofile'
    ...ffer/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670

Do you have any suggestions?

Lior
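
For what it's worth, "Invalid index in scatter" at the target-building step is often caused by class labels outside 1..nClasses, since Torch tensors are 1-indexed (for example, a dataset that labels MNIST digits 0-9 rather than 1-10). A hedged numpy sketch of the target construction, where the 0-based assumption about the dataset is exactly that, an assumption:

```python
import numpy as np

def hinge_targets(labels, n_classes=10):
    # Build the +/-1 one-hot targets a hinge criterion trains against.
    # Torch's scatter is 1-indexed, so there labels must lie in 1..n_classes;
    # a 0-based dataset (digits 0-9) needs a +1 shift before the scatter.
    labels = np.asarray(labels)
    assert labels.min() >= 0 and labels.max() < n_classes, "labels out of range"
    targets = -np.ones((labels.size, n_classes))
    targets[np.arange(labels.size), labels] = 1.0  # numpy itself is 0-indexed
    return targets

t = hinge_targets([3, 0, 9])
```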

About the Cifar10 case

Hi,

I am trying to run the CIFAR-10 case, but it gives me the following error:

/home/xx/Projects/torch/install/bin/luajit: ...e/xx/Projects/torch/install/share/lua/5.1/trepl/init.lua:384: ...ojects/torch/install/share/lua/5.1/dp/preprocess/zca.lua:44: invalid arguments: DoubleTensor FloatTensor
expected arguments: DoubleTensor [DoubleTensor] double | DoubleTensor [DoubleTensor] [double] DoubleTensor
stack traceback:
[C]: in function 'error'
...e/xx/Projects/torch/install/share/lua/5.1/trepl/init.lua:384: in function 'require'
Main_BinaryNet_Cifar10.lua:84: in main chunk
[C]: in function 'dofile'
...ects/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670

When I run the MNIST case, there is no such problem.

Does anyone know how to solve this problem? Thanks in advance!

BinaryNet in TensorFlow?

Hi Itay,

I read the BinaryNet paper; it seems very promising and suggests huge speed/power gains.
I am currently using the TensorFlow framework and I'm not familiar with Torch at all.
I was wondering whether this great work can be converted to the TF framework.
Is it possible to implement it in TF? Have you perhaps already done this?

Many thanks!
Yonathan
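
Not an official port, but the core of BinaryNet is framework-agnostic and small: sign binarization on the forward pass and a straight-through estimator on the backward pass (in TensorFlow these two pieces would live inside a custom-gradient op). A numpy sketch of just those pieces:

```python
import numpy as np

def binarize_forward(x):
    # Forward: deterministic sign binarization to {-1, +1}.
    return np.where(x >= 0, 1.0, -1.0)

def binarize_backward(x, grad_out):
    # Backward (straight-through estimator): pass the gradient through
    # unchanged where |x| <= 1 and zero it elsewhere, i.e. the gradient
    # of hard-tanh rather than of sign (whose true gradient is 0 a.e.).
    return grad_out * (np.abs(x) <= 1.0)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
g = np.ones_like(x)
# binarize_forward(x)      -> [-1., -1.,  1.,  1.,  1.]
# binarize_backward(x, g)  -> [ 0.,  1.,  1.,  1.,  0.]
```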

Limited number of epochs

Main_BinaryNet_MNIST.lua runs for only 3 epochs, regardless of the number of epochs I specify in line 32:
cmd:option('-epoch', -1, 'number of epochs to train, -1 for unbounded')

Every time, it reaches a test accuracy of 94-95% and then stops.

bad argument #4 to '?' (cannot convert 'int *' to 'int')

andy1028@andy1028-Envy:/media/andy1028/data1t/os_prj/github/BinaryNet$ th Main_BinaryNet_Cifar10.lua -network BinaryNet_Cifar10_Model
sh: 1: mk1ir: not found
Found Environment variable CUDNN_PATH = /home/andy1028/Downloads/cuda/lib64/libcudnn.so.5
0
14033566
14033566
14033566
[program started on Thu Oct 13 15:55:34 2016]
[command line arguments]
stcWeights false
LR 0.015625
modelsFolder ./Models/
batchSize 200
optimization adam
preProcDir /media/andy1028/data1t/os_prj/github/BinaryNet/PreProcData/Cifar10
network ./Models/BinaryNet_Cifar10_Model
stcNeurons true
constBatchSize false
LRDecay 0
whiten true
augment false
load
nGPU 1
dp_prepro false
format rgb
save /media/andy1028/data1t/os_prj/github/BinaryNet/Results/ThuOct1315:54:402016
dataset Cifar10
normalization simple
devid 1
visualize 1
type cuda
threads 8
SBN true
momentum 0
weightDecay 0
runningVal false
epoch -1
[----------------------]
==> Network
nn.Sequential {
input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> (25) -> (26) -> (27) -> (28) -> (29) -> (30) -> (31) -> (32) -> (33) -> (34) -> (35) -> (36) -> (37) -> (38) -> output: cudnnBinarySpatialConvolution(3 -> 128, 3x3, 1,1, 1,1)
(2): SpatialBatchNormalizationShiftPow2
(3): nn.HardTanh
(4): BinarizedNeurons
(5): cudnnBinarySpatialConvolution(128 -> 128, 3x3, 1,1, 1,1)
(6): cudnn.SpatialMaxPooling(2x2, 2,2)
(7): SpatialBatchNormalizationShiftPow2
(8): nn.HardTanh
(9): BinarizedNeurons
(10): cudnnBinarySpatialConvolution(128 -> 256, 3x3, 1,1, 1,1)
(11): SpatialBatchNormalizationShiftPow2
(12): nn.HardTanh
(13): BinarizedNeurons
(14): cudnnBinarySpatialConvolution(256 -> 256, 3x3, 1,1, 1,1)
(15): cudnn.SpatialMaxPooling(2x2, 2,2)
(16): SpatialBatchNormalizationShiftPow2
(17): nn.HardTanh
(18): BinarizedNeurons
(19): cudnnBinarySpatialConvolution(256 -> 512, 3x3, 1,1, 1,1)
(20): SpatialBatchNormalizationShiftPow2
(21): nn.HardTanh
(22): BinarizedNeurons
(23): cudnnBinarySpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
(24): cudnn.SpatialMaxPooling(2x2, 2,2)
(25): SpatialBatchNormalizationShiftPow2
(26): nn.HardTanh
(27): BinarizedNeurons
(28): nn.View(8192)
(29): BinaryLinear(8192 -> 1024)
(30): BatchNormalizationShiftPow2
(31): nn.HardTanh
(32): BinarizedNeurons
(33): BinaryLinear(1024 -> 1024)
(34): BatchNormalizationShiftPow2
(35): nn.HardTanh
(36): BinarizedNeurons
(37): BinaryLinear(1024 -> 10)
(38): nn.BatchNormalization
}
==>14033566 Parameters
==> Loss
SqrtHingeEmbeddingCriterion

==> Starting Training

Epoch 1
/home/andy1028/torch/install/bin/luajit: /home/andy1028/torch/install/share/lua/5.1/nn/Container.lua:67:
In 1 module of nn.Sequential:
/home/andy1028/torch/install/share/lua/5.1/cudnn/init.lua:115: bad argument #4 to '?' (cannot convert 'int *' to 'int')
stack traceback:
[builtin#173]: at 0x00455840
/home/andy1028/torch/install/share/lua/5.1/cudnn/init.lua:115: in function 'errcheck'
...ithub/BinaryNet/Models/cudnnBinarySpatialConvolution.lua:71: in function 'resetWeightDescriptors'
...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:364: in function 'updateOutput'
...ithub/BinaryNet/Models/cudnnBinarySpatialConvolution.lua:146: in function <...ithub/BinaryNet/Models/cudnnBinarySpatialConvolution.lua:142>
[C]: in function 'xpcall'
/home/andy1028/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
...e/andy1028/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
Main_BinaryNet_Cifar10.lua:214: in function 'Train'
Main_BinaryNet_Cifar10.lua:270: in main chunk
[C]: in function 'dofile'
...1028/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00405ea0

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
[C]: in function 'error'
/home/andy1028/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
...e/andy1028/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
Main_BinaryNet_Cifar10.lua:214: in function 'Train'
Main_BinaryNet_Cifar10.lua:270: in main chunk
[C]: in function 'dofile'
...1028/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00405ea0
andy1028@andy1028-Envy:/media/andy1028/data1t/os_prj/github/BinaryNet$
andy1028@andy1028-Envy:/media/andy1028/data1t/os_prj/github/BinaryNet$ git remote -v
origin https://github.com/itayhubara/BinaryNet.git (fetch)
origin https://github.com/itayhubara/BinaryNet.git (push)

out of memory at /tmp/luarocks_cutorch-scm-1-7471/cutorch

envy@ub1404:/os_pri/github/BinaryNet$ th Main_BinaryNet_MNIST.lua -network BinaryNet_MNIST_Model
sh: 1: mk1ir: not found
0
10033182
10033182
[program started on Wed May 25 14:58:10 2016]
[command line arguments]
stcWeights false
LR 0.015625
modelsFolder ./Models/
batchSize 100
optimization adam
preProcDir /home/envy/os_pri/github/BinaryNet/PreProcData/MNIST
network ./Models/BinaryNet_MNIST_Model
stcNeurons true
constBatchSize false
LRDecay 0
whiten false
augment false
load
nGPU 1
dp_prepro false
format rgb
save /home/envy/os_pri/github/BinaryNet/Results/WedMay2514:58:082016
dataset MNIST
normalization simple
devid 1
visualize 1
type cuda
threads 8
SBN true
momentum 0
weightDecay 0
runningVal true
epoch -1
[----------------------]
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-7471/cutorch/lib/THC/generic/THCStorage.cu line=41 error=2 : out of memory
/home/envy/torch/install/bin/luajit: /home/envy/torch/install/share/lua/5.1/nn/utils.lua:11: cuda runtime error (2) : out of memory at /tmp/luarocks_cutorch-scm-1-7471/cutorch/lib/THC/generic/THCStorage.cu:41
stack traceback:
[C]: in function 'resize'
/home/envy/torch/install/share/lua/5.1/nn/utils.lua:11: in function 'torch_Storage_type'
/home/envy/torch/install/share/lua/5.1/nn/utils.lua:57: in function 'recursiveType'
/home/envy/torch/install/share/lua/5.1/nn/Module.lua:126: in function 'type'
/home/envy/torch/install/share/lua/5.1/nn/utils.lua:45: in function 'recursiveType'
/home/envy/torch/install/share/lua/5.1/nn/utils.lua:41: in function 'recursiveType'
/home/envy/torch/install/share/lua/5.1/nn/Module.lua:126: in function 'cuda'
Main_BinaryNet_MNIST.lua:118: in main chunk
[C]: in function 'dofile'
...envy/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670
envy@ub1404:/os_pri/github/BinaryNet$

Several errors when I run "th Main_BinaryNet_MNIST.lua -network BinaryNet_MNIST_Model"

#1 sh: 1: mk1ir: not found
#2 Epoch 1
THCudaCheck FAIL file=/home/lch/torch/extra/cutorch/lib/THC/generic/THCTensorCopy.c line=81 error=77 : an illegal memory access was encountered
/home/lch/torch/install/bin/luajit: cuda runtime error (77) : an illegal memory access was encountered at /home/lch/torch/extra/cutorch/lib/THC/generic/THCStorage.c:182

I can't figure out how to solve these errors. Thanks for your help!

Unable to install dependencies

Hi, I am very new to Torch and I am trying to get this network up and running. However, when I try to install the dependencies, I get an error.

$ luarocks install https://raw.githubusercontent.com/eladhoffer/DataProvider.torch/master/dataprovider-scm-1.rockspec


CMake Error at CMakeLists.txt:8 (FIND_PACKAGE)
By not providing "FindTorch.cmake" in CMAKE_MODULE_PATH this project has asked CMake to find a package configuration file provided by "Torch", but CMake could not find one.

Could not find a package configuration file provided by "Torch" with any of the following names:

TorchConfig.cmake
torch-config.cmake

Add the installation prefix of "Torch" to CMAKE_PREFIX_PATH or set "Torch_DIR" to a directory containing one of the above files.

The other dependencies, such as dp and unsup, just say 'Error: No results matching query were found.'

Should I first clone these dependencies somewhere in my Torch directory before building Torch?

An error in BinaryNet_SVHN_Model.lua

I attempted to execute Main_BinaryNet_SVHN.lua but got the error "bad argument #4 to '?' (cannot convert 'int *' to 'int')" shown below. Reinstalling nn, cutorch, cunn and cudnn did not help. Do you have any suggestions?
Thank you in advance for your cooperation.

BinaryNet user$ th Main_BinaryNet_SVHN.lua -network BinaryNet_SVHN_Model
sh: mk1ir: command not found
0
6406494
6397632
6406494
/Users/user/BinaryNet/PreProcData/SVHN/rgbGCN_LCN_valid.t7
applying GCN preprocessing
[================== 598388/598388 ============>]ETA: 0ms | Step: 0ms
applying LeCunLCN preprocessing
[================== 598388/598388 ============>]ETA: 0ms | Step: 0ms
applying GCN preprocessing
[================== 6000/6000 ================>]ETA: 0ms | Step: 0ms
applying LeCunLCN preprocessing
[================== 6000/6000 ================>]ETA: 0ms | Step: 0ms
applying GCN preprocessing
[================== 26032/26032 ==============>]ETA: 0ms | Step: 0ms
applying LeCunLCN preprocessing
[================== 26032/26032 ==============>]ETA: 0ms | Step: 0ms
[program started on Fri Jun 17 05:39:50 2016]
[command line arguments]
stcWeights false
LR 0.0078125
modelsFolder ./Models/
batchSize 200
optimization adam
preProcDir /Users/user/BinaryNet/PreProcData/SVHN
network ./Models/BinaryNet_SVHN_Model
stcNeurons true
constBatchSize false
LRDecay 0
whiten false
augment false
load
nGPU 1
dp_prepro true
format rgb
save /Users/user/BinaryNet/Results/FriJun1705:33:592016
dataset SVHN
normalization simple
devid 1
visualize 1
type cuda
threads 8
SBN true
momentum 0
weightDecay 0
runningVal true
epoch -1
[----------------------]
==> Network
nn.Sequential {
input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> (25) -> (26) -> (27) -> (28) -> (29) -> (30) -> (31) -> (32) -> (33) -> (34) -> (35) -> (36) -> (37) -> (38) -> output: cudnnBinarySpatialConvolution(3 -> 64, 3x3, 1,1, 1,1)
(2): SpatialBatchNormalizationShiftPow2
(3): nn.HardTanh
(4): BinarizedNeurons
(5): cudnnBinarySpatialConvolution(64 -> 64, 3x3, 1,1, 1,1)
(6): cudnn.SpatialMaxPooling(2x2, 2,2)
(7): SpatialBatchNormalizationShiftPow2
(8): nn.HardTanh
(9): BinarizedNeurons
(10): cudnnBinarySpatialConvolution(64 -> 128, 3x3, 1,1, 1,1)
(11): SpatialBatchNormalizationShiftPow2
(12): nn.HardTanh
(13): BinarizedNeurons
(14): cudnnBinarySpatialConvolution(128 -> 128, 3x3, 1,1, 1,1)
(15): cudnn.SpatialMaxPooling(2x2, 2,2)
(16): SpatialBatchNormalizationShiftPow2
(17): nn.HardTanh
(18): BinarizedNeurons
(19): cudnnBinarySpatialConvolution(128 -> 256, 3x3, 1,1, 1,1)
(20): SpatialBatchNormalizationShiftPow2
(21): nn.HardTanh
(22): BinarizedNeurons
(23): cudnnBinarySpatialConvolution(256 -> 256, 3x3, 1,1, 1,1)
(24): cudnn.SpatialMaxPooling(2x2, 2,2)
(25): SpatialBatchNormalizationShiftPow2
(26): nn.HardTanh
(27): BinarizedNeurons
(28): nn.View(4096)
(29): BinaryLinear(4096 -> 1024)
(30): BatchNormalizationShiftPow2
(31): nn.HardTanh
(32): BinarizedNeurons
(33): BinaryLinear(1024 -> 1024)
(34): BatchNormalizationShiftPow2
(35): nn.HardTanh
(36): BinarizedNeurons
(37): BinaryLinear(1024 -> 10)
(38): nn.BatchNormalization
}
==>6406494 Parameters
==> Loss
SqrtHingeEmbeddingCriterion

==> Starting Training

Epoch 1
/Users/user/torch/install/bin/luajit: /Users/user/torch/install/share/lua/5.1/nn/Container.lua:67:
In 1 module of nn.Sequential:
/Users/user/torch/install/share/lua/5.1/cudnn/init.lua:55: bad argument #4 to '?' (cannot convert 'int *' to 'int')
stack traceback:
[builtin#173]: at 0x0107d55450
/Users/user/torch/install/share/lua/5.1/cudnn/init.lua:55: in function 'errcheck'
.../user/BinaryNet/Models/cudnnBinarySpatialConvolution.lua:71: in function 'resetWeightDescriptors'
...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:364: in function 'updateOutput'
.../user/BinaryNet/Models/cudnnBinarySpatialConvolution.lua:146: in function <.../user/BinaryNet/Models/cudnnBinarySpatialConvolution.lua:142>
[C]: in function 'xpcall'
/Users/user/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
/Users/user/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
Main_BinaryNet_SVHN.lua:215: in function 'Train'
Main_BinaryNet_SVHN.lua:279: in main chunk
[C]: in function 'dofile'
...user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
[C]: at 0x0107cd61a0

WARNING: If you see a stack trace below, it doesn't point to the place where this error occured. Please use only the one above.
stack traceback:
[C]: in function 'error'
/Users/user/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
/Users/user/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
Main_BinaryNet_SVHN.lua:215: in function 'Train'
Main_BinaryNet_SVHN.lua:279: in main chunk
[C]: in function 'dofile'
...user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
[C]: at 0x0107cd61a0

An error in BinaryNet_Cifar10_Model.lua

I attempted to execute Main_BinaryNet_Cifar10.lua, but I got some errors and I do not know how to solve them. The MNIST case runs well, and the SVHN case has the same errors. Could you please give me some suggestions? Thank you in advance for your cooperation.

[liubaicheng@mesic-dlcad BinaryNet-master_22]$ th Main_BinaryNet_Cifar10.lua -network BinaryNet_Cifar10_Model
0
14033566
14033566
14033566
[program started on Mon Sep 10 15:26:22 2018]
[command line arguments]
devid 1
threads 8
network ./Models/BinaryNet_Cifar10_Model
weightDecay 0
optimization adam
nGPU 1
runningVal false
LR 0.015625
SBN true
augment false
epoch -1
stcWeights false
stcNeurons true
dataset Cifar10
LRDecay 0
batchSize 200
normalization simple
load
dp_prepro false
whiten true
type cuda
momentum 0
modelsFolder ./Models/
constBatchSize false
preProcDir /home/liubaicheng/try_2018_4_26/BinaryNet-master_22/PreProcData/Cifar10
save /home/liubaicheng/try_2018_4_26/BinaryNet-master_22/Results/MonSep1015:25:542018
format rgb
visualize 1
[----------------------]
==> Network
nn.Sequential {
[input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> (25) -> (26) -> (27) -> (28) -> (29) -> (30) -> (31) -> (32) -> (33) -> (34) -> (35) -> (36) -> (37) -> (38) -> output]
(1): cudnnBinarySpatialConvolution(3 -> 128, 3x3, 1,1, 1,1)
(2): SpatialBatchNormalizationShiftPow2
(3): nn.HardTanh
(4): BinarizedNeurons
(5): cudnnBinarySpatialConvolution(128 -> 128, 3x3, 1,1, 1,1)
(6): cudnn.SpatialMaxPooling(2x2, 2,2)
(7): SpatialBatchNormalizationShiftPow2
(8): nn.HardTanh
(9): BinarizedNeurons
(10): cudnnBinarySpatialConvolution(128 -> 256, 3x3, 1,1, 1,1)
(11): SpatialBatchNormalizationShiftPow2
(12): nn.HardTanh
(13): BinarizedNeurons
(14): cudnnBinarySpatialConvolution(256 -> 256, 3x3, 1,1, 1,1)
(15): cudnn.SpatialMaxPooling(2x2, 2,2)
(16): SpatialBatchNormalizationShiftPow2
(17): nn.HardTanh
(18): BinarizedNeurons
(19): cudnnBinarySpatialConvolution(256 -> 512, 3x3, 1,1, 1,1)
(20): SpatialBatchNormalizationShiftPow2
(21): nn.HardTanh
(22): BinarizedNeurons
(23): cudnnBinarySpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
(24): cudnn.SpatialMaxPooling(2x2, 2,2)
(25): SpatialBatchNormalizationShiftPow2
(26): nn.HardTanh
(27): BinarizedNeurons
(28): nn.View(8192)
(29): BinaryLinear(8192 -> 1024)
(30): BatchNormalizationShiftPow2
(31): nn.HardTanh
(32): BinarizedNeurons
(33): BinaryLinear(1024 -> 1024)
(34): BatchNormalizationShiftPow2
(35): nn.HardTanh
(36): BinarizedNeurons
(37): BinaryLinear(1024 -> 10)
(38): nn.BatchNormalization (2D) (10)
}
==>14033566 Parameters
==> Loss
SqrtHingeEmbeddingCriterion

==> Starting Training

Epoch 1
/home/liubaicheng/torch/install/bin/lua: ...liubaicheng/torch/install/share/lua/5.2/nn/Container.lua:67:
In 1 module of nn.Sequential:
...e/liubaicheng/torch/install/share/lua/5.2/cudnn/init.lua:155: too few arguments
stack traceback:
[C]: in function '?'
...e/liubaicheng/torch/install/share/lua/5.2/cudnn/init.lua:155: in function 'call'
...e/liubaicheng/torch/install/share/lua/5.2/cudnn/init.lua:159: in function 'errcheck'
...ryNet-master_22/Models/cudnnBinarySpatialConvolution.lua:71: in function 'resetWeightDescriptors'
...torch/install/share/lua/5.2/cudnn/SpatialConvolution.lua:96: in function 'checkInputChanged'
...torch/install/share/lua/5.2/cudnn/SpatialConvolution.lua:120: in function 'createIODescriptors'
...ryNet-master_22/Models/cudnnBinarySpatialConvolution.lua:122: in function 'createIODescriptors'
...torch/install/share/lua/5.2/cudnn/SpatialConvolution.lua:188: in function 'updateOutput'
...ryNet-master_22/Models/cudnnBinarySpatialConvolution.lua:146: in function <...ryNet-master_22/Models/cudnnBinarySpatialConvolution.lua:142>
[C]: in function 'xpcall'
...liubaicheng/torch/install/share/lua/5.2/nn/Container.lua:63: in function 'rethrowErrors'
...iubaicheng/torch/install/share/lua/5.2/nn/Sequential.lua:44: in function <...iubaicheng/torch/install/share/lua/5.2/nn/Sequential.lua:41>
(...tail calls...)
Main_BinaryNet_Cifar10.lua:214: in function <Main_BinaryNet_Cifar10.lua:186>
(...tail calls...)
Main_BinaryNet_Cifar10.lua:270: in main chunk
[C]: in function 'dofile'
...heng/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: in ?

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
[C]: in function 'error'
...liubaicheng/torch/install/share/lua/5.2/nn/Container.lua:67: in function 'rethrowErrors'
...iubaicheng/torch/install/share/lua/5.2/nn/Sequential.lua:44: in function <...iubaicheng/torch/install/share/lua/5.2/nn/Sequential.lua:41>
(...tail calls...)
Main_BinaryNet_Cifar10.lua:214: in function <Main_BinaryNet_Cifar10.lua:186>
(...tail calls...)
Main_BinaryNet_Cifar10.lua:270: in main chunk
[C]: in function 'dofile'
...heng/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: in ?

About Multi-GPU Usage

Hi,

I am trying to run the binarized MNIST case with only one GPU, since using all three GPUs on my server can cause a "GPU lost" problem after training for a while (I haven't figured out why yet).

However, even though the default setting already selects a single GPU, nvidia-smi still shows that all three GPUs are in use.

I also tried setting -nGPU to 2, but I get the same result.

Could anyone help?

Stops before completing the set number of epochs

Hello.

I read the BinaryNet paper and tried to run the code.
However, my simulations do not complete enough epochs: sometimes they end at epoch 16, 7 or 21, so the accuracy obtained is not high enough.
I changed "opt.epoch" from -1 to 100, 200 or 50, but the epoch at which the simulation ends is always different.
Have you ever seen this situation?
Could you give me some advice on what to fix?

Thank you.
Jeong.

Main_BinaryNet_Cifar10.lua line 84, local data = require 'Data', does not return

This line created the data files in the PreProcData/Cifar10/ subdirectory

553MB rgbwhiten_train.t7
123MB rgbwhiten_test.t7
61MB rgbwhiten_valid.t7

Only these three files were created, and then the program hung on this line forever (hours).

So far I have traced it to the call to unsup.pcacov() in Data.lua. This is where it stops making progress.

I tried printing messages with io.stderr:write() before and after this line of code. The message before appeared on the screen, but the message after never showed up.

Any ideas on how to fix it? Thank you!
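
For cross-checking: what unsup.pcacov feeds into is an eigendecomposition of the d x d covariance matrix (d = 3072 for CIFAR-10), which should take seconds on a modern CPU, so a multi-hour hang more likely points at the Torch/LAPACK build than at the data. An equivalent numpy sketch of the whitening step (illustrative, not the repo's code; `eps` is a regularizer I've added):

```python
import numpy as np

def zca_whiten(X, eps=1e-8):
    # ZCA whitening: eigendecompose the covariance (the step pcacov performs),
    # rescale each principal component to unit variance, rotate back.
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(X) - 1)
    w, V = np.linalg.eigh(cov)                      # the potentially slow step
    W = V @ np.diag(1.0 / np.sqrt(w + eps)) @ V.T   # symmetric ZCA matrix
    return Xc @ W

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 8)) @ rng.standard_normal((8, 8))
Xw = zca_whiten(X)
# After whitening, the sample covariance is (approximately) the identity.
assert np.allclose(Xw.T @ Xw / (len(Xw) - 1), np.eye(8), atol=1e-3)
```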

Out of memory

Hello,
I saw in the issues section that someone else had the same problem, and I tried the referenced responses, but I still get the same error. My GPU is a GeForce 425M; I know it's an old GPU, but is there any way I can make this work? Can I use my CPU? Any suggestion is appreciated.

azadeh@azadeh:~/Downloads/BinaryNet$ th Main_BinaryNet_MNIST.lua -network BinaryNet_MNIST_Model
sh: 1: mk1ir: not found
0
10033182
10033182
[program started on Sat May 13 16:05:46 2017]
[command line arguments]
stcWeights false
LR 0.015625
modelsFolder ./Models/
batchSize 100
optimization adam
preProcDir /home/azadeh/Downloads/BinaryNet/PreProcData/MNIST
network ./Models/BinaryNet_MNIST_Model
stcNeurons true
constBatchSize false
LRDecay 0
whiten false
augment false
load
nGPU 1
dp_prepro false
format rgb
save /home/azadeh/Downloads/BinaryNet/Results/SatMay1316:05:392017
dataset MNIST
normalization simple
devid 1
visualize 1
type cuda
threads 8
SBN true
momentum 0
weightDecay 0
runningVal true
epoch -1
[----------------------]
==> Network
nn.Sequential {
[input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> output]
(1): nn.View(-1, 784)
(2): BinaryLinear(784 -> 2048)
(3): BatchNormalizationShiftPow2
(4): nn.HardTanh
(5): BinarizedNeurons
(6): BinaryLinear(2048 -> 2048)
(7): BatchNormalizationShiftPow2
(8): nn.HardTanh
(9): BinarizedNeurons
(10): BinaryLinear(2048 -> 2048)
(11): BatchNormalizationShiftPow2
(12): nn.HardTanh
(13): BinarizedNeurons
(14): BinaryLinear(2048 -> 10)
(15): nn.BatchNormalization (2D) (10)
}
==>10033182 Parameters
==> Loss
SqrtHingeEmbeddingCriterion

==> Starting Training

Epoch 1
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-798/cutorch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
/home/azadeh/torch/install/bin/luajit: cuda runtime error (2) : out of memory at /tmp/luarocks_cutorch-scm-1-798/cutorch/lib/THC/generic/THCStorage.cu:66
stack traceback:
[C]: at 0x7f2c9df84f90
[C]: in function '__index'
./adaMax_binary_clip_shift.lua:71: in function 'adaMax_binary_clip_shift'
Main_BinaryNet_MNIST.lua:233: in function 'Train'
Main_BinaryNet_MNIST.lua:286: in main chunk
[C]: in function 'dofile'
...adeh/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
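
As a rough sanity check on whether a 1 GB card can fit this run: the printed model is dominated by its weights, not its activations, and training keeps several float32 copies of the weights (the real-valued weights, the binarized copy, gradients, and the adaMax moment buffers), plus cutorch and display overhead. A back-of-envelope estimate, with the assumptions noted in comments:

```python
# Rough memory estimate for the MNIST MLP printed above, batch size 100.
# Assumes everything is float32 (4 bytes) during training, since the
# binarization happens on the fly; actual usage is higher because each nn
# module also keeps gradInput buffers and cutorch reserves workspace.
batch = 100
layer_widths = [784, 2048, 2048, 2048, 10]   # from the printed model
params = 10033182                             # printed parameter count

act_floats = batch * sum(layer_widths)
act_mb   = act_floats * 4 / 2**20
param_mb = params * 4 / 2**20                 # one float32 copy of the weights

print(f"activations ~{act_mb:.1f} MB, parameters ~{param_mb:.1f} MB per copy")
# -> activations ~2.6 MB, parameters ~38.3 MB per copy
```

With roughly five or six weight-sized buffers alive at once, that is a few hundred MB before any runtime overhead, which can plausibly exhaust a 1 GB GPU that is also driving the display.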

QNN update

Hi Itay, thanks a lot to you and your team for this research effort and for sharing the code. I've just read the new publication on QNN and was wondering when you plan to release a new version of the code.

Error when running "luarocks install cudnn"

Dear itayhubara,
when I run the command "luarocks install cudnn" I get an error:

Installing https://raw.githubusercontent.com/torch/rocks/master/cunn-scm-1.rockspec...
Using https://raw.githubusercontent.com/torch/rocks/master/cunn-scm-1.rockspec... switching to 'build' mode

Missing dependencies for cunn:
cutorch >= 1.0

Using https://raw.githubusercontent.com/torch/rocks/master/cutorch-scm-1.rockspec... switching to 'build' mode
Cloning into 'cutorch'...
remote: Enumerating objects: 229, done.
remote: Counting objects: 100% (229/229), done.
remote: Compressing objects: 100% (184/184), done.
remote: Total 229 (delta 62), reused 90 (delta 43), pack-reused 0
Receiving objects: 100% (229/229), 241.83 KiB | 549.00 KiB/s, done.
Resolving deltas: 100% (62/62), done.
Warning: unmatched variable LUALIB

jopts=$(getconf _NPROCESSORS_CONF)

echo "Building on $jopts cores"
cmake -E make_directory build && cd build && cmake .. -DLUALIB= -DLUA_INCDIR=/home/ivanwu0404/torch/install/include -DCMAKE_CXX_FLAGS=${CMAKE_CXX_FLAGS} -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="/home/ivanwu0404/torch/install/bin/.." -DCMAKE_INSTALL_PREFIX="/home/ivanwu0404/torch/install/lib/luarocks/rocks/cutorch/scm-1" && make -j$jopts install

Building on 16 cores
-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is GNU 7.5.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Torch7 in /home/ivanwu0404/torch/install
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda (found suitable version "10.2", minimum required is "6.5")
-- Removing -DNDEBUG from compile flags
-- TH_LIBRARIES: TH
-- MAGMA not found. Compiling without MAGMA support
-- Autodetected CUDA architecture(s): 7.5
-- got cuda version 10.2
-- Found CUDA with FP16 support, compiling with torch.CudaHalfTensor
-- CUDA_NVCC_FLAGS: -D__CUDA_NO_HALF_OPERATORS__;-gencode;arch=compute_75,code=sm_75;-DCUDA_HAS_FP16=1
-- THC_SO_VERSION: 0
-- Performing Test HAS_LUAL_SETFUNCS
-- Performing Test HAS_LUAL_SETFUNCS - Success
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_cublas_device_LIBRARY (ADVANCED)
linked by target "THC" in directory /tmp/luarocks_cutorch-scm-1-2427/cutorch/lib/THC

-- Configuring incomplete, errors occurred!
See also "/tmp/luarocks_cutorch-scm-1-2427/cutorch/build/CMakeFiles/CMakeOutput.log".
See also "/tmp/luarocks_cutorch-scm-1-2427/cutorch/build/CMakeFiles/CMakeError.log".

Error: Failed installing dependency: https://raw.githubusercontent.com/torch/rocks/master/cutorch-scm-1.rockspec - Build error: Failed building.

I know one solution to this problem is installing the CUDA toolkit, and I have already installed it, but I still get the same error.
Can you help me fix it?
My CUDA version is 10.2 and I am using Ubuntu 18.04.
