nervanasystems / neon

Intel® Nervana™ reference deep learning framework committed to best performance on all hardware

Home Page: http://neon.nervanasys.com/docs/latest

License: Apache License 2.0

Makefile 0.31% Python 54.36% Cuda 0.38% C 3.22% CSS 38.17% Perl 3.37% Shell 0.15% Batchfile 0.01% Dockerfile 0.04%

deep-learning python neon mkl performance fast neural-network

neon's Issues

Monitoring validation set accuracy during training

As I understand it, metrics are currently evaluated only at the end of a training session. Is there a way to monitor validation accuracy during training, so that I can cancel the session if results don't improve?

A related question: are there any existing utilities to plot loss and accuracy over a training session?
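
For illustration, here is a minimal, framework-independent sketch of what per-epoch monitoring with early cancellation could look like. It does not use neon's API: `train_epoch` and `evaluate` are hypothetical callables that you would have to supply (one runs a single epoch and returns the training loss, the other returns validation accuracy), and `patience` is an assumed parameter name.

```python
def train_with_early_stopping(train_epoch, evaluate, max_epochs, patience=3):
    """Train for up to max_epochs, stopping once validation accuracy has
    not improved for `patience` consecutive epochs.

    train_epoch and evaluate are hypothetical callables, not neon functions.
    Returns a list of (epoch, train_loss, val_accuracy) tuples.
    """
    history = []
    best = float("-inf")
    stale = 0  # epochs since the last improvement
    for epoch in range(max_epochs):
        train_loss = train_epoch()
        val_acc = evaluate()
        history.append((epoch, train_loss, val_acc))
        if val_acc > best:
            best, stale = val_acc, 0
        else:
            stale += 1
            if stale >= patience:
                break  # cancel the session: no recent improvement
    return history
```

The returned history holds one entry per epoch, so it can be handed to any plotting library to draw loss and accuracy curves.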

Unable to access "how to add a model" documentation

When I go to https://sites.google.com/a/nervanasys.com/wiki/algorithms/neon/how-to-write-a-mylearn-model, I get a 403 (Forbidden) error. I can access the other links under documentation.

Speed check fails with nervanagpu because rectleaky isn't implemented

Either rectleaky should be implemented here: https://github.com/NervanaSystems/neon/blob/master/neon/backends/gpu.py

or the rectleaky transform shouldn't be included here:
https://github.com/NervanaSystems/neon/blob/master/examples/convnet/synthetic-sanity_check.yaml

Make fails in nervanagpu: OS X 10.10.3 - sed: illegal option -- r

Makefile:

define list_includes
$(shell sed -rn 's/^<INCLUDE file="(.*)"/>/\1/p' $(call strip_codes,$(1)))
endef

David-Laxers-MacBook-Pro:nervanagpu davidlaxer$ sudo !!
sudo make all
The directory '/Users/davidlaxer/Library/Caches/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
installing maxas...
/tmp/nervanagpu.XXXXXXXX.waaER91X
Cloning into 'maxas'...
remote: Counting objects: 170, done.
remote: Total 170 (delta 0), reused 0 (delta 0), pack-reused 170
Receiving objects: 100% (170/170), 163.15 KiB | 0 bytes/s, done.
Resolving deltas: 100% (67/67), done.
Checking connectivity... done.
Checking if your kit is complete...
Looks good
Warning: prerequisite Carp 1.29 not found. We have 1.26.
Warning: prerequisite Data::Dumper 2.145 not found. We have 2.13506.
Writing Makefile for MaxAs::MaxAs
Writing MYMETA.yml and MYMETA.json
cp lib/MaxAs/MaxAs.pm blib/lib/MaxAs/MaxAs.pm
cp lib/MaxAs/Cubin.pm blib/lib/MaxAs/Cubin.pm
cp lib/MaxAs/MaxAsGrammar.pm blib/lib/MaxAs/MaxAsGrammar.pm
cp bin/maxas.pl blib/script/maxas.pl
/opt/local/bin/perl5.16 -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/maxas.pl
Manifying blib/man3/MaxAs::MaxAs.3pm
Appending installation info to /opt/local/lib/perl5/5.16.3/darwin-thread-multi-2level/perllocal.pod
sed: illegal option -- r
usage: sed script [-Ealn] [-i extension] [file ...]
sed [-Ealn] [-i extension] [-e script] ... [-f script_file] ... [file ...]
building kernel: hgemm_nn_128x128...
make: maxas.pl: No such file or directory
make: *** [nervanagpu/kernels/cubin/hgemm_nn_128x128.cubin] Error 1
David-Laxers-MacBook-Pro:nervanagpu davidlaxer$ sed -r
sed: illegal option -- r
usage: sed script [-Ealn] [-i extension] [file ...]
sed [-Ealn] [-i extension] [-e script] ... [-f script_file] ... [file ...]
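
The usage message above lists -E but not -r, so on OS X the extended-regex flag for sed is -E. As a shell-independent alternative, the same extraction can be sketched in Python. This is only a sketch: it assumes include lines have the form `<INCLUDE file="name"/>` as in the Makefile pattern, and the function name `list_includes` is borrowed from the Makefile macro, not from any existing Python module.

```python
import re

# Same pattern as the sed expression: capture the file attribute of a
# line that starts with an INCLUDE tag.
INCLUDE_RE = re.compile(r'^<INCLUDE file="(.*)"/>')

def list_includes(text):
    """Return the file names referenced by INCLUDE lines in text."""
    names = []
    for line in text.splitlines():
        if INCLUDE_RE.match(line):
            # Like sed's s/.../\1/p: substitute the match with group 1
            # and keep only the lines where a substitution happened.
            names.append(INCLUDE_RE.sub(r"\1", line, count=1))
    return names
```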

'pythON' should be renamed to 'Python'

'pythON' in the project description and README.rst should be renamed to 'Python'.

Could I use branch "MGPUDev"?

I tried to check out this branch and run example/convnet/mnist-small.yaml, but there are indentation errors in the backend source:

WARNING:neon.util.persist:deserializing object from: mnist-small.yaml
2015-06-26 11:26:21,774 WARNING:neon - setting log level to: 20
Traceback (most recent call last):
File "/home/zxx/Install/python2.7.9/bin/neon", line 6, in <module>
exec(compile(open(__file__).read(), __file__, 'exec'))
File "/home/zxx/neon_dev/bin/neon", line 236, in <module>
experiment, result, status = main()
File "/home/zxx/neon_dev/bin/neon", line 198, in main
device_id=args.device_id)
File "/home/zxx/neon_dev/neon/backends/__init__.py", line 98, in gen_backend
from neon.backends.cc2 import GPU
File "/home/zxx/neon_dev/neon/backends/cc2.py", line 1367
def update_fc_bias(self, err, out):
^
IndentationError: unindent does not match any outer indentation level

Does that mean master is the only stable branch we can use?

Can't run mobydick-lstm-small.yaml example on GPU

I have a K40, and neon installed successfully.
I tested the installation by running mnist-small.yaml on the GPU. Here is the output, which suggests the install is fine.

(venv)↪  /home/gys/ocz/NervanaSys/neon/examples/convnet git:(master) ▸
↪  neon -g cudanet mnist-small.yaml
WARNING:neon.util.persist:deserializing object from:  mnist-small.yaml
2015-05-18 14:38:13,823 WARNING:neon - setting log level to: 20
2015-05-18 14:38:14,048 INFO:__init__ - Cudanet backend, RNG seed: None, numerr: None
2015-05-18 14:38:14,049 INFO:mlp - Layers:
        DataLayer d0: 784 nodes
        ConvLayer layer1: 1 x (28 x 28) inputs, 16 x (24 x 24) nodes, Linear act_fn
        PoolingLayer layer2: 16 x (24 x 24) inputs, 16 x (12 x 12) nodes, Linear act_fn
        ConvLayer layer3: 16 x (12 x 12) inputs, 32 x (8 x 8) nodes, Linear act_fn
        PoolingLayer layer4: 32 x (8 x 8) inputs, 32 x (4 x 4) nodes, Linear act_fn
        FCLayer layer5: 512 inputs, 500 nodes, RectLin act_fn
        FCLayer output: 500 inputs, 10 nodes, Logistic act_fn
        CostLayer cost: 10 nodes, CrossEntropy cost_fn

2015-05-18 14:38:14,051 INFO:val_init - Generating UniformValGen values of shape (25, 16)
2015-05-18 14:38:14,052 INFO:val_init - Generating UniformValGen values of shape (400, 32)
2015-05-18 14:38:14,053 INFO:val_init - Generating UniformValGen values of shape (500, 512)
2015-05-18 14:38:14,057 INFO:val_init - Generating UniformValGen values of shape (10, 500)
2015-05-18 14:38:14,059 INFO:mnist - loading: train-images-idx3-ubyte
2015-05-18 14:38:14,106 INFO:mnist - loading: train-labels-idx1-ubyte
2015-05-18 14:38:14,108 INFO:mnist - loading: t10k-images-idx3-ubyte
2015-05-18 14:38:14,117 INFO:mnist - loading: t10k-labels-idx1-ubyte
2015-05-18 14:38:14,127 WARNING:dataset - Incompatible batch size. Discarding 16 samples...
2015-05-18 14:38:14,158 WARNING:dataset - Incompatible batch size. Discarding 112 samples...
2015-05-18 14:38:14,176 WARNING:dataset - Incompatible batch size. Discarding 16 samples...
2015-05-18 14:38:14,179 WARNING:dataset - Incompatible batch size. Discarding 112 samples...
2015-05-18 14:38:14,182 INFO:mlp - commencing model fitting
2015-05-18 14:38:14,329 INFO:mlp - epoch: 0, training error: 2.67295
2015-05-18 14:38:14,477 INFO:mlp - epoch: 1, training error: 0.73958
2015-05-18 14:38:14,625 INFO:mlp - epoch: 2, training error: 0.44819
2015-05-18 14:38:14,773 INFO:mlp - epoch: 3, training error: 0.33237
2015-05-18 14:38:14,921 INFO:mlp - epoch: 4, training error: 0.25886
2015-05-18 14:38:14,967 INFO:fit_predict_err - test set MisclassPercentage_TOP_1 3.65585
2015-05-18 14:38:14,995 INFO:fit_predict_err - train set MisclassPercentage_TOP_1 2.30978

But when I run the LSTM example with

neon -g cudanet mobydick-lstm-small.yaml

the error is

2015-05-18 14:41:31,891 DEBUG:cc2 - Copying to GPU
2015-05-18 14:41:31,949 INFO:mlp - Layers:
        DataLayer d0: 128 nodes
        RecurrentLSTMLayer recurrent: 128 inputs, 64 nodes, Tanh act_fn
        RecurrentOutputLayer output: 64 inputs, 128 nodes, Logistic act_fn
        Layer cost: 128 nodes, CrossEntropy cost_fn, utilizing GPU backend


2015-05-18 14:41:31,949 INFO:rnn - DataLayer d0: 128 nodes
2015-05-18 14:41:31,949 INFO:rnn - RecurrentLSTMLayer recurrent: 128 inputs, 64 nodes, Tanh act_fn
2015-05-18 14:41:31,949 INFO:rnn - RecurrentOutputLayer output: 64 inputs, 128 nodes, Logistic act_fn
2015-05-18 14:41:31,949 INFO:rnn - Layer cost: 128 nodes, CrossEntropy cost_fn, utilizing GPU backend

Traceback (most recent call last):
  File "/ocz/gys/NervanaSys/neon/venv/bin/neon", line 240, in <module>
    experiment, result, status = main()
  File "/ocz/gys/NervanaSys/neon/venv/bin/neon", line 208, in main
    result = experiment.run()
  File "/usr/local/lib/python2.7/dist-packages/neon/experiments/fit_predict_err.py", line 98, in run
    super(FitPredictErrorExperiment, self).run()
  File "/usr/local/lib/python2.7/dist-packages/neon/experiments/fit.py", line 101, in run
    self.model.fit(self.dataset)
  File "/usr/local/lib/python2.7/dist-packages/neon/models/rnn.py", line 117, in fit
    self.grad_checker(numgrad="output")
  File "/usr/local/lib/python2.7/dist-packages/neon/models/rnn.py", line 369, in grad_checker
    num_target=num_target, num_i=num_i, num_j=num_j)
  File "/usr/local/lib/python2.7/dist-packages/neon/models/rnn.py", line 216, in fprop
    num_target[num_i, num_j] = (numpy_target + eps)
  File "/usr/local/lib/python2.7/dist-packages/neon/backends/cc2.py", line 285, in __setitem__
    raise TooSlowToImplementError("arbitrary "
neon.util.error.TooSlowToImplementError: arbitrary indexing

If I run the LSTM example on the CPU, everything works.

Add Docker images

Installation requires several configuration options and dependencies, some of which must be built from separate repos. Encapsulating neon within a Docker image would help newcomers get started, provide an isolated environment, and enable deployment portability.

Support for Regression Problems

That is, problems where the output is real-valued (R^n) and the loss function is something like MSE (mean squared error).
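
To make the request concrete, here is a small pure-Python sketch of an MSE cost and its gradient with respect to the predictions. The function names `mse` and `mse_grad` are illustrative only and are not part of neon.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    n = len(pred)
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / n

def mse_grad(pred, target):
    """Gradient of mse() with respect to each prediction."""
    n = len(pred)
    return [2.0 * (p - t) / n for p, t in zip(pred, target)]
```

A regression output layer would pair a cost like this with a linear (identity) activation rather than a softmax or logistic one.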

cannot run "neon --gpu nervanagpu examples/convnet/i1k-alexnet-fp32.yaml"

I've installed nervanagpu successfully, but when I run "neon --gpu nervanagpu examples/convnet/i1k-alexnet-fp32.yaml" the following error occurs:
dsp@dsp:~/neon$ neon --gpu nervanagpu examples/convnet/i1k-alexnet-fp32.yaml
WARNING:neon.util.persist:deserializing object from: examples/convnet/i1k-alexnet-fp32.yaml
WARNING:neon.datasets.imageset:Imageset initialized with dtype <type 'numpy.float32'>
2015-06-15 22:35:55,300 WARNING:neon - setting log level to: 20
2015-06-15 22:35:55,385 INFO:gpu - Initialized NervanaGPU with stochastic_round=None
2015-06-15 22:35:55,385 INFO:gpu - Seeding random number generator with: None
2015-06-15 22:35:55,386 INFO:__init__ - NervanaGPU backend, RNG seed: None, numerr: None
2015-06-15 22:35:55,386 INFO:mlp - Layers:
ImageDataLayer d0: 3 x (224 x 224) nodes
ConvLayer conv1: 3 x (224 x 224) inputs, 64 x (55 x 55) nodes, RectLin act_fn
PoolingLayer pool1: 64 x (55 x 55) inputs, 64 x (27 x 27) nodes, Linear act_fn
ConvLayer conv2: 64 x (27 x 27) inputs, 192 x (27 x 27) nodes, RectLin act_fn
PoolingLayer pool2: 192 x (27 x 27) inputs, 192 x (13 x 13) nodes, Linear act_fn
ConvLayer conv3: 192 x (13 x 13) inputs, 384 x (13 x 13) nodes, RectLin act_fn
ConvLayer conv4: 384 x (13 x 13) inputs, 256 x (13 x 13) nodes, RectLin act_fn
ConvLayer conv5: 256 x (13 x 13) inputs, 256 x (13 x 13) nodes, RectLin act_fn
PoolingLayer pool3: 256 x (13 x 13) inputs, 256 x (6 x 6) nodes, Linear act_fn
FCLayer fc4096a: 9216 inputs, 4096 nodes, RectLin act_fn
DropOutLayer dropout1: 4096 inputs, 4096 nodes, Linear act_fn
FCLayer fc4096b: 4096 inputs, 4096 nodes, RectLin act_fn
DropOutLayer dropout2: 4096 inputs, 4096 nodes, Linear act_fn
FCLayer fc1000: 4096 inputs, 1000 nodes, Softmax act_fn
CostLayer cost: 1000 nodes, CrossEntropy cost_fn

2015-06-15 22:35:55,386 INFO:batch_norm - BatchNormalization set to train mode
Traceback (most recent call last):
File "/home/dsp/anaconda/bin/neon", line 6, in <module>
exec(compile(open(__file__).read(), __file__, 'exec'))
File "/home/dsp/neon/bin/neon", line 240, in <module>
experiment, result, status = main()
File "/home/dsp/neon/bin/neon", line 207, in main
experiment.initialize(backend)
File "/home/dsp/neon/neon/experiments/fit_predict_err.py", line 62, in initialize
super(FitPredictErrorExperiment, self).initialize(backend)
File "/home/dsp/neon/neon/experiments/fit.py", line 62, in initialize
self.model.initialize(backend)
File "/home/dsp/neon/neon/models/mlp.py", line 61, in initialize
ll.initialize(kwargs)
File "/home/dsp/neon/neon/layers/convolutional.py", line 39, in initialize
super(ConvLayer, self).initialize(kwargs)
File "/home/dsp/neon/neon/layers/layer.py", line 479, in initialize
self.bn.initialize(kwargs)
File "/home/dsp/neon/neon/transforms/batch_norm.py", line 90, in initialize
self._xhat = self.backend.zeros(self.in_shape, dtype=self.dtype)
File "/home/dsp/neon/neon/backends/gpu.py", line 582, in zeros
return self.ng.zeros(shape, dtype=dtype)
File "/home/dsp/anaconda/lib/python2.7/site-packages/nervanagpu/nervanagpu.py", line 483, in zeros
name=name, rounding=self.round_mode)._assign(0)
File "/home/dsp/anaconda/lib/python2.7/site-packages/nervanagpu/nervanagpu.py", line 298, in _assign
drv.memset_d32_async(self.gpudata,
AttributeError: 'module' object has no attribute 'memset_d32_async'

Error when interrupting and resuming training

After about 50 hours of training with 'neon --gpu nervanagpu examples/convnet/i1k-alexnet-fp32.yaml', I had to restart my computer. When I reran the same command to resume, the following error occurred. Here is the log:

ubgpu@ubgpu:~/github/neon/neon$ neon --gpu nervanagpu examples/convnet/i1k-alexnet-fp32.yaml
[sudo] password for ubgpu:
WARNING:neon.util.persist:deserializing object from: examples/convnet/i1k-alexnet-fp32.yaml
WARNING:neon.datasets.imageset:Imageset initialized with dtype <type 'numpy.float32'>
2015-05-18 20:44:17,102 WARNING:neon - setting log level to: 20
2015-05-18 20:44:17,233 INFO:gpu - Initialized NervanaGPU with stochastic_round=None
2015-05-18 20:44:17,233 INFO:gpu - Seeding random number generator with: None
2015-05-18 20:44:17,234 INFO:__init__ - NervanaGPU backend, RNG seed: None, numerr: None
2015-05-18 20:44:17,234 INFO:mlp - Layers:
ImageDataLayer d0: 3 x (224 x 224) nodes
ConvLayer conv1: 3 x (224 x 224) inputs, 64 x (55 x 55) nodes, RectLin act_fn
PoolingLayer pool1: 64 x (55 x 55) inputs, 64 x (27 x 27) nodes, Linear act_fn
ConvLayer conv2: 64 x (27 x 27) inputs, 192 x (27 x 27) nodes, RectLin act_fn
PoolingLayer pool2: 192 x (27 x 27) inputs, 192 x (13 x 13) nodes, Linear act_fn
ConvLayer conv3: 192 x (13 x 13) inputs, 384 x (13 x 13) nodes, RectLin act_fn
ConvLayer conv4: 384 x (13 x 13) inputs, 256 x (13 x 13) nodes, RectLin act_fn
ConvLayer conv5: 256 x (13 x 13) inputs, 256 x (13 x 13) nodes, RectLin act_fn
PoolingLayer pool3: 256 x (13 x 13) inputs, 256 x (6 x 6) nodes, Linear act_fn
FCLayer fc4096a: 9216 inputs, 4096 nodes, RectLin act_fn
DropOutLayer dropout1: 4096 inputs, 4096 nodes, Linear act_fn
FCLayer fc4096b: 4096 inputs, 4096 nodes, RectLin act_fn
DropOutLayer dropout2: 4096 inputs, 4096 nodes, Linear act_fn
FCLayer fc1000: 4096 inputs, 1000 nodes, Softmax act_fn
CostLayer cost: 1000 nodes, CrossEntropy cost_fn

2015-05-18 20:44:17,234 INFO:batch_norm - BatchNormalization set to train mode
2015-05-18 20:44:17,236 INFO:val_init - Generating AutoUniformValGen values of shape (363, 64)
2015-05-18 20:44:17,237 INFO:batch_norm - BatchNormalization set to train mode
2015-05-18 20:44:17,238 INFO:val_init - Generating AutoUniformValGen values of shape (1600, 192)
2015-05-18 20:44:17,244 INFO:batch_norm - BatchNormalization set to train mode
2015-05-18 20:44:17,245 INFO:val_init - Generating AutoUniformValGen values of shape (1728, 384)
2015-05-18 20:44:17,255 INFO:batch_norm - BatchNormalization set to train mode
2015-05-18 20:44:17,256 INFO:val_init - Generating AutoUniformValGen values of shape (3456, 256)
2015-05-18 20:44:17,269 INFO:batch_norm - BatchNormalization set to train mode
2015-05-18 20:44:17,270 INFO:val_init - Generating AutoUniformValGen values of shape (2304, 256)
2015-05-18 20:44:17,279 INFO:batch_norm - BatchNormalization set to train mode
2015-05-18 20:44:17,280 INFO:val_init - Generating AutoUniformValGen values of shape (4096, 9216)
2015-05-18 20:44:17,748 INFO:batch_norm - BatchNormalization set to train mode
2015-05-18 20:44:17,749 INFO:val_init - Generating AutoUniformValGen values of shape (4096, 4096)
2015-05-18 20:44:17,959 INFO:val_init - Generating AutoUniformValGen values of shape (1000, 4096)
2015-05-18 20:44:18,016 INFO:fit - Unable to find saved model /home/ubgpu/data/I1K/I1K_alexnet_fp32_model.prm, starting over
2015-05-18 20:44:18,017 INFO:mlp - commencing model fitting
Traceback (most recent call last):
File "/usr/local/bin/neon", line 199, in <module>
experiment, result, status = main()
File "/usr/local/bin/neon", line 168, in main
result = experiment.run()
File "/usr/local/lib/python2.7/dist-packages/neon/experiments/fit_predict_err.py", line 97, in run
super(FitPredictErrorExperiment, self).run()
File "/usr/local/lib/python2.7/dist-packages/neon/experiments/fit.py", line 99, in run
self.model.fit(self.dataset)
File "/usr/local/lib/python2.7/dist-packages/neon/models/mlp.py", line 141, in fit
self.fprop()
File "/usr/local/lib/python2.7/dist-packages/neon/models/mlp.py", line 81, in fprop
ll.fprop(y)
File "/usr/local/lib/python2.7/dist-packages/neon/layers/layer.py", line 373, in fprop
self.batch_idx)
File "/usr/local/lib/python2.7/dist-packages/neon/datasets/imageset.py", line 314, in get_mini_batch
self.backend.subtract(self.inp_be, self.mean_be, self.inp_be)
File "/usr/local/lib/python2.7/dist-packages/neon/backends/gpu.py", line 643, in subtract
self.ng.subtract(left, right, out=out)
File "/usr/local/lib/python2.7/dist-packages/nervanagpu/nervanagpu.py", line 801, in subtract
def subtract (self, a, b, out=None): return OpTreeNode.build("sub", a, b, out=out)
File "/usr/local/lib/python2.7/dist-packages/nervanagpu/nervanagpu.py", line 915, in build
return OpTreeNode({ "op" : "assign" }, out, node).execute()
File "/usr/local/lib/python2.7/dist-packages/nervanagpu/nervanagpu.py", line 924, in execute
return call_compound_kernel(_get_rand_state(), _stack)
File "/usr/local/lib/python2.7/dist-packages/nervanagpu/float_ew.py", line 835, in call_compound_kernel
kernel = _get_compound_kernel(tuple(type_args))
File "<string>", line 2, in _get_compound_kernel
File "/usr/local/lib/python2.7/dist-packages/pycuda/tools.py", line 423, in context_dependent_memoize
result = func(*args)
File "/usr/local/lib/python2.7/dist-packages/nervanagpu/float_ew.py", line 670, in _get_compound_kernel
module = _get_module(template, template_vals)
File "/usr/local/lib/python2.7/dist-packages/nervanagpu/float_ew.py", line 313, in _get_module
return SourceModule(code, options=["--use_fast_math" ], keep=False) #,"-G"
File "/usr/local/lib/python2.7/dist-packages/pycuda/compiler.py", line 251, in __init__
arch, code, cache_dir, include_dirs)
File "/usr/local/lib/python2.7/dist-packages/pycuda/compiler.py", line 241, in compile
return compile_plain(source, options, keep, nvcc, cache_dir)
File "/usr/local/lib/python2.7/dist-packages/pycuda/compiler.py", line 73, in compile_plain
checksum.update(preprocess_source(source, options, nvcc).encode("utf-8"))
File "/usr/local/lib/python2.7/dist-packages/pycuda/compiler.py", line 47, in preprocess_source
result, stdout, stderr = call_capture_output(cmdline, error_on_nonzero=False)
File "/usr/local/lib/python2.7/dist-packages/pytools/prefork.py", line 197, in call_capture_output
return forker[0].call_capture_output(cmdline, cwd, error_on_nonzero)
File "/usr/local/lib/python2.7/dist-packages/pytools/prefork.py", line 54, in call_capture_output
% ( " ".join(cmdline), e))
pytools.prefork.ExecError: error invoking 'nvcc --preprocess --use_fast_math -arch sm_52 -I/usr/local/lib/python2.7/dist-packages/pycuda/cuda /tmp/tmpltSwQ9.cu --compiler-options -P': [Errno 2] No such file or directory

make install failed!

ubgpu@ubgpu:~/github/neon$ . .venv/bin/activate
(.venv)ubgpu@ubgpu:~/github/neon$ sudo make install
[sudo] password for ubgpu:
The directory '/home/ubgpu/.cache/pip/log' or its parent directory is not owned by the current user and the debug log has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/ubgpu/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/ubgpu/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Requirement already satisfied (use --upgrade to upgrade): numpy>=1.8.1 in /usr/lib/python2.7/dist-packages
Requirement already satisfied (use --upgrade to upgrade): PyYAML>=3.11 in /usr/local/lib/python2.7/dist-packages
The directory '/home/ubgpu/.cache/pip/log' or its parent directory is not owned by the current user and the debug log has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/ubgpu/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/ubgpu/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Requirement already satisfied (use --upgrade to upgrade): nose>=1.3.0 in /usr/lib/python2.7/dist-packages
Collecting Pillow>=2.5.0
/usr/local/lib/python2.7/dist-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:79: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
InsecurePlatformWarning
Downloading Pillow-2.8.1.tar.gz (9.0MB)
100% |████████████████████████████████| 9.0MB 70kB/s
Collecting flake8>=2.2.2
Downloading flake8-2.4.0-py2.py3-none-any.whl
Collecting pep8-naming>=0.2.2
Downloading pep8_naming-0.2.2-py2.py3-none-any.whl
Requirement already satisfied (use --upgrade to upgrade): sphinx>=1.2.2 in /usr/lib/python2.7/dist-packages
Collecting sphinxcontrib-napoleon>=0.2.8
Downloading sphinxcontrib_napoleon-0.3.4-py2.py3-none-any.whl (50kB)
100% |████████████████████████████████| 53kB 276kB/s
Collecting scikit-learn>=0.15.2
Downloading scikit-learn-0.16.1.tar.gz (7.3MB)
100% |████████████████████████████████| 7.3MB 81kB/s
Collecting matplotlib>=1.4.0
Downloading matplotlib-1.4.3.tar.gz (50.4MB)
100% |████████████████████████████████| 50.4MB 14kB/s
Collecting imgworker>=0.2.5 from git+https://github.com/NervanaSystems/imgworker.git#egg=imgworker>=0.2.5
Cloning https://github.com/NervanaSystems/imgworker.git to /tmp/pip-build-b5tfve/imgworker
Collecting cudanet>=0.2.5 from git+https://github.com/NervanaSystems/cuda-convnet2.git#egg=cudanet>=0.2.5
Cloning https://github.com/NervanaSystems/cuda-convnet2.git to /tmp/pip-build-b5tfve/cudanet
Collecting pycuda>=2014.1
/usr/local/lib/python2.7/dist-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:79: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
InsecurePlatformWarning
Downloading pycuda-2014.1.tar.gz (1.6MB)
100% |████████████████████████████████| 1.6MB 117kB/s
Complete output from command python setup.py egg_info:
*** WARNING: nvcc not in path.
*************************************************************
*** I have detected that you have not run configure.py.
*************************************************************
*** Additionally, no global config files were found.
*** I will go ahead with the default configuration.
*** In all likelihood, this will not work out.
***
*** See README_SETUP.txt for more information.
***
*** If the build does fail, just re-run configure.py with the
*** correct arguments, and then retry. Good luck!
*************************************************************
*** HIT Ctrl-C NOW IF THIS IS NOT WHAT YOU WANT
*************************************************************
Continuing in 1 seconds...
Traceback (most recent call last):
File "<string>", line 20, in <module>
File "/tmp/pip-build-b5tfve/pycuda/setup.py", line 216, in <module>
main()
File "/tmp/pip-build-b5tfve/pycuda/setup.py", line 88, in main
conf["CUDA_INC_DIR"] = [join(conf["CUDA_ROOT"], "include")]
File "/usr/lib/python2.7/posixpath.py", line 77, in join
elif path == '' or path.endswith('/'):
AttributeError: 'NoneType' object has no attribute 'endswith'

----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-b5tfve/pycuda

make: *** [deps_install] Error 1
(.venv)ubgpu@ubgpu:~/github/neon$

simple GUI weight matrix viewer. Useful for images in particular.

Add CIFAR100 dataset

It would be good to be able to load both the coarse and fine labels.
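
As a starting point, here is a sketch of reading both label sets from the Python-format CIFAR-100 batch files, which store them under the keys `fine_labels` and `coarse_labels`. It is written for Python 3 (where the pickled keys come back as bytes) and is not an existing neon loader; the function name `load_cifar100_labels` and the `path` argument are assumptions.

```python
import pickle

def load_cifar100_labels(path):
    """Return (fine_labels, coarse_labels) from one CIFAR-100 batch file.

    path is assumed to point at an unpacked 'train' or 'test' file from
    the Python version of the dataset.
    """
    with open(path, "rb") as f:
        # The files were pickled under Python 2, so decode keys as bytes.
        batch = pickle.load(f, encoding="bytes")
    return batch[b"fine_labels"], batch[b"coarse_labels"]
```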

Change GPU check

I've been making Docker images for deep learning libraries, with CPU-only and CUDA-enabled versions. I've successfully made the base version of neon, but my CUDA version doesn't get built properly. The issue in the Makefile is that nvidia-smi is used to check that a CUDA-enabled GPU is available. nvidia-smi isn't available in Docker build environments, yet it has been possible to build other libraries there with just the installed SDK. If a GPU isn't needed for the build, is it possible to change this test to check for something that is available, e.g. nvcc --version?

Outdated neon version on demo box

The version of neon on the demo box needs to be updated to the most recent version.

Support RMSProp

Hi, would you please support RMSProp as a learning rule?
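
For reference, the RMSProp update keeps a running average of squared gradients and divides each step by its square root. A minimal pure-Python sketch follows; the function name `rmsprop_step` and the default hyperparameters are assumptions for illustration and say nothing about how neon would expose the rule.

```python
import math

def rmsprop_step(params, grads, cache, lr=0.001, decay=0.9, eps=1e-8):
    """Apply one in-place RMSProp update.

    params, grads and cache are equal-length lists of floats; cache holds
    the running average of squared gradients (start it at all zeros).
    """
    for i, g in enumerate(grads):
        # Exponential moving average of the squared gradient.
        cache[i] = decay * cache[i] + (1.0 - decay) * g * g
        # Scale the step by the root of that average.
        params[i] -= lr * g / (math.sqrt(cache[i]) + eps)
    return params, cache
```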

Data augmentation for images

Add image transformations (rotation, cropping, scaling etc.) to the ImageSet loading pipeline.
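
A rough sketch of two of these transformations (random cropping and horizontal flipping) on an image stored as a list of pixel rows. It is independent of ImageSet; the names `random_crop`, `horizontal_flip` and `augment` are hypothetical, and rotation and scaling are left out.

```python
import random

def random_crop(img, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position out of img."""
    h, w = len(img), len(img[0])
    top = rng.randint(0, h - crop_h)    # randint is inclusive on both ends
    left = rng.randint(0, w - crop_w)
    return [row[left:left + crop_w] for row in img[top:top + crop_h]]

def horizontal_flip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def augment(img, crop_h, crop_w, rng, flip_prob=0.5):
    """Random crop, followed by a flip with probability flip_prob."""
    out = random_crop(img, crop_h, crop_w, rng)
    if rng.random() < flip_prob:
        out = horizontal_flip(out)
    return out
```

Passing a seeded `random.Random` instance as `rng` keeps the augmentation reproducible across runs.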

cannot run "neon --gpu cudanet examples/convnet/i1k-alexnet-fp32.yaml" on tegra k1

root@tegra-ubuntu:/home/hsl/neon# neon --gpu cudanet examples/convnet/i1k-alexnet-fp32.yaml
WARNING:neon.util.persist:deserializing object from: examples/convnet/i1k-alexnet-fp32.yaml
WARNING:neon.datasets.imageset:Imageset initialized with dtype <type 'numpy.float32'>
2015-06-19 12:47:19,332 WARNING:neon - setting log level to: 20
Traceback (most recent call last):
File "/usr/local/bin/neon", line 240, in <module>
experiment, result, status = main()
File "/usr/local/bin/neon", line 202, in main
device_id=args.device_id)
File "/usr/local/lib/python2.7/dist-packages/neon/backends/__init__.py", line 157, in gen_backend
raise RuntimeError("Can't find CUDA capable GPU")
RuntimeError: Can't find CUDA capable GPU

I am using a Tegra K1 GPU.

Support differing batch sizes at train and test time

Because intermediate matrices (such as pre-activations) are pre-allocated, the batch size used at training time is automatically expected at test time. This creates problems when the two should differ (e.g. full-batch training without mini-batches, where the batch size equals the number of records).
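
Until the buffers can be resized, one possible workaround at test time is to pad the final partial batch up to the fixed size and drop the padded outputs. The sketch below is generic Python, not neon code: `predict_batch` is a hypothetical callable that only accepts exactly `batch_size` records and returns one output per record.

```python
def predict_any_size(predict_batch, records, batch_size, pad_value=None):
    """Run a fixed-batch-size predictor over any number of records.

    predict_batch is a hypothetical callable; records is a list. A short
    final chunk is padded (with pad_value, or its own last record) and
    the outputs for the padding are discarded.
    """
    outputs = []
    for start in range(0, len(records), batch_size):
        chunk = records[start:start + batch_size]
        n_real = len(chunk)
        if n_real < batch_size:
            filler = chunk[-1] if pad_value is None else pad_value
            chunk = chunk + [filler] * (batch_size - n_real)
        # Keep only the outputs that belong to real records.
        outputs.extend(predict_batch(chunk)[:n_real])
    return outputs
```

This avoids discarding samples, but it does not help when training itself needs a different batch size.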

Add GPU support for RNN/LSTM models

How to use my own dataset

I would like to use my own text data to train a model. I have read the information here http://neon.nervanasys.com/docs/latest/datasets.html

(1) Unfortunately, I haven't understood how to transform text data into the required format. It seems that datasets are expected to be available via a URL, but my data is on my local disk.
(2) What is the required data format, especially for text data?

Any help is more than welcome. Thanks.
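
Not an answer about neon's dataset classes, but as a general illustration of one common text encoding: each character becomes a one-hot vector. The 128-entry width below is an assumption that happens to match the 128-node data layer in the mobydick log elsewhere on this page; whether neon expects exactly this layout is not confirmed. The function names are hypothetical.

```python
def one_hot_chars(text, vocab_size=128):
    """Encode text as a list of one-hot vectors, one per character.

    Characters whose code point is >= vocab_size are skipped (an
    assumption made to keep the sketch ASCII-only).
    """
    vectors = []
    for ch in text:
        code = ord(ch)
        if code >= vocab_size:
            continue
        vec = [0] * vocab_size
        vec[code] = 1
        vectors.append(vec)
    return vectors

def load_local_text(path):
    """Read a text file from local disk and one-hot encode it."""
    with open(path, "r", encoding="utf-8") as f:
        return one_hot_chars(f.read())
```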

Add example of deep network in neon without using yaml (direct python)

`sudo make install` doesn't work on Ubuntu with GPU enabled

On Ubuntu, sudo resets $PATH, so even if you have CUDA installed and /usr/local/cuda/bin in your PATH, the Makefile can't find nvcc.
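
A commonly used workaround is to pass the caller's PATH through explicitly, for example `sudo env "PATH=$PATH" make install`. Independently of that, a build check could look for nvcc in a fallback location when PATH has been reset. Here is a sketch of that idea in Python; `find_nvcc` and `extra_dirs` are hypothetical names, and /usr/local/cuda/bin is simply the directory mentioned above.

```python
import os
import shutil

def find_nvcc(extra_dirs=("/usr/local/cuda/bin",)):
    """Locate nvcc on PATH, falling back to extra_dirs.

    Returns the full path to the executable, or None if it is not found.
    """
    found = shutil.which("nvcc")
    if found:
        return found
    # PATH may have been reset (e.g. by sudo), so search the fallbacks.
    return shutil.which("nvcc", path=os.pathsep.join(extra_dirs))
```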

Investigate and incorporate autodiff tool

Placeholder for supporting automatic differentiation

Support model serialization and saving predictions for RNNs and LSTMs

Attempts to save predictions (predictions: ['train', 'test']) or model parameters (serialized_path: "my_model.prm") on either the RNN or LSTM example networks result in a ValueError being raised instead of successful saving:

Traceback (most recent call last):
  File "bin/neon", line 240, in <module>
    experiment, result, status = main()
  File "bin/neon", line 208, in main
    result = experiment.run()
  File "/home/users/scott/repo/neon/neon/experiments/fit_predict_err.py", line 114, in run
    pred_set)
  File "/home/users/scott/repo/neon/neon/models/mlp.py", line 241, in predict_fullset
    reference[:, start:end] = self.cost_layer.get_reference()
  File "/home/users/scott/repo/neon/neon/backends/cpu.py", line 157, in __setitem__
    self._tensor[clean_key] = np.reshape(self._clean(value), req_shape)
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 217, in reshape
    return _wrapit(a, 'reshape', newshape, order=order)
  File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 43, in _wrapit
    result = getattr(asarray(obj), method)(*args, **kwds)
ValueError: total size of new array must be unchanged

Can we run two processes on single GPU?

While I am running the second phase (training) of
neon --gpu nervanagpu examples/convnet/i1k-alexnet-fp32.yaml
I can start the first phase of
neon --gpu nervanagpu examples/convnet/i1k-alexnet-fp16.yaml
and it works fine.

However, while the second phase of
neon --gpu nervanagpu examples/convnet/i1k-alexnet-fp32.yaml
is still ongoing, if I launch the second phase of
neon --gpu nervanagpu examples/convnet/i1k-alexnet-fp16.yaml
it reports this error:

ubgpu@ubgpu:~/github/neon/neon$ neon --gpu nervanagpu examples/convnet/i1k-alexnet-fp16.yaml
WARNING:neon.util.persist:deserializing object from: examples/convnet/i1k-alexnet-fp16.yaml
WARNING:neon.datasets.imageset:Imageset initialized with dtype <type 'numpy.float16'>
2015-05-17 14:50:39,937 WARNING:neon - setting log level to: 20
2015-05-17 14:50:40,856 INFO:gpu - Initialized NervanaGPU with stochastic_round=None
2015-05-17 14:50:40,856 INFO:gpu - Seeding random number generator with: None
2015-05-17 14:50:40,857 INFO:__init__ - NervanaGPU backend, RNG seed: None, numerr: None
2015-05-17 14:50:40,858 INFO:mlp - Layers:
ImageDataLayer d0: 3 x (224 x 224) nodes
ConvLayer conv1: 3 x (224 x 224) inputs, 64 x (55 x 55) nodes, RectLin act_fn
PoolingLayer pool1: 64 x (55 x 55) inputs, 64 x (27 x 27) nodes, Linear act_fn
ConvLayer conv2: 64 x (27 x 27) inputs, 192 x (27 x 27) nodes, RectLin act_fn
PoolingLayer pool2: 192 x (27 x 27) inputs, 192 x (13 x 13) nodes, Linear act_fn
ConvLayer conv3: 192 x (13 x 13) inputs, 384 x (13 x 13) nodes, RectLin act_fn
ConvLayer conv4: 384 x (13 x 13) inputs, 256 x (13 x 13) nodes, RectLin act_fn
ConvLayer conv5: 256 x (13 x 13) inputs, 256 x (13 x 13) nodes, RectLin act_fn
PoolingLayer pool3: 256 x (13 x 13) inputs, 256 x (6 x 6) nodes, Linear act_fn
FCLayer fc4096a: 9216 inputs, 4096 nodes, RectLin act_fn
DropOutLayer dropout1: 4096 inputs, 4096 nodes, Linear act_fn
FCLayer fc4096b: 4096 inputs, 4096 nodes, RectLin act_fn
DropOutLayer dropout2: 4096 inputs, 4096 nodes, Linear act_fn
FCLayer fc1000: 4096 inputs, 1000 nodes, Softmax act_fn
CostLayer cost: 1000 nodes, CrossEntropy cost_fn

2015-05-17 14:50:40,858 INFO:batch_norm - BatchNormalization set to train mode
2015-05-17 14:50:40,860 INFO:val_init - Generating AutoUniformValGen values of shape (363, 64)
2015-05-17 14:50:40,862 INFO:batch_norm - BatchNormalization set to train mode
2015-05-17 14:50:40,863 INFO:val_init - Generating AutoUniformValGen values of shape (1600, 192)
2015-05-17 14:50:40,870 INFO:batch_norm - BatchNormalization set to train mode
2015-05-17 14:50:40,871 INFO:val_init - Generating AutoUniformValGen values of shape (1728, 384)
2015-05-17 14:50:40,888 INFO:batch_norm - BatchNormalization set to train mode
2015-05-17 14:50:40,889 INFO:val_init - Generating AutoUniformValGen values of shape (3456, 256)
2015-05-17 14:50:40,914 INFO:batch_norm - BatchNormalization set to train mode
2015-05-17 14:50:40,915 INFO:val_init - Generating AutoUniformValGen values of shape (2304, 256)
2015-05-17 14:50:40,931 INFO:batch_norm - BatchNormalization set to train mode
2015-05-17 14:50:40,932 INFO:val_init - Generating AutoUniformValGen values of shape (4096, 9216)
2015-05-17 14:50:42,483 INFO:batch_norm - BatchNormalization set to train mode
2015-05-17 14:50:42,484 INFO:val_init - Generating AutoUniformValGen values of shape (4096, 4096)
2015-05-17 14:50:43,188 INFO:val_init - Generating AutoUniformValGen values of shape (1000, 4096)
2015-05-17 14:50:43,391 INFO:fit - Unable to find saved model /home/ubgpu/data/I1K/I1K_alexnet_fp16_model.prm, starting over
2015-05-17 14:50:43,393 INFO:mlp - commencing model fitting
Traceback (most recent call last):
File "/usr/local/bin/neon", line 199, in <module>
experiment, result, status = main()
File "/usr/local/bin/neon", line 168, in main
result = experiment.run()
File "/usr/local/lib/python2.7/dist-packages/neon/experiments/fit_predict_err.py", line 97, in run
super(FitPredictErrorExperiment, self).run()
File "/usr/local/lib/python2.7/dist-packages/neon/experiments/fit.py", line 99, in run
self.model.fit(self.dataset)
File "/usr/local/lib/python2.7/dist-packages/neon/models/mlp.py", line 141, in fit
self.fprop()
File "/usr/local/lib/python2.7/dist-packages/neon/models/mlp.py", line 81, in fprop
ll.fprop(y)
File "/usr/local/lib/python2.7/dist-packages/neon/layers/layer.py", line 373, in fprop
self.batch_idx)
File "/usr/local/lib/python2.7/dist-packages/neon/datasets/imageset.py", line 314, in get_mini_batch
self.backend.subtract(self.inp_be, self.mean_be, self.inp_be)
File "/usr/local/lib/python2.7/dist-packages/neon/backends/gpu.py", line 643, in subtract
self.ng.subtract(left, right, out=out)
File "/usr/local/lib/python2.7/dist-packages/nervanagpu/nervanagpu.py", line 801, in subtract
def subtract (self, a, b, out=None): return OpTreeNode.build("sub", a, b, out=out)
File "/usr/local/lib/python2.7/dist-packages/nervanagpu/nervanagpu.py", line 915, in build
return OpTreeNode({ "op" : "assign" }, out, node).execute()
File "/usr/local/lib/python2.7/dist-packages/nervanagpu/nervanagpu.py", line 924, in execute
return call_compound_kernel(_get_rand_state(), _stack)
File "/usr/local/lib/python2.7/dist-packages/nervanagpu/float_ew.py", line 835, in call_compound_kernel
kernel = _get_compound_kernel(tuple(type_args))
File "", line 2, in _get_compound_kernel
File "/usr/local/lib/python2.7/dist-packages/pycuda/tools.py", line 423, in context_dependent_memoize
result = func(*args)
File "/usr/local/lib/python2.7/dist-packages/nervanagpu/float_ew.py", line 670, in _get_compound_kernel
module = _get_module(template, template_vals)
File "/usr/local/lib/python2.7/dist-packages/nervanagpu/float_ew.py", line 313, in _get_module
return SourceModule(code, options=["--use_fast_math" ], keep=False) #,"-G"
File "/usr/local/lib/python2.7/dist-packages/pycuda/compiler.py", line 251, in __init__
arch, code, cache_dir, include_dirs)
File "/usr/local/lib/python2.7/dist-packages/pycuda/compiler.py", line 241, in compile
return compile_plain(source, options, keep, nvcc, cache_dir)
File "/usr/local/lib/python2.7/dist-packages/pycuda/compiler.py", line 73, in compile_plain
checksum.update(preprocess_source(source, options, nvcc).encode("utf-8"))
File "/usr/local/lib/python2.7/dist-packages/pycuda/compiler.py", line 47, in preprocess_source
result, stdout, stderr = call_capture_output(cmdline, error_on_nonzero=False)
File "/usr/local/lib/python2.7/dist-packages/pytools/prefork.py", line 197, in call_capture_output
return forker[0].call_capture_output(cmdline, cwd, error_on_nonzero)
File "/usr/local/lib/python2.7/dist-packages/pytools/prefork.py", line 54, in call_capture_output
% ( " ".join(cmdline), e))
pytools.prefork.ExecError: error invoking 'nvcc --preprocess --use_fast_math -arch sm_52 -I/usr/local/lib/python2.7/dist-packages/pycuda/cuda /tmp/tmpIRAwNd.cu --compiler-options -P': [Errno 2] No such file or directory
ubgpu@ubgpu:~/github/neon/neon$

cannot run "neon --gpu nervanagpu examples/convnet/i1k-alexnet-fp32.yaml"

ubgpu@ubgpu:/github/neon$ neon --gpu nervanagpu examples/convnet/i1k-alexnet-fp32.yaml
WARNING:neon.util.persist:deserializing object from: examples/convnet/i1k-alexnet-fp32.yaml
WARNING:neon.datasets.imageset:Imageset initialized with dtype <type 'numpy.float32'>
2015-05-11 01:13:02,940 WARNING:neon - setting log level to: 20
2015-05-11 01:13:02,945 WARNING:init - nervanagpu not found, can't run via GPU
Traceback (most recent call last):
File "/usr/local/bin/neon", line 199, in <module>
experiment, result, status = main()
File "/usr/local/bin/neon", line 162, in main
device_id=args.device_id)
File "/usr/local/lib/python2.7/dist-packages/neon/backends/__init__.py", line 157, in gen_backend
raise RuntimeError("Can't find CUDA capable GPU")
RuntimeError: Can't find CUDA capable GPU
ubgpu@ubgpu:/github/neon$ which nvcc
/usr/local/cuda/bin/nvcc
ubgpu@ubgpu:~/github/neon$

My GPU is a GTX 970!
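
The `nervanagpu not found` warning in the log points at a failed Python import, which `which nvcc` does not exercise. A small sketch for checking the imports directly; the helper name `check_imports` is hypothetical, and `pycuda` and `nervanagpu` are simply the module names that appear in tracebacks on this page:

```python
import importlib


def check_imports(module_names=("pycuda", "nervanagpu")):
    """Try to import each module and report the outcome per name.

    Hypothetical diagnostic helper; not part of neon.
    """
    report = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            report[name] = "ok"
        except Exception as err:  # ImportError, or a failure raised during import
            report[name] = "import failed: %r" % (err,)
    return report


if __name__ == "__main__":
    for name, status in sorted(check_imports().items()):
        print("%s: %s" % (name, status))
```

Running this with the same interpreter that `/usr/local/bin/neon` uses shows whether the GPU modules are visible to that Python installation.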

testing issue integration

I want a banana!

Define parallelism in YAML

Is there a way to define the parallelism in the YAML file? It could be useful, for example, when optimizing hyperparameters across all the GPUs in a machine. (I am not sure whether running hyperopt -n THREADS will spread the tasks across the GPUs.)

The batch_size cannot be set to 1?

I only modified the file neon/examples/convnet/mnist-small.yaml, setting batch_size: &bs 1. The log reports:

    ValueError: shapes (400,1) and (32,1) not aligned: 1 (dim 1) != 32 (dim 0)

Refactoring Ideas

We're starting to embark upon a fairly substantial refactoring of the neon codebase in an effort to make it easier to use, and clean out some of the accumulated cruft.

We'd like to use this issue to get feedback from the broader community, and to give a heads-up that changes are on their way. What do you like or dislike? Some of the things we're initially considering include:

Generic global backend and tensor with op-tree support, reusable buffers
Reworking Layers to separate out state and support deferred initialization
Dataset revamping: simplify loading, reading tabular data, DataWorker integration
Replace YAML with JSON? Make it easier to dump model specifications back out from loaded models

A tiny bug?

2015-05-25 01:59:04,478 INFO:mlp - commencing model fitting
2015-05-25 01:59:52,257 INFO:mlp - 0.0 training error: 7063.82959
2015-05-25 01:59:59,646 INFO:mlp - 0.1 training error: 6882.83252---!!!!!!
2015-05-25 02:00:07,203 INFO:mlp - 0.2 training error: 6941.65332
2015-05-25 02:00:14,705 INFO:mlp - 0.3 training error: 6853.07715
2015-05-25 02:00:22,345 INFO:mlp - 0.4 training error: 6767.80225
2015-05-25 02:00:29,725 INFO:mlp - 0.5 training error: 6804.38818
2015-05-25 02:00:40,553 INFO:mlp - 0.6 training error: 6707.24219
2015-05-25 02:00:54,380 INFO:mlp - 0.7 training error: 6803.98682
2015-05-25 02:01:07,099 INFO:mlp - 0.8 training error: 6675.78418
2015-05-25 02:01:20,458 INFO:mlp - 0.9 training error: 6502.79785
2015-05-25 02:01:33,850 INFO:mlp - 0.10 training error: 6715.76221----!!!!!!
2015-05-25 02:01:47,022 INFO:mlp - 0.11 training error: 6435.54492
2015-05-25 02:01:59,376 INFO:mlp - 0.12 training error: 6411.22900

As we can see, the log prints '0.1 training error' near the beginning and '0.10 training error' later, which is ambiguous.

The early line should read '0.01 training error'.
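
One way to get unambiguous labels is to zero-pad the batch counter to a fixed width. A minimal sketch, assuming the label is built from an epoch index and a batch index; `progress_label` and its parameters are hypothetical and not neon's actual logging code:

```python
def progress_label(epoch, batch, num_batches):
    """Return an 'epoch.batch' label with the batch index zero-padded.

    Hypothetical helper for illustration; not part of neon.
    """
    # Pad to the number of digits in the largest batch index.
    width = len(str(max(num_batches - 1, 0)))
    return "{}.{:0{w}d}".format(epoch, batch, w=width)


# With 13 batches per epoch the indices run from 0 to 12, so the width is 2.
print([progress_label(0, b, 13) for b in (1, 10, 12)])
# ['0.01', '0.10', '0.12']
```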

C Error when installing on Windows 8.1 & MinGW

I am on Windows 8.1. When I do

pip install .

make install

using MinGW for compilation, I get this error:

neon/backends/flexpt_dtype.c:410:5: error: initializer element is not constant

     PyObject_HEAD_INIT(&PyType_Type)

     ^

neon/backends/flexpt_dtype.c:410:5: error: (near initialization for 'PyFlexPt_Type.ob_type')

error: command 'd:\\MinGW\\bin\\gcc.exe' failed with exit status 1

I tried it with both Python 2.7 and 3.4 and keep getting the same error.

Question with "Adding a new type of Dataset"

I am thinking of adding a new type of Dataset.

I followed the steps described at the following link:

http://neon.nervanasys.com/docs/latest/datasets.html#adding-a-new-type-of-dataset

I also wrote a corresponding YAML file, nb.yaml.

After running neon nb.yaml, I get:

AttributeError: 'module' object has no attribute 'NB' (my subclass is named NB)

What should I do next?

Add sentiment analysis example to neon

email from a user:

I am trying to get involved with text classification. I would like to start with the classic movie-review example, since many examples already exist that use different software to illustrate and solve the problem.

Vowpal Wabbit, scikit-learn, Stanford NLP, etc.

https://www.kaggle.com/c/sentiment-analysis-on-movie-reviews

I would also like to classify the typical sentiment140 dataset, which contains tweet text that can sometimes be noisy.

http://help.sentiment140.com/for-students/

Refactor object initialization to reduce dependencies

Several objects now have separate initialization() methods that need to be called in a specific order after initial yaml parsing and object construction.

We should re-think how this is done to reduce the amount of coupling now present.

Multiple GPU usage

Does NEON support data parallelism across multiple GPUs?
I want to use multiple GPUs in order to increase the mini-batch size.

My second question: can MPI parallelization be used with the --gpu option?

No CUDA capable GPU installed. Forcing GPU=0

I got this error when I ran 'make install'. I have a Titan X that is working fine. Below is my setup.cfg. Any help is much appreciated.

sudo make install
No CUDA capable GPU installed. Forcing GPU=0

[neon]
CPU = 1
GPU = nervanagpu
DIST = 0
DEV = 0

Create validation test for datasets

Having a way to create dataset splits (e.g., a validation split from the training set) in the YAML would be useful for reporting validation metrics, and for avoiding over-fitting to the test set when using Spearmint.
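
The kind of split being requested can be sketched with NumPy outside of any YAML support. This assumes records are stored along the first axis; `split_train_validation` and its parameters are hypothetical names, not an existing neon API:

```python
import numpy as np


def split_train_validation(inputs, targets, validation_fraction=0.1, seed=0):
    """Hold out a random fraction of the training records for validation.

    Hypothetical helper; assumes axis 0 indexes records.
    """
    num_records = inputs.shape[0]
    rng = np.random.RandomState(seed)
    order = rng.permutation(num_records)
    num_val = int(round(num_records * validation_fraction))
    val_idx, train_idx = order[:num_val], order[num_val:]
    train = (inputs[train_idx], targets[train_idx])
    validation = (inputs[val_idx], targets[val_idx])
    return train, validation


inputs = np.arange(40).reshape(20, 2)
targets = np.arange(20)
(train_x, train_y), (val_x, val_y) = split_train_validation(inputs, targets)
print(train_x.shape, val_x.shape)
```

Fixing the seed keeps the split identical across hyperparameter runs, so validation scores stay comparable.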

Avoid computing the derivative of the activation function during inference.

The fprop() function computes the derivative of the activation function. This is unnecessary during inference.
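
A minimal sketch of the idea, using a toy layer rather than neon's actual Layer classes; the `inference` flag and the `RectLinLayer` class are assumptions made for illustration:

```python
import numpy as np


class RectLinLayer(object):
    """Toy fully connected layer with a rectified-linear activation."""

    def __init__(self, weights):
        self.weights = weights
        self.output = None
        self.deriv = None  # only populated when training

    def fprop(self, inputs, inference=False):
        pre_act = np.dot(self.weights, inputs)
        self.output = np.maximum(pre_act, 0.0)
        if inference:
            # No backward pass will follow, so skip the derivative.
            self.deriv = None
        else:
            self.deriv = (pre_act > 0).astype(pre_act.dtype)
        return self.output


layer = RectLinLayer(np.array([[1.0, -1.0]]))
out = layer.fprop(np.array([[2.0], [0.5]]), inference=True)
print(out, layer.deriv)
```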

Deal with remainder of minibatch and macrobatch

Any dataset whose number of records is not an exact multiple of batch_size ends up having its last batch dropped.
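
The difference between dropping and keeping the remainder can be sketched with plain index arithmetic. `iter_minibatches` is a hypothetical helper, not neon code, and a backend with fixed-shape buffers would still need to pad or otherwise handle the short final batch:

```python
def iter_minibatches(num_records, batch_size, keep_remainder=True):
    """Yield (start, end) index pairs covering the records batch by batch."""
    num_full = num_records // batch_size
    for i in range(num_full):
        yield i * batch_size, (i + 1) * batch_size
    remainder = num_records - num_full * batch_size
    if keep_remainder and remainder:
        # Final, shorter batch holding the leftover records.
        yield num_full * batch_size, num_records


print(list(iter_minibatches(10, 4, keep_remainder=False)))
# [(0, 4), (4, 8)]
print(list(iter_minibatches(10, 4)))
# [(0, 4), (4, 8), (8, 10)]
```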

Error: neon --gpu nervanagpu examples/convnet/i1k-alexnet-fp32.yaml

ubgpu@ubgpu:~/github/neon/neon$ neon --gpu nervanagpu examples/convnet/i1k-alexnet-fp32.yaml
WARNING:neon.util.persist:deserializing object from: examples/convnet/i1k-alexnet-fp32.yaml
WARNING:neon.datasets.imageset:Imageset initialized with dtype <type 'numpy.float32'>
2015-05-15 22:00:54,319 WARNING:neon - setting log level to: 20
2015-05-15 22:00:54,447 INFO:gpu - Initialized NervanaGPU with stochastic_round=None
2015-05-15 22:00:54,447 INFO:gpu - Seeding random number generator with: None
2015-05-15 22:00:54,448 INFO:init - NervanaGPU backend, RNG seed: None, numerr: None
2015-05-15 22:00:54,449 INFO:mlp - Layers:
ImageDataLayer d0: 3 x (224 x 224) nodes
ConvLayer conv1: 3 x (224 x 224) inputs, 64 x (55 x 55) nodes, RectLin act_fn
PoolingLayer pool1: 64 x (55 x 55) inputs, 64 x (27 x 27) nodes, Linear act_fn
ConvLayer conv2: 64 x (27 x 27) inputs, 192 x (27 x 27) nodes, RectLin act_fn
PoolingLayer pool2: 192 x (27 x 27) inputs, 192 x (13 x 13) nodes, Linear act_fn
ConvLayer conv3: 192 x (13 x 13) inputs, 384 x (13 x 13) nodes, RectLin act_fn
ConvLayer conv4: 384 x (13 x 13) inputs, 256 x (13 x 13) nodes, RectLin act_fn
ConvLayer conv5: 256 x (13 x 13) inputs, 256 x (13 x 13) nodes, RectLin act_fn
PoolingLayer pool3: 256 x (13 x 13) inputs, 256 x (6 x 6) nodes, Linear act_fn
FCLayer fc4096a: 9216 inputs, 4096 nodes, RectLin act_fn
DropOutLayer dropout1: 4096 inputs, 4096 nodes, Linear act_fn
FCLayer fc4096b: 4096 inputs, 4096 nodes, RectLin act_fn
DropOutLayer dropout2: 4096 inputs, 4096 nodes, Linear act_fn
FCLayer fc1000: 4096 inputs, 1000 nodes, Softmax act_fn
CostLayer cost: 1000 nodes, CrossEntropy cost_fn

2015-05-15 22:00:54,449 INFO:batch_norm - BatchNormalization set to train mode
2015-05-15 22:00:54,450 INFO:val_init - Generating AutoUniformValGen values of shape (363, 64)
2015-05-15 22:00:54,452 INFO:batch_norm - BatchNormalization set to train mode
2015-05-15 22:00:54,453 INFO:val_init - Generating AutoUniformValGen values of shape (1600, 192)
2015-05-15 22:00:54,458 INFO:batch_norm - BatchNormalization set to train mode
2015-05-15 22:00:54,459 INFO:val_init - Generating AutoUniformValGen values of shape (1728, 384)
2015-05-15 22:00:54,469 INFO:batch_norm - BatchNormalization set to train mode
2015-05-15 22:00:54,470 INFO:val_init - Generating AutoUniformValGen values of shape (3456, 256)
2015-05-15 22:00:54,483 INFO:batch_norm - BatchNormalization set to train mode
2015-05-15 22:00:54,484 INFO:val_init - Generating AutoUniformValGen values of shape (2304, 256)
2015-05-15 22:00:54,492 INFO:batch_norm - BatchNormalization set to train mode
2015-05-15 22:00:54,493 INFO:val_init - Generating AutoUniformValGen values of shape (4096, 9216)
2015-05-15 22:00:54,964 INFO:batch_norm - BatchNormalization set to train mode
2015-05-15 22:00:54,965 INFO:val_init - Generating AutoUniformValGen values of shape (4096, 4096)
2015-05-15 22:00:55,175 INFO:val_init - Generating AutoUniformValGen values of shape (1000, 4096)
2015-05-15 22:00:55,229 WARNING:imageset - Batch dir cache not found in /home/ubgpu/data/I1K/imageset_batches/dataset_cache.pkl:
Press Y to create, otherwise exit: Y
/usr/local/lib/python2.7/dist-packages/neon/util/batch_writer.py:137: RuntimeWarning: divide by zero encountered in log10
self.val_start = 10 ** int(np.log10(self.ntrain * 10))
Traceback (most recent call last):
File "/usr/local/bin/neon", line 199, in <module>
experiment, result, status = main()
File "/usr/local/bin/neon", line 168, in main
result = experiment.run()
File "/usr/local/lib/python2.7/dist-packages/neon/experiments/fit_predict_err.py", line 97, in run
super(FitPredictErrorExperiment, self).run()
File "/usr/local/lib/python2.7/dist-packages/neon/experiments/fit.py", line 70, in run
self.dataset.load()
File "/usr/local/lib/python2.7/dist-packages/neon/datasets/imageset.py", line 176, in load
self.bw.run()
File "/usr/local/lib/python2.7/dist-packages/neon/util/batch_writer.py", line 215, in run
self.write_csv_files()
File "/usr/local/lib/python2.7/dist-packages/neon/util/batch_writer.py", line 137, in write_csv_files
self.val_start = 10 ** int(np.log10(self.ntrain * 10))
OverflowError: cannot convert float infinity to integer
ubgpu@ubgpu:~/github/neon/neon$
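
The divide-by-zero warning followed by the OverflowError is consistent with `ntrain` being 0 when line 137 runs, since `np.log10(0)` is `-inf` and `int(-inf)` cannot be converted. That reading is inferred from the log, not confirmed. A sketch that reproduces the expression from the traceback with an early, clearer failure; the function name and the error message are hypothetical:

```python
import numpy as np


def val_start(ntrain):
    """Evaluate the expression from the traceback, failing early on empty input.

    Hypothetical wrapper for illustration; not the actual batch_writer code.
    """
    if ntrain <= 0:
        raise ValueError(
            "no training images found (ntrain=%d); check the image directory"
            % ntrain)
    return 10 ** int(np.log10(ntrain * 10))


print(val_start(1200))
```

If `ntrain` really is 0, the likely place to look is the image directory configured in the YAML file, since the batch writer apparently found no training images there.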

Is there any default implementation of shuffling?

I realize that shuffling can be an essential part of training a neural network. However, some people indicate that "If you are using a not-recurrent NN like a traditional MLP you don't NEED to shuffle dataset especially if you're using a batch learning algorithm".

Is there a shuffling step before each epoch in neon? If not, how can I implement it?

http://stats.stackexchange.com/questions/40638/predicting-time-series-with-nns-should-the-data-set-be-shuffled

http://stats.stackexchange.com/questions/90874/how-can-stochastic-gradient-descent-avoid-the-problem-of-a-local-minimum
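
Independent of whether neon does this internally, a per-epoch shuffle can be sketched with NumPy by applying one permutation to both inputs and targets. This assumes records lie along the first axis; `shuffled_epochs` is a hypothetical helper, not a neon function:

```python
import numpy as np


def shuffled_epochs(inputs, targets, num_epochs, seed=0):
    """Yield (inputs, targets) reordered by a fresh permutation each epoch."""
    rng = np.random.RandomState(seed)
    for _ in range(num_epochs):
        # The same index order is applied to both arrays so pairs stay aligned.
        order = rng.permutation(inputs.shape[0])
        yield inputs[order], targets[order]


inputs = np.arange(10).reshape(5, 2)
targets = inputs[:, 0].copy()
for epoch, (x, y) in enumerate(shuffled_epochs(inputs, targets, num_epochs=2)):
    print(epoch, y)
```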

Add support for networks with Convolutional and Recurrent layers (e.g. captioning models)

MLP.predict_fullset doesn't work for saved models

I trained an MLP on CIFAR-10 and deserialized it in a later script. (That step works fine; I have the correct weights and everything I need.) When I call mlp.predict_fullset, cudanet raises the error:

Traceback (most recent call last):
File "11_merge_predictions.py", line 202, in <module>
mlp.predict_fullset(data, 'test')
File "/home/ubuntu/neon/neon/models/mlp.py", line 245, in predict_fullset
reference[:, start:end] = batch_refs
File "/home/ubuntu/neon/neon/backends/cc2.py", line 265, in __setitem__
self._tensor.set_col_slice(start, stop, value)
File "/home/ubuntu/cuda-convnet2/cudanet/cudanet.py", line 767, in set_col_slice
raise generate_exception(err_code)
cudanet.cudanet.CUDANetException: Incompatible matrix dimensions.

I am again predicting on the CIFAR-10 test set, and the GPU I use is the one on AWS (~4 GB). It is the same one I used for training. Furthermore, mlp.predict_generator works fine.

Simple Example mnist-small doesn't work

I installed neon on my MacBook (no GPU), tried the simple mnist-small.yaml example, and got the following error trace. This is my first time using neon!

WARNING:neon.util.persist:deserializing object from: examples/mlp/mnist-small.yaml
2015-05-26 21:04:47,289 WARNING:neon - setting log level to: 20
2015-05-26 21:04:47,290 INFO:cpu - Seeding random number generator with: None
2015-05-26 21:04:47,298 INFO:init - CPU backend, RNG seed: None, numerr: None
2015-05-26 21:04:47,298 INFO:mlp - Layers:
DataLayer d0: 784 nodes
FCLayer h0: 784 inputs, 100 nodes, RectLin act_fn
FCLayer output: 100 inputs, 10 nodes, Logistic act_fn
CostLayer cost: 10 nodes, CrossEntropy cost_fn

2015-05-26 21:04:47,299 INFO:val_init - Generating GaussianValGen values of shape (100, 784)
2015-05-26 21:04:47,305 INFO:val_init - Generating GaussianValGen values of shape (10, 100)
2015-05-26 21:04:47,307 INFO:mnist - loading: train-images-idx3-ubyte
2015-05-26 21:04:47,667 INFO:mnist - loading: train-labels-idx1-ubyte
2015-05-26 21:04:47,675 INFO:mnist - loading: t10k-images-idx3-ubyte
2015-05-26 21:04:47,710 INFO:mnist - loading: t10k-labels-idx1-ubyte
2015-05-26 21:04:47,711 WARNING:dataset - Incompatible batch size. Discarding 16 samples...
2015-05-26 21:04:47,764 WARNING:dataset - Incompatible batch size. Discarding 96 samples...
2015-05-26 21:04:48,146 WARNING:dataset - Incompatible batch size. Discarding 16 samples...
2015-05-26 21:04:48,148 WARNING:dataset - Incompatible batch size. Discarding 96 samples...
2015-05-26 21:04:48,155 INFO:mlp - commencing model fitting
Traceback (most recent call last):
File "/usr/local/bin/neon", line 240, in <module>
experiment, result, status = main()
File "/usr/local/bin/neon", line 208, in main
result = experiment.run()
File "/Library/Python/2.7/site-packages/neon/experiments/fit_predict_err.py", line 99, in run
super(FitPredictErrorExperiment, self).run()
File "/Library/Python/2.7/site-packages/neon/experiments/fit.py", line 102, in run
self.model.fit(self.dataset)
File "/Library/Python/2.7/site-packages/neon/models/mlp.py", line 156, in fit
self.backend.add(error, self.cost_layer.get_cost(), error)
File "/Library/Python/2.7/site-packages/neon/layers/layer.py", line 289, in get_cost
scale_by_batchsize=scale_cost)
File "/Library/Python/2.7/site-packages/neon/transforms/cross_entropy.py", line 237, in apply_function
scale_by_batchsize=scale_by_batchsize)
File "/Library/Python/2.7/site-packages/neon/transforms/cross_entropy.py", line 62, in cross_entropy
return backend.sum(temp[0], axes=None, out=result)
File "/Library/Python/2.7/site-packages/neon/backends/cpu.py", line 764, in sum
np.sum(tsr._tensor, axis=axes, out=out._tensor, keepdims=True)
TypeError: sum() got an unexpected keyword argument 'keepdims'
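
The `keepdims` argument to `np.sum` was added in NumPy 1.7, so this TypeError is what an older NumPy would raise. Whether that is the cause here is an assumption; a quick check against the installed version (the helper name is hypothetical):

```python
import numpy as np


def numpy_supports_keepdims():
    """Return True if np.sum accepts the keepdims argument."""
    try:
        np.sum(np.zeros((2, 2)), axis=None, keepdims=True)
    except TypeError:
        return False
    return True


print(np.__version__, numpy_supports_keepdims())
```

If this prints `False`, upgrading NumPy for the interpreter under `/Library/Python/2.7` would be the first thing to try.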

Running mnist-small.yaml example after setup - getting error

Hi,
I just installed neon on Ubuntu 14 with Python 3.4 and ran the following command:
nir@nir-Satellite-Pro-A50-A:~/neon$ neon examples/mlp/mnist-small.yaml
and got this error message:
Traceback (most recent call last):
File "/home/nir/anaconda3/bin/neon", line 240, in <module>
experiment, result, status = main()
File "/home/nir/anaconda3/bin/neon", line 126, in main
experiment = deserialize(args.yaml_file)
File "/home/nir/anaconda3/lib/python3.4/site-packages/neon/util/persist.py", line 183, in deserialize
if not isinstance(load_path, file):
NameError: name 'file' is not defined
I checked the directory, and this file exists.
I would appreciate your assistance.
Thanks, N
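
The NameError arises because the `file` builtin exists only in Python 2; it was removed in Python 3, so the check at persist.py line 183 fails regardless of whether the YAML file is present. A sketch of a check that works on both versions; `is_file_like` is a hypothetical helper, not the fix adopted in neon:

```python
import io


def is_file_like(obj):
    """Return True for open file objects on both Python 2 and Python 3."""
    try:
        file_types = (file, io.IOBase)  # Python 2: builtin file plus io classes
    except NameError:
        file_types = (io.IOBase,)       # Python 3: the file builtin is gone
    return isinstance(obj, file_types)


print(is_file_like(io.StringIO(u"model: mlp")))
print(is_file_like("examples/mlp/mnist-small.yaml"))
```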

let NEON link to already-built cudanet

Hi,

I am trying to get NEON to use cudanet as its backend. I have already built cudanet from source and installed the libraries into /usr/local/lib (which is in LD_LIBRARY_PATH). But when I try to build neon with gpu=cudanet, it still tries to download cudanet and build it again. That build fails because it cannot find helper_cuda.h, whose location I specified manually when building cudanet from source myself. How can I get NEON to recognize an already-built cudanet?

Also, in NEON's Makefile, I found this:

ifeq ($(GPU), cudanet)
    INSTALL_REQUIRES := $(INSTALL_REQUIRES) \
      'git+https://github.com/NervanaSystems/cuda-convnet2.git\#egg=cudanet>=0.2.7' \
      'pycuda>=2014.1'
endif

So it seems NEON checks whether cudanet is installed as a Python module. I checked with pip freeze | grep cudanet and nothing shows up, so I need to install the cudanet module into Python. Any thoughts on how to achieve that?

Many thanks in advance.

Loading Data

Is there a tutorial on how to use your own data? I have some data and I want to feed it to an MLP, for example. I would appreciate any help here.