
nnforge's Issues

I encountered a problem when running gtsrb. Can you help me? Thank you!

I ran gtsrb and encountered a segmentation fault.
0x000000000044adb0 in std::vector<float, std::allocator<float> >::size (this=0x0)
    at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_vector.h:533
533	{ return size_type(this->_M_impl._M_finish - this->_M_impl._M_start); }
#0 0x000000000044adb0 in std::vector<float, std::allocator<float> >::size (this=0x0)
    at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_vector.h:533
#1 0x00000000009c1ca3 in nnforge::testing_complete_result_set::testing_complete_result_set (this=0x7fffffffd4f0, ef=..., actual_output_neuron_value_set=...) at testing_complete_result_set.cpp:33
#2 0x00000000009c54af in nnforge::validate_progress_network_data_pusher::push (this=0x10c142a0, task_state=...) at validate_progress_network_data_pusher.cpp:46
#3 0x00000000009c615e in nnforge::complex_network_data_pusher::push (this=0x7fffffffd8d0, task_state=...) at complex_network_data_pusher.cpp:33
#4 0x00000000009cde1b in nnforge::network_trainer::train (this=0x10c0b490, reader=..., peeker=..., progress_pusher=..., pusher=...) at network_trainer.cpp:82
#5 0x000000000097b58a in nnforge::neural_network_toolset::train (this=0x7fffffffdbb0) at neural_network_toolset.cpp:1418
#6 0x000000000096b353 in nnforge::neural_network_toolset::do_action (this=0x7fffffffdbb0) at neural_network_toolset.cpp:125
#7 0x000000000041baa9 in main (argc=4, argv=0x7fffffffddd8) at gtsrb.cpp:44

need mnist example

Hi milakov,
Could you write a sample showing how to use a convolutional neural network to recognize handwritten digits (the MNIST database)?
I don't have a GPU, so a CPU version would be great.

Thanks

code style

Hi Max,

Sorry to bother you, but what code style are you using?
I'm trying to make my changes consistent with your
style. Is it K&R (the default for C/C++ in Nsight Eclipse)?
If you're using Visual Studio, is there any way you could export
the style sheet? Thanks very much.

Juan

I encountered a problem when running gtsrb

#0, Epoch 1, Training MSE 0.799242 (49.2 GFLOPs, 175.46 seconds), Eta = 5.15e-06, Mu = 5.15e-04, LR (3.3e-03 2.8e-05, 2.5e-03 1.5e-05, 2.7e-03 1.5e-05, 1.0e-03 4.0e-06), Hessian (1.0e-03 1.9e-01, 1.9e-03 3.5e-01, 3.0e-03 3.5e-01, 1.3e-02 1.3e+00)

Program received signal SIGSEGV, Segmentation fault.
0x000000000044adb0 in std::vector<float, std::allocator<float> >::size (this=0x0)
at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_vector.h:533
533 { return size_type(this->_M_impl._M_finish - this->_M_impl._M_start); }
Missing separate debuginfos, use: debuginfo-install atk-1.28.0-2.el6.x86_64 cairo-1.8.8-3.1.el6.x86_64 expat-2.0.1-11.el6_2.x86_64 fontconfig-2.8.0-3.el6.x86_64 freetype-2.3.11-6.el6_1.7.x86_64 glib2-2.22.5-6.el6.x86_64 glibc-2.12-1.47.el6.x86_64 gstreamer-0.10.29-1.el6.x86_64 gstreamer-plugins-base-0.10.29-1.el6.x86_64 gtk2-2.18.9-6.el6.centos.x86_64 libX11-1.3-2.el6.x86_64 libXau-1.0.5-1.el6.x86_64 libXcomposite-0.4.1-2.el6.x86_64 libXcursor-1.1.10-2.el6.x86_64 libXdamage-1.1.2-1.el6.x86_64 libXext-1.1-3.el6.x86_64 libXfixes-4.0.4-1.el6.x86_64 libXi-1.3-3.el6.x86_64 libXinerama-1.1-1.el6.x86_64 libXrandr-1.3.0-4.el6.x86_64 libXrender-0.9.5-1.el6.x86_64 libgcc-4.4.6-3.el6.x86_64 libgomp-4.4.6-3.el6.x86_64 libjpeg-6b-46.el6.x86_64 libpng-1.2.46-1.el6_1.x86_64 libselinux-2.0.94-5.2.el6.x86_64 libstdc++-4.4.6-3.el6.x86_64 libtiff-3.9.4-1.el6_0.3.x86_64 libxcb-1.5-1.el6.x86_64 libxml2-2.7.6-4.el6.x86_64 pango-1.28.1-3.el6_0.centos.5.x86_64 pixman-0.18.4-1.el6_0.1.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0 0x000000000044adb0 in std::vector<float, std::allocator<float> >::size (this=0x0)
    at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_vector.h:533
#1 0x00000000009c1ca3 in nnforge::testing_complete_result_set::testing_complete_result_set (this=0x7fffffffd4f0, ef=..., actual_output_neuron_value_set=...) at testing_complete_result_set.cpp:33
#2 0x00000000009c54af in nnforge::validate_progress_network_data_pusher::push (this=0x10c142a0, task_state=...) at validate_progress_network_data_pusher.cpp:46
#3 0x00000000009c615e in nnforge::complex_network_data_pusher::push (this=0x7fffffffd8d0, task_state=...) at complex_network_data_pusher.cpp:33
#4 0x00000000009cde1b in nnforge::network_trainer::train (this=0x10c0b490, reader=..., peeker=..., progress_pusher=..., pusher=...) at network_trainer.cpp:82
#5 0x000000000097b58a in nnforge::neural_network_toolset::train (this=0x7fffffffdbb0) at neural_network_toolset.cpp:1418
#6 0x000000000096b353 in nnforge::neural_network_toolset::do_action (this=0x7fffffffdbb0) at neural_network_toolset.cpp:125
#7 0x000000000041baa9 in main (argc=4, argv=0x7fffffffddd8) at gtsrb.cpp:44

get_forward_flops batch-mode

Hey Maxim, I have two small questions.

(1) Is it possible to use the function .get_forward_flops in batch mode?
(2) To use the CUDA backend, is it enough to just call nnforge::cuda::cuda::init()?

I wrote a 30-line code snippet called benchmark (which you can place in the examples folder):
https://github.com/soumith/convnet-benchmarks/blob/master/nnforge/benchmark/benchmark.cpp

Please let me know if the code looks right with respect to CUDA being enabled, and how I could benchmark it in batch mode.

Something doesn't look right, as it prints out very poor numbers:

convolution_layer 3->96 11x11
:forward gflop/s: 0.96911
:backward gflop/s: 0.970447
:hessian  gflop/s: 0.970447

cross_entropy_layer not registered

Hi milakov,
I found that cross_entropy_layer is not registered in nnforge.cpp, so a CrossEntropy layer declared in the schema cannot be recognized. Please register it (all other layer types are registered except this one).
Thanks.

Can't compile absolute_layer_tester_cuda.cu

Maxim, I'm experiencing the following error during compilation of the package (I cloned the master version):

nvcc -c absolute_layer_tester_cuda.cu -use_fast_math -DBOOST_NOINLINE='__attribute__ ((noinline))' -O3 -Xcompiler="-I/home/vk/DataAnalysis/CUDA/boost_1_57_0/install//include/boost/tr1/tr1 -I/home/vk/DataAnalysis/CUDA/boost_1_57_0/install//include -I/nfs/opt/cuda/include -I/home/vk/DataAnalysis/CUDA/cudnn-6.5-linux-x64-v2-rc3 -I/home/vk/DataAnalysis/CUDA/usr/include -ffast-math -march=native  -mfpmath=sse -msse2  -O3" -gencode=arch=compute_30,code=sm_30 -gencode=arch=compute_35,code=\"sm_35,compute_35\" -o absolute_layer_tester_cuda.o
/home/vk/DataAnalysis/CUDA/boost_1_57_0/install//include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp(49): warning: "cc" clobber ignored

/home/vk/DataAnalysis/CUDA/boost_1_57_0/install//include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp(65): warning: "cc" clobber ignored

/home/vk/DataAnalysis/CUDA/boost_1_57_0/install//include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp(91): warning: "cc" clobber ignored

/home/vk/DataAnalysis/CUDA/boost_1_57_0/install//include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp(75): warning: variable "tmp" was set but never used

/opt/rh/devtoolset-1.1/root/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include/tmmintrin.h(192): error: argument of type "__v1di" is incompatible with parameter of type "long long"

/opt/rh/devtoolset-1.1/root/usr/lib/gcc/x86_64-redhat-linux/4.7.2/include/tmmintrin.h(193): error: argument of type "__v1di" is incompatible with parameter of type "long long"

/home/vk/DataAnalysis/CUDA/boost_1_57_0/install//include/boost/random/detail/const_mod.hpp(193): warning: controlling expression is constant
          detected during:
            instantiation of "IntType boost::random::const_mod<IntType, m>::invert_euclidian0(IntType) [with IntType=uint64_t, m=281474976710656UL]"
(101): here
            instantiation of "IntType boost::random::const_mod<IntType, m>::invert(IntType) [with IntType=uint64_t, m=281474976710656UL]"
/home/vk/DataAnalysis/CUDA/boost_1_57_0/install//include/boost/random/linear_congruential.hpp(194): here
            instantiation of "void boost::random::linear_congruential_engine<IntType, a, c, m>::discard(uintmax_t) [with IntType=uint64_t, a=25214903917UL, c=11UL, m=281474976710656UL]"
/home/vk/DataAnalysis/CUDA/boost_1_57_0/install//include/boost/random/linear_congruential.hpp(405): here

2 errors detected in the compilation of "/tmp/tmpxft_00007caa_00000000-9_absolute_layer_tester_cuda.compute_35.cpp1.ii".
make: *** [absolute_layer_tester_cuda.o] Error 2

Any ideas how to fix it?
Thanks,
Valentin.

Getting a failed assertion in stream_duplicator ctor

This is more of a mailing-list question, but I'm not sure there is a mailing list, so I'm posting here...

I have been using nnForge (with gtsrb) and had to modify its build environment slightly for my work. I started getting this error:
Forking output log to /home//nnforge-/working_data/gtsrb/log.txt...

gtsrb: /usr/include/boost/iostreams/tee.hpp:176: std::streamsize boost::iostreams::tee_device<Device, Sink>::write(const char_type*, std::streamsize) [with Device = std::basic_ostream<char>; Sink = boost::filesystem::basic_ofstream<char>; std::streamsize = long int; boost::iostreams::tee_device<Device, Sink>::char_type = char]: Assertion `result1 == n && result2 == n' failed.
Aborted (core dumped)
I was using v1.1.10.

I downloaded the latest release and still get the same error.
I am still pointing to the input_data and working_data directories under my v1.1.10 directory, but I haven't modified anything in these directories.

The backtrace from gdb is below -
Program received signal SIGABRT, Aborted.
0x00007fffeb058cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007fffeb058cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007fffeb05c0d8 in __GI_abort () at abort.c:89
#2 0x00007fffeb051b86 in __assert_fail_base (fmt=0x7fffeb1a33d0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x779b16 "result1 == n && result2 == n", file=file@entry=0x779ab8 "/usr/include/boost/iostreams/tee.hpp", line=line@entry=176, function=function@entry=0x779b80 <boost::iostreams::tee_device<std::ostream, boost::filesystem::basic_ofstream<char, std::char_traits<char> > >::write(char const*, long)::__PRETTY_FUNCTION__> "std::streamsize boost::iostreams::tee_device<Device, Sink>::write(const char_type*, std::streamsize) [with Device = std::basic_ostream<char>; Sink = boost::filesystem::basic_ofstream<char>; std::strea"...) at assert.c:92
#3 0x00007fffeb051c32 in __GI___assert_fail (assertion=0x779b16 "result1 == n && result2 == n", file=0x779ab8 "/usr/include/boost/iostreams/tee.hpp", line=176, function=0x779b80 <boost::iostreams::tee_device<std::ostream, boost::filesystem::basic_ofstream<char, std::char_traits<char> > >::write(char const*, long)::__PRETTY_FUNCTION__> "std::streamsize boost::iostreams::tee_device<Device, Sink>::write(const char_type*, std::streamsize) [with Device = std::basic_ostream<char>; Sink = boost::filesystem::basic_ofstream<char>; std::strea"...) at assert.c:101
#4 0x00000000006d79ab in boost::iostreams::detail::indirect_streambuf<boost::iostreams::tee_device<std::ostream, boost::filesystem::basic_ofstream<char, std::char_traits<char> > >, std::char_traits<char>, std::allocator<char>, boost::iostreams::output>::sync() ()
#5 0x00007fffebbb619e in std::ostream::flush() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00000000006d72e4 in nnforge::stream_duplicator::stream_duplicator(boost::filesystem::path const&) ()
#7 0x00000000006ae7e8 in nnforge::neural_network_toolset::parse(int, char**) ()
#8 0x000000000041aef0 in main ()

This is my gtsrb.cfg file -
input_data_folder=/home//nnforge-/input_data/gtsrb
working_data_folder=/home//nnforge-/working_data/gtsrb
training_epoch_count=40
learning_rate=0.03
training_algo=sgd
momentum=0.6
batch_size=16
learning_rate_decay_tail=30
learning_rate_decay_rate=0.93

Any help is really appreciated. Thanks!

Undefined references

When compiling, these errors occur:

+ cd ../..
+ cd examples
+ for i in './*'
+ '[' -d ./Example.mk ']'
+ for i in './*'
+ '[' -d ./gtsrb ']'
+ cd ./gtsrb
+ make
g++ -DNNFORGE_CUDA_BACKEND_ENABLED -I../.. -I/usr/include -I/usr/local/cuda/include -I/usr/local/cudnn -I/usr/include -fopenmp -ffast-math -march=native  -mfpmath=sse -msse2  -O3 -std=c++11 -DNNFORGE_CPP11COMPILER   -c -o gtsrb.o gtsrb.cpp
g++ -DNNFORGE_CUDA_BACKEND_ENABLED -I../.. -I/usr/include -I/usr/local/cuda/include -I/usr/local/cudnn -I/usr/include -fopenmp -ffast-math -march=native  -mfpmath=sse -msse2  -O3 -std=c++11 -DNNFORGE_CPP11COMPILER   -c -o gtsrb_toolset.o gtsrb_toolset.cpp
g++ -o ../../bin/gtsrb gtsrb.o gtsrb_toolset.o -lnnforge_cuda -lnnforge_plain -lnnforge -L../../lib -L/usr/lib -lboost_thread -lboost_regex -lboost_chrono -lboost_filesystem -lboost_program_options -lboost_random -lboost_system -lboost_date_time -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -L/usr/local/cudnn -lcudnn -lcurand -lcusparse -lcublas -lcudart -L/usr/lib -lopencv_highgui -lopencv_imgproc -lopencv_core -fopenmp
gtsrb_toolset.o: In function `gtsrb_toolset::gtsrb_toolset(std::shared_ptr<nnforge::factory_generator>)':
gtsrb_toolset.cpp:(.text+0x13db): undefined reference to `nnforge::neural_network_toolset::neural_network_toolset(std::shared_ptr<nnforge::factory_generator>)'
gtsrb_toolset.o: In function `gtsrb_toolset::get_schema() const':
gtsrb_toolset.cpp:(.text+0x159c): undefined reference to `nnforge::network_schema::add_layer(std::shared_ptr<nnforge::layer const>)'
gtsrb_toolset.cpp:(.text+0x16d1): undefined reference to `nnforge::network_schema::add_layer(std::shared_ptr<nnforge::layer const>)'
gtsrb_toolset.cpp:(.text+0x1738): undefined reference to `nnforge::network_schema::add_layer(std::shared_ptr<nnforge::layer const>)'
gtsrb_toolset.cpp:(.text+0x177b): undefined reference to `nnforge::network_schema::add_layer(std::shared_ptr<nnforge::layer const>)'
gtsrb_toolset.cpp:(.text+0x1845): undefined reference to `nnforge::network_schema::add_layer(std::shared_ptr<nnforge::layer const>)'
gtsrb_toolset.o:gtsrb_toolset.cpp:(.text+0x1974): more undefined references to `nnforge::network_schema::add_layer(std::shared_ptr<nnforge::layer const>)' follow
gtsrb_toolset.o: In function `gtsrb_toolset::prepare_training_data()':
gtsrb_toolset.cpp:(.text+0x2f7f): undefined reference to `nnforge::supervised_data_stream_writer::supervised_data_stream_writer(std::shared_ptr<std::ostream>, nnforge::layer_configuration_specific const&, nnforge::layer_configuration_specific const&, nnforge::neuron_data_type::input_type)'
gtsrb_toolset.cpp:(.text+0x3bb1): undefined reference to `nnforge::supervised_data_stream_writer::supervised_data_stream_writer(std::shared_ptr<std::ostream>, nnforge::layer_configuration_specific const&, nnforge::layer_configuration_specific const&, nnforge::neuron_data_type::input_type)'
gtsrb_toolset.o:(.rodata._ZTV13gtsrb_toolset[_ZTV13gtsrb_toolset]+0x90): undefined reference to `nnforge::neural_network_toolset::get_validators_for_training(std::shared_ptr<nnforge::network_schema>)'
gtsrb_toolset.o:(.rodata._ZTV13gtsrb_toolset[_ZTV13gtsrb_toolset]+0xb0): undefined reference to `nnforge::neural_network_toolset::run_test_with_unsupervised_data(std::vector<std::shared_ptr<nnforge::output_neuron_value_set>, std::allocator<std::shared_ptr<nnforge::output_neuron_value_set> > >&)'
gtsrb_toolset.o:(.rodata._ZTV13gtsrb_toolset[_ZTV13gtsrb_toolset]+0x110): undefined reference to `nnforge::neural_network_toolset::get_samples_for_snapshot(std::shared_ptr<nnforge::network_data>, std::shared_ptr<nnforge::unsupervised_data_reader>, unsigned int)'
collect2: error: ld returned 1 exit status
make: *** [../../bin/gtsrb] Error 1

Also,

+ cd ..
+ for i in './*'
+ '[' -d ./image_classifier_demo ']'
+ cd ./image_classifier_demo
+ make
g++ -o ../../bin/image_classifier_demo image_classifier_demo.o image_classifier_demo_toolset.o -lnnforge_cuda -lnnforge_plain -lnnforge -L../../lib -L/usr/lib -lboost_thread -lboost_regex -lboost_chrono -lboost_filesystem -lboost_program_options -lboost_random -lboost_system -lboost_date_time -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -L/usr/local/cudnn -lcudnn -lcurand -lcusparse -lcublas -lcudart -L/usr/lib -lopencv_highgui -lopencv_imgproc -lopencv_core -fopenmp
image_classifier_demo_toolset.o:(.rodata._ZTV29image_classifier_demo_toolset[_ZTV29image_classifier_demo_toolset]+0x68): undefined reference to `nnforge::neural_network_toolset::get_dropout_rate_map() const'
image_classifier_demo_toolset.o:(.rodata._ZTV29image_classifier_demo_toolset[_ZTV29image_classifier_demo_toolset]+0x140): undefined reference to `nnforge::neural_network_toolset::get_initial_data_reader_for_training() const'
collect2: error: ld returned 1 exit status
make: *** [../../bin/image_classifier_demo] Error 1

error: ‘layer_smart_ptr’ does not name a type

I'm trying to compile nnforge on Linux Mint (itself an Ubuntu derivative) and I'm running into some errors. (Sorry if I'm missing something obvious; my C++ is a bit rusty. Any help is kindly appreciated.)

I ran make in the nnforge sub-directory, but the same error also occurs when run from make_all.sh.

twiddles ~/milakov-nnForge-f6bbac2/nnforge $ make
g++ -I/usr/lib/include/boost/tr1/tr1 -I/usr/lib/include -I/usr/lib/include -std=c++11 -ffast-math -march=native -mfpmath=sse -msse2  -O3   -c -o absolute_layer.o absolute_layer.cpp
In file included from layer.h:21:0,
                 from absolute_layer.h:19,
                 from absolute_layer.cpp:17:
layer_data.h:60:10: error: ‘tr1’ in namespace ‘std’ does not name a type
layer_data.h:61:10: error: ‘tr1’ in namespace ‘std’ does not name a type
layer_data.h:62:22: error: ‘layer_data_smart_ptr’ was not declared in this scope
layer_data.h:62:42: error: template argument 1 is invalid
layer_data.h:62:42: error: template argument 2 is invalid
layer_data.h:62:59: error: invalid type in declaration before ‘;’ token
In file included from layer.h:22:0,
                 from absolute_layer.h:19,
                 from absolute_layer.cpp:17:
rnd.h:23:10: error: ‘tr1’ in namespace ‘std’ does not name a type
rnd.h:28:10: error: ‘random_generator’ does not name a type
rnd.h:30:10: error: ‘random_generator’ does not name a type
In file included from absolute_layer.h:19:0,
                 from absolute_layer.cpp:17:
layer.h:40:11: error: ‘tr1’ in namespace ‘std’ does not name a type
In file included from absolute_layer.h:19:0,
                 from absolute_layer.cpp:17:
layer.h:69:3: error: ‘layer_data_smart_ptr’ does not name a type
layer.h:77:4: error: ‘random_generator’ has not been declared
layer.h:89:10: error: ‘tr1’ in namespace ‘std’ does not name a type
layer.h:90:10: error: ‘tr1’ in namespace ‘std’ does not name a type
layer.h:91:22: error: ‘const_layer_smart_ptr’ was not declared in this scope
layer.h:91:43: error: template argument 1 is invalid
layer.h:91:43: error: template argument 2 is invalid
layer.h:91:61: error: invalid type in declaration before ‘;’ token
In file included from absolute_layer.cpp:17:0:
absolute_layer.h:29:11: error: ‘layer_smart_ptr’ does not name a type
absolute_layer.cpp:38:2: error: ‘layer_smart_ptr’ does not name a type
make: *** [absolute_layer.o] Error 1

My Settings.mk file (I added -std=c++11 because of another error, and disabled CUDA and NetCDF):

BUILD_MODE=release
ENABLE_CUDA_BACKEND=no
ENABLE_CUDA_PROFILING=no
BOOST_PATH=/usr/lib
OPENCV_PATH=/usr/lib
NETCDF_INSTALLED=no
NETCDF_PATH=
CUDA_PATH=/usr/local/cuda
NVCC=nvcc
NNFORGE_PATH=../..
NNFORGE_INPUT_DATA_PATH=/home/max/nnforge/input_data
NNFORGE_WORKING_DATA_PATH=/home/max/nnforge/working_data

BOOST_LIBS=-lboost_thread-mt -lboost_regex-mt -lboost_chrono-mt -lboost_filesystem-mt -lboost_program_options-mt -lboos$
OPENCV_LIBS=-lopencv_highgui -lopencv_imgproc -lopencv_core
NETCDF_LIBS=-lnetcdf

CPP_FLAGS_COMMON=-std=c++11 -ffast-math -march=native -mfpmath=sse -msse2 # -mavx
CPP_FLAGS_DEBUG_MODE=-g
CPP_FLAGS_RELEASE_MODE=-O3

CPP_FLAGS_OPENMP=-fopenmp
LD_FLAGS_OPENMP=-fopenmp

CUDA_FLAGS_COMMON=-use_fast_math
CUDA_FLAGS_ARCH_FERMI=-gencode=arch=compute_20,code=sm_20
CUDA_FLAGS_ARCH_KEPLER=-gencode=arch=compute_30,code=sm_30 -gencode=arch=compute_35,code=\"sm_35,compute_35\"
CUDA_FLAGS_DEBUG_MODE=-g -lineinfo
CUDA_FLAGS_RELEASE_MODE=-O3

Access violation in cublasSgemm when running the GTSRB example

I'm getting a CUDA access violation (wrong access to global GPU memory) when running the GTSRB example. I traced it down to a cublasSgemm call at fully_connected_layer_hessian_cuda.cu line 130. The AV happens the first time the call is made:

data[0]->get_size() = 1280000
input_neurons_buffer->get_size() = 9676800
output_neurons_buffer->get_size() = 1209600
cublasSgemm(0x06ED4250, 1, 0, 200, 1512, 1600, 1, 0x41080000, 1600, 0x7CAC0000, 1600, 1, 80240000, 200)

Memory Checker detected 1024 access violations.
error = access violation on load (global memory)
gridid = 4592
blockIdx = {0,0,0}
threadIdx = {0,0,0}
address = 0xffffffff80240000
accessSize = 4

GPU Stack trace
>   CUmodule 16426e98 - [0] sgemm_sm30_ldg_tex_tn_64x16x128x16x32   CUDA

The numbers passed to cublasSgemm seem quite OK to me but I'm no expert in BLAS/CUBLAS and the parameters are somewhat confusing to me.

The CPU version of nnforge seems to be working properly.

My system: Windows 8.1, Cuda 5.5, driver 332.21, GPU nVidia GTX 660M (kepler), 2GB GPU memory, compiling under VS2012 (all 32bit/64bit Release/Debug combinations crash), OpenMP enabled.

I use the following command line to run the example: gtsrb --config gtsrb.cfg train -N 10 -G 0.5

initializing with random weights

Hi Max,

I want to initialize my neural networks with random weights in a range defined by the user.
How are they initialized currently? I think I will need to make some modifications. Could you give me a hint on where to start? Thank you.

How to run image_classifier_demo?

I have downloaded the ImageNet dataset from the Google Drive location mentioned in the readme.
The default action, 'demo', doesn't work because I am running on a server.
I tried 'train', but that gives the following error:
Exception caught: basic_ios::clear

Full log -
Forking output log to /opt/share/users/anshuman/nnForge/working_data/image_classifier_demo/log.txt...

2015-02-11 12:06:39
action=train
input_data_folder="/opt/share/users/anshuman/nnForge/input_data/image_classifier_demo"
working_data_folder="/opt/share/users/anshuman/nnForge/working_data/image_classifier_demo"
ann_count=1
training_epoch_count=50
snapshot_count=100
snapshot_extension=jpg
snapshot_extension_video=avi
snapshot_scale=1
snapshot_video_fps=5
snapshot_ann_index=0
snapshot_ann_type=image
learning_rate=0.02
learning_rate_decay_tail=0
learning_rate_decay_rate=0.5
learning_rate_rise_head=0
learning_rate_rise_rate=0.1
batch_offset=0
test_validate_ann_index=-1
snapshot_data_set=training
profile_updater_entry_count=1
check_gradient_weights=::
check_gradient_threshold=1.05
check_gradient_base_step=0.001
training_algo=sgd
dump_resume=1
load_resume=0
epoch_count_in_training_set=1
weight_decay=0
batch_size=1
momentum=0
shuffle_block_size=-1
report_stats=0
cuda_max_global_memory_usage_ratio,G=0.9

cuda_device_id,D=0

Exception caught: basic_ios::clear

validate action loads all validation data at once ?

Hi Max,

From the memory usage, it seems that for the "validate" action, validating.sdt is loaded into memory all at once, which doesn't seem to be the case for training. Would you be able to provide a hint on how to make validate stream through validating.sdt as opposed to loading it all at once?
Thanks very much.

Best,

Juan

stream_duplicator causes assertion failure on Windows

In the stream_duplicator constructor, the first cout output statement causes an assertion failure in boost/iostreams/tee.hpp, line 176 (I'm using Boost 1.57, Visual C++ 2013).

    std::streamsize write(const char_type* s, std::streamsize n)
    {
        BOOST_STATIC_ASSERT((
            is_convertible<
                BOOST_DEDUCED_TYPENAME iostreams::category_of<Device>::type, output
            >::value
        ));
        std::streamsize result1 = iostreams::write(dev_, s, n);
        std::streamsize result2 = iostreams::write(sink_, s, n);
        (void) result1; // Suppress 'unused variable' warning.
        (void) result2;
        BOOST_ASSERT(result1 == n && result2 == n);
        return n;
    }

I placed a breakpoint on the BOOST_ASSERT line and got result1: 0, result2: 41, n: 41 in the debugger window. This is probably because output to the log file is buffered and not flushed before iostreams::write returns. In Release mode, cout is correctly redirected to the log file.

Please consider using the freopen function to redirect standard output to a file.

FeedForward example

Hello,

I built the lib and went through the examples, and I was wondering if it would be too much to ask for a simple Feedforward example.

Many thanks,
A.

Minor build issues with v2.0.0

I am building on an Ubuntu 14.04 box.
gcc 4.8.4
boost 1.54.0
opencv 2.4.8
protobuf 8.0.0

MATIO and NETCDF disabled
CUDA enabled

Had to set CPP11COMPILER to yes from the default of no. The first C++11 failure starts in nnforge/data_visualizer.cpp.

Build of nnforge/plain fails for not finding memcpy, starting with nnforge/plain/dropout_layer_tester_plain.cpp.
Fixed by including <cstring> in nnforge/layer.h.

cuDNN distributions now have inc/ and lib64/ so includes like <cudnn.h> fail with the current Settings.mk. Copying /usr/local/cuDNN/inc/cudnn.h and /usr/local/cuDNN/lib64/libcudnn.so to directly under /usr/local/cuDNN is one possible fix.

Otherwise builds fine. 👍 for an awesome new release !

nvcc compile error ‘noinline’ was not declared

Env: CUDA 5.5, gcc 4.8.2, boost 1.55

Info:
nvcc -c absolute_layer_hessian_cuda.cu -use_fast_math -O3 -Xcompiler="-I/usr/local/include/boost/tr1/tr1 -I/usr/local/include -I/usr/local/cuda/include -I/usr/local/include -ffast-math -mfpmath=sse -msse3 -O3" -gencode=arch=compute_20,code=sm_20 -gencode=arch=compute_30,code=sm_30 -gencode=arch=compute_35,code="sm_35,compute_35" -o absolute_layer_hessian_cuda.o
/usr/local/include/boost/assert.hpp:102:47: error: ‘noinline’ was not declared in this scope
BOOST_NOINLINE void assertion_failed_msg(CharT const * expr, char const * msg, char const * function,

Solution:
Change Settings.mk
CUDA_FLAGS_COMMON=-use_fast_math -DBOOST_NOINLINE='__attribute__ ((noinline))'

Issue with -std=c++11

I tried setting CPP11COMPILER to yes after forking nnForge and it seems that line 75 in Main.mk is broken. The rule is -
NVCCFLAGS+=-DNNFORGE_CPP11COMPILER -std=c++11
and I get the error -
nvcc -c absolute_layer_tester_cuda.cu -use_fast_math -DBOOST_NOINLINE='__attribute__ ((noinline))' -g -lineinfo -DENABLE_CUDA_PROFILING -Xcompiler="-I/usr/lib/x86_64-linux-gnu/include -I/usr/local/cuda-6.0/include -I/home/agoswami/cudnn-6.5-linux-x64-v2-rc2 -I/usr/lib/x86_64-linux-gnu/include -ffast-math -march=native -mfpmath=sse -msse2 -g -DENABLE_CUDA_PROFILING" -DNNFORGE_CPP11COMPILER -std=c++11 -gencode=arch=compute_30,code=sm_30 -gencode=arch=compute_35,code="sm_35,compute_35" -o absolute_layer_tester_cuda.o
nvcc fatal : Unknown option 'std'
make: *** [absolute_layer_tester_cuda.o] Error 255

So I tried changing the rule to -
NVCCFLAGS+=-DNNFORGE_CPP11COMPILER -Xcompiler="-std=c++11"
but that gives a bunch of errors in every *.cu file that uses C++11 (my nvcc and gcc are Cuda compilation tools release 6.0, V6.0.1 and gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2; I also tried Cuda compilation tools release 7.0, V7.0.17 and release 6.5, V6.5.12). The errors are:
nvcc -c hyperbolic_tangent_layer_updater_cuda.cu -use_fast_math -DBOOST_NOINLINE='__attribute__ ((noinline))' -g -lineinfo -DENABLE_CUDA_PROFILING -Xcompiler="-I/usr/lib/x86_64-linux-gnu/include -I/usr/local/cuda-6.0/include -I/home/agoswami/cudnn-6.5-linux-x64-v2-rc2 -I/usr/lib/x86_64-linux-gnu/include -ffast-math -march=native -mfpmath=sse -msse2 -g -DENABLE_CUDA_PROFILING" -DNNFORGE_CPP11COMPILER -Xcompiler="-std=c++11" -gencode=arch=compute_30,code=sm_30 -gencode=arch=compute_35,code="sm_35,compute_35" -o hyperbolic_tangent_layer_updater_cuda.o
/usr/lib/gcc/x86_64-linux-gnu/4.8/include/stddef.h(432): error: identifier "nullptr" is undefined
.......

Then I tried disabling CPP11COMPILER by setting it to "no", which gives the following errors (my Boost version is 1.54.0):
nvcc -c hyperbolic_tangent_layer_updater_cuda.cu -use_fast_math -DBOOST_NOINLINE='__attribute__ ((noinline))' -g -lineinfo -DENABLE_CUDA_PROFILING -Xcompiler="-I/usr/lib/x86_64-linux-gnu/include/boost/tr1/tr1 -I/usr/lib/x86_64-linux-gnu/include -I/usr/local/cuda/include -I/home/agoswami/cudnn-6.5-linux-x64-v2-rc2 -I/usr/lib/x86_64-linux-gnu/include -ffast-math -march=native -mfpmath=sse -msse2 -g -DENABLE_CUDA_PROFILING" -gencode=arch=compute_30,code=sm_30 -gencode=arch=compute_35,code="sm_35,compute_35" -o hyperbolic_tangent_layer_updater_cuda.o
In file included from /usr/include/c++/4.8/random:35:0,
from ../nn_types.h:20,
from ../layer_data.h:19,
from ../layer.h:21,
from layer_tester_cuda.h:19,
from absolute_layer_tester_cuda.h:19,
from absolute_layer_tester_cuda.cu:17:
/usr/include/c++/4.8/bits/c++0x_warning.h:32:2: error: #error This file requires compiler and library support for the ISO C++ 2011 standard. This support is currently experimental, and must be enabled with the -std=c++11 or -std=gnu++11 compiler options.
#error This file requires compiler and library support for the
^

Finally, I tried CPP11COMPILER set to "yes" but with NVCCFLAGS modified to drop -std=c++11; that gives the same error as above.

Am I missing something?

Reimplement nn language model (bengio et al 2003)

Hello,

Thanks for your great tool.

I am trying to reimplement the neural network language model described in Bengio et al. 2003 with your library.

The vocabulary has for example 5000 words.
Each word is represented by a vector of size 5000 that is all zeros except for a 1 at the index of that word.
The input is a vector of size 15000 (3*5000) that represents 3 words.
The first layer is a "shared mapping layer": a block-diagonal matrix is applied to the input vector, and there is no nonlinear activation function (i.e. the activation function is the identity). For example, if we want to represent each word as a dense vector of size 100, then the matrix will have 3 blocks of size 100 x 5000.

Would you be able to provide a hint about how to define this shared mapping layer in the get_schema() method ?

Thank you very much,

Juan

cuda fails to build when c++11

Hey,

I am trying to build nnForge on my machine:
Ubuntu 14.04 64-bit with CUDA 6.0

I am running into two different issues:
(1) When CPP11COMPILER=yes the CUDA modules fail to build with the error:

nvcc -c absolute_layer_hessian_cuda.cu -use_fast_math -DBOOST_NOINLINE='__attribute__ ((noinline))' -O3 -DENABLE_CUDA_PROFILING -Xcompiler="-I/usr/local/include -I/usr/local/cuda/include -I/usr/local/include -ffast-math -march=native -mfpmath=sse -msse2  -O3 -DENABLE_CUDA_PROFILING" -gencode=arch=compute_20,code=sm_20 -gencode=arch=compute_30,code=sm_30 -gencode=arch=compute_35,code=\"sm_35,compute_35\" -o absolute_layer_hessian_cuda.o
In file included from /usr/include/c++/4.8/random:35:0,
                 from ../nn_types.h:20,
                 from ../layer_data.h:20,
                 from ../layer.h:21,
                 from layer_hessian_cuda.h:19,
                 from absolute_layer_hessian_cuda.h:19,
                 from absolute_layer_hessian_cuda.cu:17:
/usr/include/c++/4.8/bits/c++0x_warning.h:32:2: error: #error This file requires compiler and library support for the ISO C++ 2011 standard. This support is currently experimental, and must be enabled with the -std=c++11 or -std=gnu++11 compiler options.
 #error This file requires compiler and library support for the \

(2) With CPP11COMPILER=no the core library itself fails to build with the error:

g++ -I/usr/local/include/boost/tr1/tr1 -I/usr/local/include -I/usr/local/include -ffast-math -march=native -mfpmath=sse -msse2  -O3 -DENABLE_CUDA_PROFILING   -c -o absolute_layer.o absolute_layer.cpp
In file included from /usr/include/c++/4.8/random:35:0,
                 from nn_types.h:20,
                 from layer_data.h:20,
                 from layer.h:21,
                 from absolute_layer.h:19,
                 from absolute_layer.cpp:17:
/usr/include/c++/4.8/bits/c++0x_warning.h:32:2: error: #error This file requires compiler and library support for the ISO C++ 2011 standard. This support is currently experimental, and must be enabled with the -std=c++11 or -std=gnu++11 compiler options.
 #error This file requires compiler and library support for the \
  ^
In file included from layer_data.h:20:0,
                 from layer.h:21,
                 from absolute_layer.h:19,
                 from absolute_layer.cpp:17:
nn_types.h:41:25: error: ‘tr1’ in namespace ‘std’ does not name a type
 #define nnforge_mt19937 std::tr1::mt19937
                         ^
rnd.h:23:10: note: in expansion of macro ‘nnforge_mt19937’
  typedef nnforge_mt19937 random_generator;
          ^
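A workaround worth trying (an assumption, not a verified fix: it supposes the failure comes from nvcc's host compilation pass not receiving the C++11 flag, since nvcc of this vintage does not accept a bare -std=c++11): forward the flag to the host compiler inside the -Xcompiler string that the build already passes, instead of dropping it.

```shell
# Sketch only: add -std=c++11 to the host-compiler flags that nvcc forwards
# via -Xcompiler, rather than passing it to nvcc directly. Flags abbreviated
# relative to the full command shown above.
nvcc -c absolute_layer_hessian_cuda.cu -use_fast_math \
    -Xcompiler="-std=c++11 -I/usr/local/include -I/usr/local/cuda/include -O3" \
    -gencode=arch=compute_30,code=sm_30 \
    -o absolute_layer_hessian_cuda.o
```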

Error compiling examples - image_classifier_demo and gtsrb

I started by building the sources under nnforge, nnforge/plain and nnforge/cuda.
In my Settings.mk, I have:
ENABLE_CUDA_BACKEND=yes
CPP11COMPILER=no
NETCDF_INSTALLED=no
MATIO_INSTALLED=no

When I run "make" under examples/image_classifier_demo, I get the following errors (symbols not found in the generated libraries):
g++ -o ../../bin/image_classifier_demo image_classifier_demo.o image_classifier_demo_toolset.o -lnnforge_cuda -lnnforge_plain -lnnforge -L../../lib -L/usr/lib -lboost_thread -lboost_regex -lboost_chrono -lboost_filesystem -lboost_program_options -lboost_random -lboost_system -lboost_date_time -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -lcusparse -lcublas -lcudart -L/usr/lib/x86_64-linux-gnu//lib -lopencv_highgui -lopencv_imgproc -lopencv_core -fopenmp
image_classifier_demo_toolset.o: In function `image_classifier_demo_toolset::image_classifier_demo_toolset(std::shared_ptr<nnforge::factory_generator>)':
image_classifier_demo_toolset.cpp:(.text+0x6dc): undefined reference to `nnforge::neural_network_toolset::neural_network_toolset(std::shared_ptr<nnforge::factory_generator>)'
image_classifier_demo_toolset.o: In function `image_classifier_demo_toolset::run_classifier_loop()':
image_classifier_demo_toolset.cpp:(.text+0x19e7): undefined reference to `nnforge::network_tester::set_data(std::shared_ptr<nnforge::network_data>)'
image_classifier_demo_toolset.o: In function `image_classifier_demo_toolset::init_input_config()':
image_classifier_demo_toolset.cpp:(.text+0x255d): undefined reference to `nnforge::network_schema::operator std::vector<std::shared_ptr<nnforge::layer const>, std::allocator<std::shared_ptr<nnforge::layer const> > > const&() const'
image_classifier_demo_toolset.o:(.rodata._ZTV29image_classifier_demo_toolset[_ZTV29image_classifier_demo_toolset]+0x98): undefined reference to `nnforge::neural_network_toolset::get_validators_for_training(std::shared_ptr<nnforge::network_schema>)'
image_classifier_demo_toolset.o:(.rodata._ZTV29image_classifier_demo_toolset[_ZTV29image_classifier_demo_toolset]+0xb8): undefined reference to `nnforge::neural_network_toolset::run_test_with_unsupervised_data(std::vector<std::shared_ptr<nnforge::output_neuron_value_set>, std::allocator<std::shared_ptr<nnforge::output_neuron_value_set> > >&)'
image_classifier_demo_toolset.o:(.rodata._ZTV29image_classifier_demo_toolset[_ZTV29image_classifier_demo_toolset]+0x118): undefined reference to `nnforge::neural_network_toolset::get_samples_for_snapshot(std::shared_ptr<nnforge::network_data>, std::shared_ptr<nnforge::unsupervised_data_reader>, unsigned int)'
collect2: error: ld returned 1 exit status
make: *** [../../bin/image_classifier_demo] Error 1

Similar errors for examples/gtsrb:
g++ -o ../../bin/gtsrb gtsrb.o gtsrb_toolset.o -lnnforge_cuda -lnnforge_plain -lnnforge -L../../lib -L/usr/lib -lboost_thread -lboost_regex -lboost_chrono -lboost_filesystem -lboost_program_options -lboost_random -lboost_system -lboost_date_time -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -lcusparse -lcublas -lcudart -L/usr/lib/x86_64-linux-gnu//lib -lopencv_highgui -lopencv_imgproc -lopencv_core -fopenmp
gtsrb_toolset.o: In function `gtsrb_toolset::gtsrb_toolset(std::shared_ptr<nnforge::factory_generator>)':
gtsrb_toolset.cpp:(.text+0x13d3): undefined reference to `nnforge::neural_network_toolset::neural_network_toolset(std::shared_ptr<nnforge::factory_generator>)'
gtsrb_toolset.o: In function `gtsrb_toolset::get_schema() const':
gtsrb_toolset.cpp:(.text+0x1623): undefined reference to `nnforge::network_schema::add_layer(std::shared_ptr<nnforge::layer const>)'
gtsrb_toolset.cpp:(.text+0x177c): undefined reference to `nnforge::network_schema::add_layer(std::shared_ptr<nnforge::layer const>)'
gtsrb_toolset.cpp:(.text+0x17e3): undefined reference to `nnforge::network_schema::add_layer(std::shared_ptr<nnforge::layer const>)'
gtsrb_toolset.cpp:(.text+0x1826): undefined reference to `nnforge::network_schema::add_layer(std::shared_ptr<nnforge::layer const>)'
gtsrb_toolset.cpp:(.text+0x190b): undefined reference to `nnforge::network_schema::add_layer(std::shared_ptr<nnforge::layer const>)'
gtsrb_toolset.o:gtsrb_toolset.cpp:(.text+0x1a52): more undefined references to `nnforge::network_schema::add_layer(std::shared_ptr<nnforge::layer const>)' follow
gtsrb_toolset.o: In function `gtsrb_toolset::prepare_training_data()':
gtsrb_toolset.cpp:(.text+0x301f): undefined reference to `nnforge::supervised_data_stream_writer::supervised_data_stream_writer(std::shared_ptr<std::ostream>, nnforge::layer_configuration_specific const&, nnforge::layer_configuration_specific const&, nnforge::neuron_data_type::input_type)'
gtsrb_toolset.cpp:(.text+0x3c5e): undefined reference to `nnforge::supervised_data_stream_writer::supervised_data_stream_writer(std::shared_ptr<std::ostream>, nnforge::layer_configuration_specific const&, nnforge::layer_configuration_specific const&, nnforge::neuron_data_type::input_type)'
gtsrb_toolset.o:(.rodata._ZTV13gtsrb_toolset[_ZTV13gtsrb_toolset]+0x98): undefined reference to `nnforge::neural_network_toolset::get_validators_for_training(std::shared_ptr<nnforge::network_schema>)'
gtsrb_toolset.o:(.rodata._ZTV13gtsrb_toolset[_ZTV13gtsrb_toolset]+0xb8): undefined reference to `nnforge::neural_network_toolset::run_test_with_unsupervised_data(std::vector<std::shared_ptr<nnforge::output_neuron_value_set>, std::allocator<std::shared_ptr<nnforge::output_neuron_value_set> > >&)'
gtsrb_toolset.o:(.rodata._ZTV13gtsrb_toolset[_ZTV13gtsrb_toolset]+0x118): undefined reference to `nnforge::neural_network_toolset::get_samples_for_snapshot(std::shared_ptr<nnforge::network_data>, std::shared_ptr<nnforge::unsupervised_data_reader>, unsigned int)'
collect2: error: ld returned 1 exit status
make: *** [../../bin/gtsrb] Error 1

Import a pre-trained TensorFlow model

I'm sorry if this is the wrong section to put this in, but I am wondering: can I use nnForge to import a pre-trained model from TensorFlow? I need a library that can run forward propagation on my net, and nnForge looks promising.
