Comments (20)
from knet.jl.
Knet7 had a CPU conv implementation written by Onur Kuru in
https://github.com/denizyuret/Knet.jl/blob/master/deprecated/src7/util/conv_pool_cpu.jl
This has not been ported or tested on Knet8 yet; it is on the todo list.
On Wed, Nov 2, 2016 at 7:03 PM niczky12 [email protected] wrote:
Just wondering, but is there a way to use `conv` and `pool` without a GPU?
I'm running a Windows machine, and even though I have an NVIDIA card
installed, I failed to install CUDA. If any of you have tips on how to get
this working, that would be appreciated. Thanks!
from knet.jl.
Some experimental code is in the cpuconv branch. Not all padding/stride options are supported. It is slow and not fully tested.
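For reference, the padding/stride options mostly boil down to the standard output-size formula; a one-line sketch (an illustrative helper, not the cpuconv branch's actual code):

```julia
# Standard output-size formula for one dimension of a convolution window.
# Illustrative helper only, not Knet's actual code.
outsize(n, k; pad = 0, stride = 1) = div(n + 2pad - k, stride) + 1
```

For example, a 5-wide kernel over a 28-wide input gives 24 outputs with no padding, and 14 with `pad = 2, stride = 2`.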
from knet.jl.
Onur's latest cpu conv code: https://github.com/kuruonur1/CNN.jl
from knet.jl.
This is incorporated in the latest master. We can try to make it more efficient. We should also find open source kernels to try, from ArrayFire, Nervana, etc., to replace cuDNN and to inform more efficient CPU implementations. I am keeping this issue open for ongoing work.
from knet.jl.
Mocha.jl has CPU implementations; we should check out their speed.
from knet.jl.
Working on integrating Mocha CPU conv/pool under mochaconv branch.
from knet.jl.
Mocha CPU conv/pool kernels have been integrated. They utilize multiple cores using OpenMP. I don't think the CPU conv/pool speed is going to get much better; they are about 10x slower than the GPU. It may be possible to have a single im2col operation instead of one for each image.
I am leaving this issue open for now to see if (1) we can find better CPU kernels, (2) we can find better open source GPU kernels to replace cuDNN.
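The im2col trick mentioned above, lowering convolution to a single GEMM, can be sketched in a few lines of Julia for the single-channel case (`im2col` and `conv_gemm` are made-up names for illustration, not Knet's actual kernels):

```julia
# Unroll k×k patches of a single-channel image into columns so that the
# convolution becomes one matrix multiply (GEMM). Illustrative sketch only.
function im2col(img::AbstractMatrix, k::Integer)
    H, W = size(img)
    Ho, Wo = H - k + 1, W - k + 1
    cols = similar(img, k * k, Ho * Wo)
    col = 1
    for j in 1:Wo, i in 1:Ho              # column-major over output positions
        cols[:, col] = vec(@view img[i:i+k-1, j:j+k-1])
        col += 1
    end
    return cols
end

# Cross-correlation (the usual CNN "conv", no kernel flip) as a single GEMM.
function conv_gemm(img, w)
    k = size(w, 1)
    Ho, Wo = size(img, 1) - k + 1, size(img, 2) - k + 1
    return reshape(vec(w)' * im2col(img, k), Ho, Wo)
end
```

Batching is then a matter of unrolling all images into one big column matrix and doing one GEMM, instead of one GEMM per image.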
from knet.jl.
For CPU, you can look at what we did for n-dimensional convolutions (we used `conv2` when we should have used `convnd`) in Seep.jl here. We are currently looking into using CUDAnative.jl and LLVM for julia-0.6 to produce efficient GPU kernels.
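A direct N-dimensional cross-correlation in the `convnd` spirit can be written generically in Julia (a minimal sketch, valid padding and stride 1 only; not Seep.jl's actual implementation):

```julia
# A minimal direct N-dimensional cross-correlation (valid padding, stride 1),
# generic over dimensionality via CartesianIndices. Illustrative sketch only.
function convnd(x::AbstractArray{T,N}, w::AbstractArray{T,N}) where {T,N}
    y = zeros(T, size(x) .- size(w) .+ 1)
    for I in CartesianIndices(y)          # each output position
        s = zero(T)
        for J in CartesianIndices(w)      # each kernel tap
            s += x[I + J - oneunit(J)] * w[J]
        end
        y[I] = s
    end
    return y
end
```

The same loop handles 1-D, 2-D, and 3-D arrays, which is the point of going through `CartesianIndices` rather than hardcoding `conv2`.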
from knet.jl.
That's great news! I would love to try some open source gpu kernels when you guys have something ready to test. I haven't looked at CudaNative yet, but if I can help with benchmarking etc. let me know.
For CPU, Onur's implementation also used `conv2`, but it was too slow. In the latest release I adapted the C++ kernels from Mocha.jl, which use OpenMP and work pretty fast. See Knet.jl/prof/conv.jl for some benchmarking results; we should compare with the Seep.jl implementation.
from knet.jl.
Thanks for the CPU references. I had meant that we extended the name `conv2` when in fact it is an N-dimensional implementation. We avoided doing an im2col operation because it uses too much memory when building the graph. We haven't done much benchmarking, and we are also very limited in our ability to release code updates.
You should also look at ImageFiltering.jl. Tim Holy has made a lot of optimizations for doing efficient convolutions on images with `imfilter`. No gradients though.
from knet.jl.
The latest benchmarks from @ilkarman (https://github.com/ilkarman/DeepLearningFrameworks) show our CPU implementation to be quite inefficient. There is a new thread at https://discourse.julialang.org/t/on-machine-learning-and-programming-languages/7574/30 suggesting alternatives. We need volunteers to reimplement the CPU convolution operations using Intel MKL.
Dynet-benchmarks by @ilkerkesen show a similar trend for our CPU implementation of the cuDNN RNN kernels. Knet compares very well to Chainer and DyNet on the GPU, but the CPU performance is lacking. A similar volunteer effort is needed there.
from knet.jl.
Also see fb.me/83w6aHEJO
With Onur's summary from 3/28/16 (translated from Turkish):
They used Fourier or Winograd transforms for convolution. For several network configurations they show it running 2 to 4 times faster than im2col + gemm (which is what we use).
The repo is here: https://github.com/Maratyszcza/NNPACK
It is written in C and can be compiled and called from Julia. However, it currently has two limitations:
- Only convolutional layers without stride are currently supported (stride=1?)
- Only 2x2 pooling is currently supported
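The Winograd transform NNPACK uses can be illustrated with the smallest 1-D case, F(2,3): two outputs of a 3-tap filter computed with 4 multiplies instead of 6. The transform matrices below are the standard ones from Lavin & Gray (arXiv:1509.09308); this is a toy sketch, not NNPACK's code:

```julia
# Winograd F(2,3): two outputs of a 3-tap filter from a 4-sample tile using
# 4 elementwise multiplies instead of 6. Toy 1-D illustration only.
const BT = [1.0  0 -1  0;
            0    1  1  0;
            0   -1  1  0;
            0    1  0 -1]          # input transform
const G  = [1.0  0   0;
            0.5  0.5 0.5;
            0.5 -0.5 0.5;
            0    0   1]            # filter transform
const AT = [1.0  1  1  0;
            0    1 -1 -1]          # output transform

# d: 4 input samples, g: 3 filter taps; returns 2 cross-correlation outputs.
winograd_f23(d, g) = AT * ((G * g) .* (BT * d))
```

The savings compound in the 2-D tiled case F(2x2,3x3), which is where the reported 2-4x speedups over im2col + gemm come from.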
from knet.jl.
Hey, jumping into the thread here. Are there any current plans for addressing CPU speed in Knet? I really like Knet since it is native to Julia and nice to work with. However, I'm stuck with a CPU for a while and would like to get MXNet.jl performance if possible.
from knet.jl.
Maybe the code from https://github.com/CNugteren/CLBlast
could be helpful, as an alternative to BLAS/clBLAS. This code supports FP16 compute. For convolutions using matrix multiplies, see https://arxiv.org/abs/1704.04428.
from knet.jl.
https://github.com/intel/mkl-dnn may be a good solution?
from knet.jl.
https://discourse.julialang.org/t/knet-vs-flux-etc/17057/10?u=denizyuret shows that Flux is faster in CPU convolutions. Mike Innes says: "(Flux uses) NNlib’s pure-Julia convolutions vs Knet’s threaded C++ ones, although NNlib is soon to move to NNPACK".
from knet.jl.
There is a julia wrapper for NNPACK intended to be used in NNlib.jl for Flux.
https://github.com/avik-pal/NNPACK.jl
The problem with NNPACK is that for small batch sizes it is slower than NNlib.jl's native Julia conv.
FluxML/NNlib.jl#67 (comment)
Similarly, NNPACK is also slower than PyTorch's conv at small batch sizes.
pytorch/pytorch#2826 (comment)
Apparently they don't utilize NNPACK for now. But if they do, it seems they will resort to a heuristic-based approach to switch between the default conv and the NNPACK implementation depending on input parameters such as batch size and number of channels.
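Such a heuristic switch might look like the following (a hypothetical sketch; the thresholds are invented for illustration and are not PyTorch's or NNlib's actual values):

```julia
# Hypothetical backend dispatch: pick NNPACK only where it tends to win
# (larger batches / channel counts), fall back to the default conv otherwise.
# The thresholds are made up for illustration.
function conv_backend(batchsize::Integer, channels::Integer)
    (batchsize >= 16 && channels >= 8) ? :nnpack : :default
end
```

The conv entry point would then call whichever kernel the heuristic selects, so small-batch workloads never pay NNPACK's overhead.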
There are other problems with NNPACK:
- It does not support 3D conv and pooling:
Maratyszcza/NNPACK#138 (comment)
- It does not support strided convolution for training:
Maratyszcza/NNPACK#139 (comment)
from knet.jl.
@cemilcengiz, we are trying to pass CI tests on Windows, ARM, etc. with @ianshmean, and the CPU conv kernels are causing trouble. (1) Is NNlib's pure Julia implementation comparable in speed to our CPU kernels? (2) Does NNPACK require any compiling or library installation? (3) Is there any progress/improvement in any of the solutions mentioned above (mkl-dnn, Seep.jl, ImageFiltering.jl, CLBlast)?
My current concern is ease of installation rather than speed, so if it is not too much slower, I'd like to go with a pure Julia solution.
from knet.jl.
#494 switches to NNlib for CPU conv/pool.