Comments (3)
The overflow is due to the gradient definition of tanh (g / np.cosh(x) ** 2). It seems that the divisor grows too large for big |x|, which leads to numerical overflow. Let me see whether I can implement it in a more stable way. Thanks for reporting!
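As a minimal sketch of what a more stable form could look like (not necessarily what MinPy ended up doing), the identity 1 / cosh(x)^2 = 1 - tanh(x)^2 avoids evaluating cosh at all. The function names below are hypothetical and plain NumPy is assumed rather than minpy.numpy:

```python
import numpy as np

def tanh_grad_naive(g, x):
    # Gradient form quoted in the comment above; np.cosh(x) grows
    # exponentially in |x| and can exceed the floating-point range.
    return g / np.cosh(x) ** 2

def tanh_grad_stable(g, x):
    # Hypothetical alternative: 1 / cosh(x)**2 == 1 - tanh(x)**2,
    # and tanh(x) always stays within [-1, 1].
    return g * (1.0 - np.tanh(x) ** 2)

x = np.array([0.0, 1.0, 1000.0])
g = np.ones_like(x)

# Silence the overflow warning that the naive form triggers for x = 1000.
with np.errstate(over="ignore"):
    naive = tanh_grad_naive(g, x)

stable = tanh_grad_stable(g, x)
print(naive, stable)
```

For moderate inputs both forms agree; the difference is that the second one never builds a huge intermediate value.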
For the memory issue, @hotpxl could you have a look? I remember there was a weak reference problem in the autograd part before.
from minpy.
A similar issue shows up in the RNN perf test. Please check the code: https://github.com/dmlc/minpy/blob/rnn_perf/examples/nn/rnn_test/rnn_minpy_perf.py (you should copy the file to the examples/nn folder to run it)
I did some math on how much memory we should use:
Input: 256, Hidden: 2560, Out: 1, seq_len: 30, batch_size: 100
Weights:
Wx = 256 * 2560 * 4 / 1024 / 1024 = 2.5M
b = 2560 * 4 / 1024 / 1024 = 0.01M
Wh = 2560 * 2560 * 4 / 1024 / 1024 = 25M
hb = 0.01M
Wout = 0.01M
Sum = 27.53M
Activation:
Input: 100 * 256 * 30 * 4 / 1024 / 1024 = 2.93M
Activation: 100 * 2560 * 30 * 4 / 1024 / 1024 = 29.3M
Sum: 32.23M
When BP is involved, just double the activation space for the error derivatives and add a copy of the weights for their gradients; if we have momentum, add another copy of the weights. Total memory needed should be about:
32.23 * 2 + 27.53 * 3 = 64.46 + 82.59 = 147M
The minpy example seems to use more than 1500M on my device.
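The estimate above can be rechecked with a few lines of Python. This is only a sketch: the variable names are made up for illustration, and float32 storage (4 bytes per value) is assumed from the `* 4` factor in the figures.

```python
BYTES_PER_VALUE = 4          # assumption: float32, matching the "* 4" above
MIB = 1024 * 1024

def mib(num_values):
    # Size in M (MiB) of an array holding num_values float32 entries.
    return num_values * BYTES_PER_VALUE / MIB

n_in, n_hidden, n_out, seq_len, batch_size = 256, 2560, 1, 30, 100

weights = {
    "Wx": mib(n_in * n_hidden),
    "b": mib(n_hidden),
    "Wh": mib(n_hidden * n_hidden),
    "hb": mib(n_hidden),
    "Wout": mib(n_hidden * n_out),
}
activations = {
    "input": mib(batch_size * n_in * seq_len),
    "hidden": mib(batch_size * n_hidden * seq_len),
}

weight_total = sum(weights.values())
activation_total = sum(activations.values())

# Activations doubled for error derivatives; weights tripled for
# values + gradients + momentum.
total = 2 * activation_total + 3 * weight_total

print(f"weights: {weight_total:.2f}M")
print(f"activations: {activation_total:.2f}M")
print(f"total: {total:.2f}M")
```

Changing `batch_size` or `seq_len` shows how quickly the activation term comes to dominate the weight term.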
from minpy.
This should have been solved in #112 and #117.
from minpy.