Comments (3)
@qipeng has solved this issue in #741.
from caffe.
Hi
I might be mistaken, but I don't think your interpretation of the Bengio et al. paper is right. They show that the parameter update (Formula 7) is the same as the one in regular momentum (Formula 5), except for different coefficients. These coefficients, however, are not the same as those used to update the velocity (Formula 6) (which would make it completely the same). That's what makes the difference (although probably a rather slight one?).
Hi @pwohlhart, due to a limitation of the current gradient-based solver, which evaluates the gradient only once and updates the parameters only once per iteration, my implementation is slightly different from (and perhaps slightly faster than) the original NAG.
Each iteration of the standard NAG can be viewed as:
- Update the current parameters to a "future point" with the current velocity
- Evaluate the gradient at that point
- "Undo" the update
- Update the velocity with the gradient at the future point
- Update the parameters with the new velocity
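The five steps above can be sketched in a few lines of Python. This is only an illustration, not Caffe code: the toy objective, the `grad` helper, the `nag_step` name, and the default `lr` and `momentum` values are all assumptions made for the example, and the velocity is taken to be added to the parameters (so the learning rate enters with a minus sign).

```python
def grad(theta):
    # Gradient of a toy objective f(theta) = 0.5 * 3.0 * theta**2 (hypothetical).
    return 3.0 * theta


def nag_step(theta, velocity, lr=0.1, momentum=0.9):
    """One iteration of standard NAG, written as the five steps listed above."""
    theta = theta + momentum * velocity      # move to the "future point"
    g = grad(theta)                          # evaluate the gradient there
    theta = theta - momentum * velocity      # "undo" the temporary move
    velocity = momentum * velocity - lr * g  # update the velocity
    theta = theta + velocity                 # apply the new velocity
    return theta, velocity


theta, velocity = 1.0, 0.0
for _ in range(200):
    theta, velocity = nag_step(theta, velocity)
print(theta)  # should be very close to the minimum at 0
```

Note that each iteration touches the parameters three times, which is what a solver limited to one parameter update per iteration cannot do directly.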
Due to the aforementioned limitations, my implementation is:
- Evaluate the gradient at a "future point"
- Add a negative velocity to the parameter update
- Update the velocity, and add the new velocity to the parameter update (multiplied by `1+momentum` to update the parameters to the "future point" of the next iteration)
- Update the parameters with their corresponding updates
Here several parameter updates in the original algorithm are consolidated.
The only slight difference between this method and the standard NAG is that the parameter states between iterations are always the "future point" of that iteration, i.e. `theta + momentum * velocity`. This shouldn't cause too big of a problem, as the gradient and/or learning rate are usually close to zero when the optimization approaches its end.
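To make the relationship concrete, here is a rough Python sketch comparing the two schemes on a toy problem. It is not the Caffe implementation: the function names, the toy gradient, and the hyperparameters are hypothetical, and it assumes the "negative velocity" term is the old velocity scaled by the momentum (the amount needed to undo the look-ahead). Under those assumptions, the stored parameter `phi` in the consolidated version should track `theta + momentum * velocity` from the standard version.

```python
def grad(x):
    # Gradient of a toy objective f(x) = 0.5 * 3.0 * x**2 (hypothetical).
    return 3.0 * x


def standard_nag(theta, v, lr, mu, steps):
    # Textbook form: gradient at the look-ahead point, then a single update.
    for _ in range(steps):
        g = grad(theta + mu * v)
        v = mu * v - lr * g
        theta = theta + v
    return theta, v


def consolidated_nag(phi, v, lr, mu, steps):
    # phi is the stored parameter, assumed to already be the "future point".
    for _ in range(steps):
        g = grad(phi)             # one gradient evaluation per iteration
        update = -mu * v          # step back from the old look-ahead
        v = mu * v - lr * g       # update the velocity
        update += (1.0 + mu) * v  # step to the next iteration's future point
        phi = phi + update        # one parameter update per iteration
    return phi, v


lr, mu, steps = 0.1, 0.9, 25
theta, v_std = standard_nag(1.0, 0.0, lr, mu, steps)
# With zero initial velocity, the future point equals the starting parameter.
phi, v_con = consolidated_nag(1.0, 0.0, lr, mu, steps)
gap = abs(phi - (theta + mu * v_std))
print(gap)  # expected to be at floating-point noise level
```

In this sketch the two velocity sequences coincide, and the stored parameters differ only by the `mu * v` offset, which shrinks as the velocity decays near convergence.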