
Comments (9)

allanleal commented on June 18, 2024

It took 2s here :)

Below are the compilation times for the mentioned examples using the reverse.hpp and forward.hpp header files:

example-reverse-single-variable-function.cpp

real    0m2.007s
user    0m1.840s
sys     0m0.084s

example-forward-single-variable-function.cpp

real    0m0.666s                                                                                       
user    0m0.584s                                                                                       
sys     0m0.040s

Clearly forward mode is a lot faster, which is interesting given that it uses a lot of template metaprogramming.

I'll see what I can do to speed up the compilation when using reverse.hpp, but I can't promise when! ;)
Hopefully it should be just a few minor things to change (fingers crossed)!


kewp commented on June 18, 2024

I'd like to use forward.hpp but I find it a bit trickier to use...

Thanks for taking a look!


allanleal commented on June 18, 2024

I'd like to use forward.hpp but I find it a bit trickier to use...

I'll be investing quite some time in the future in the forward algorithm, because this is what I'll actually need for my research. But, yes, I'll also support the reverse algorithm and improve it as much as I can.

Let me know what makes the forward mode more difficult for your use case, and I'll try to simplify it.


kewp commented on June 18, 2024

About forward mode:

If I understand correctly, in forward mode you calculate the derivative of a function rather than a variable. So

dual u = f(x);                   // the output variable u
VectorXd dudx = gradient(f, x);  // evaluate the gradient vector du/dx

whereas in reverse mode you use a variable, i.e.

var y = f(x);                    // the output variable y
VectorXd dydx = gradient(y, x);  // evaluate the gradient vector dy/dx

Reverse mode is easier to use, I think, because you don't have to worry about how the final variable was built. So for example I could say

var y = f(x) * 2.0 * x;

It's a lot less restrictive. I'm finding it quite hard to convert my code to use forward mode because of this.

Am I making sense?
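
To make the contrast concrete, here is a minimal sketch of the forward-mode calling convention, written against the gradient(f, wrt(x), at(x), u) signature used later in this thread (the header paths also follow the later snippet, and both may differ between autodiff versions); the reverse-mode convention is the two-liner shown above, var y = f(x); followed by gradient(y, x).

#include <iostream>
#include <eigen3/Eigen/Core>
#include <autodiff/autodiff/forward/forward.hpp>
#include <autodiff/autodiff/forward/eigen.hpp>

using namespace Eigen;
using namespace autodiff;
using namespace autodiff::forward;

// In forward mode the function itself is written against dual types,
// because it is the function (not a recorded variable) that gets differentiated.
dual f(const VectorXdual& x)
{
  dual s = 0.0;
  for (Eigen::Index i = 0; i < x.size(); ++i)
    s += x[i] * x[i];   // f(x) = sum of squares
  return s;
}

int main()
{
  VectorXdual x(3);
  x << 1.0, 2.0, 3.0;

  dual u;                                         // receives the value f(x)
  VectorXd dudx = gradient(f, wrt(x), at(x), u);  // du/dx, one forward sweep per component
  std::cout << dudx << std::endl;                 // expecting 2 4 6

  return 0;
}

The essential difference is that in forward mode f itself must accept and return dual types, so that gradient can re-evaluate it once for each seeded input component.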


kewp commented on June 18, 2024

To further illustrate, here is some code I am using in reverse mode:

var w = energy(m, D);

VectorXvar u(2*m.nn);

for (int i = 0; i < m.nn; i++)
{
    u(2*i) = m.one[i];
    u(2*i+1) = m.two[i];
}

VectorXd dwdu = gradient(w, u);

So my objective variable is w, which is calculated by my energy function. Passed into it is a structure m which contains all the dependent variables: two lists called one and two.

I want to get the derivatives of w w.r.t. these dependent variables.

If I understand correctly, in order to do this with forward mode I would have to write the energy function to take in each variable explicitly, i.e. energy(double var1, double var2, double var3... etc.), in order to get the gradient, which I cannot do since the number of dependent variables, well, varies.


ludkinm commented on June 18, 2024

I am making some assumptions: D is a constant that you don't want derivatives wrt (so I've dropped it for now). Your struct m has only one and two which are both (raw) arrays of VectorXd, both of the same length.

I think if you refactor a bit and use std::vector (or std::array if you know the size at compile time rather than run time) you could do:

#include <iostream>
#include <eigen3/Eigen/Core>
#include <autodiff/autodiff/forward/forward.hpp>
#include <autodiff/autodiff/forward/eigen.hpp>

using namespace std;
using namespace Eigen;
using namespace autodiff;
using namespace autodiff::forward;


// a version of the struct - could use std::array if you know the size nn at compile time.
struct M{
  uint nn;
  vector<VectorXdual> one;
  vector<VectorXdual> two;
  M(uint n) : nn(n), one(vector<VectorXdual>(n)), two(vector<VectorXdual>(n)) {}
};


// Just an example energy function
dual energy(M m)
{
  dual out = 0.0;
  for(uint i =0; i < m.nn; ++i){
    out += m.one[i].squaredNorm();
    out -= m.two[i].squaredNorm();
  }
  return out;
}

int main(void)
{
  // construct the struct and fill in the vectors
  M m(10);
  for(uint i =0; i < m.nn; ++i){
    m.one[i] = VectorXdual::Ones(5);
    m.two[i] = 2.0 * VectorXdual::Ones(5);
  }

  dual w;
  VectorXd g;

  g = gradient(energy, wrt(m.one[0]), at(m), w);
  cout << g << endl;		// expecting 2 2 2 2 2

  g = gradient(energy, wrt(m.two[0]), at(m), w);
  cout << g << endl;		// expecting -4 -4 -4 -4 -4

  return 0;
}

Now, getting the derivative wrt each variable requires a loop, but you would need one in the reverse case also, I believe.
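
For instance, a hedged sketch of that loop, continuing the main() above (grad_one and grad_two are hypothetical names; the gradient/wrt/at calls are the same ones used in the snippet):

// Assemble all per-variable derivatives of energy, one forward
// evaluation per wrt(...) direction (continuation of main() above).
vector<VectorXd> grad_one(m.nn), grad_two(m.nn);
for (uint i = 0; i < m.nn; ++i)
{
  grad_one[i] = gradient(energy, wrt(m.one[i]), at(m), w);  // dw/d(one[i])
  grad_two[i] = gradient(energy, wrt(m.two[i]), at(m), w);  // dw/d(two[i])
}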


allanleal commented on June 18, 2024

Now, getting the derivative wrt each variable requires a loop, but you would need one in the reverse case also, I believe.

This is true in the forward case, whose strength is computing directional derivatives, with the derivative along each individual variable being a special case. To compute a directional derivative, only one function evaluation is necessary. If one wants the derivative along all unit/variable directions, then a function evaluation is needed for each such variable (i.e., for each unit direction, along x1, x2, ..., xn).
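
In symbols, a single forward pass seeded with direction v gives the directional derivative, and recovering the full gradient means repeating this once per unit direction:

\nabla f(x) \cdot v \;=\; \left.\frac{d}{d\varepsilon}\, f(x + \varepsilon v)\right|_{\varepsilon = 0},
\qquad
\frac{\partial f}{\partial x_i} \;=\; \nabla f(x) \cdot e_i, \quad i = 1, \dots, n.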

The reverse algorithm can compute the gradient of a multi-variable scalar function in one go. However, this does not mean it is going to be more efficient than the forward algorithm! Reason: the reverse algorithm usually relies on run-time memory allocation and construction of expression trees at run time, things that can slow down the computation a lot.

I have an idea of using directional derivative evaluations (and thus forward automatic differentiation) to compute the gradient of multi-variable scalar functions without requiring n function evaluations. When the gradient is constantly updated (e.g., in an optimization calculation), the previous gradient vector would be used to calculate the new one, perhaps requiring just a few extra forward directional derivatives. However, I have no time to implement this now. In fact, this would not live in autodiff, but in another numerical library I'm planning, which would use autodiff::dual types.

autodiff would then be specialized for forward mode derivative calculations. Those requiring the gradient of scalar multi-variable functions would be able to do so with the other lib, without constructing expression trees at run time or dealing with dynamic memory allocation (i.e., using the var type or its equivalent in other automatic differentiation libraries). By relying always on the forward mode algorithm of autodiff, this would imply the use of only stack-allocated variables and of template meta-programming techniques to optimize the compile-time expression trees (e.g., avoiding temporaries, optimizing trivial expressions and identities, organizing the order of operations, etc.).


ludkinm commented on June 18, 2024

@kewp said

I want to get the derivatives of w w.r.t. these dependent variables.

All I meant was that one would require something like derivative[i] = .... I was not implying it would be more efficient than backward mode - just that accessing the derivatives would require a loop.

autodiff would then be specialized for forward mode derivative calculations.

Are you implying that you will drop backward mode from the library?


allanleal commented on June 18, 2024

Are you implying that you will drop backward mode from the library?

var should continue to be maintained here, but users would be informed of a potentially more efficient gradient evaluation algorithm (one that actually uses the forward algorithm, but does not require n function evaluations) in that other numerical lib mentioned above.
