Git Product home page Git Product logo

Comments (16)

allanleal avatar allanleal commented on June 9, 2024

from autodiff.

ibell avatar ibell commented on June 9, 2024

from autodiff.

ibell avatar ibell commented on June 9, 2024

Well, I was partly correct. There seems to be a bug in autodiff, or perhaps in my understanding, but I don't understand why the two functions don't yield precisely the same result of zero. In the nested case, the gradient is (0,0)

#include <iostream>

// autodiff include
#include "Eigen/Dense"
#include <autodiff/forward.hpp>
#include <autodiff/forward/eigen.hpp>

template <typename T> auto pow2(T x0) { return x0*x0; }

template <typename T>
auto Rosenbrock(T x0, T x1) {
    return 100.0*pow2(pow2(x0) - x1) + pow2(1.0-x0);
}

// Analytic gradient
Eigen::VectorXd Rosenbrock_exact_gradient(const double x0, const double x1) {
    Eigen::VectorXd o(2);
    o(0) = 200 * (pow2(x0) - x1) * (2 * x0) - 2 * (1 - x0);
    o(1) = 200 * (pow2(x0) - x1) * -1;
    return o;
}

// This one returns zero gradient always (BAD)
autodiff::dual Rosenbrockvec(const Eigen::VectorXdual & x) {
    return Rosenbrock(x[0], x[1]);
}

// This one is fine
autodiff::dual Rosenbrockvec1(const Eigen::VectorXdual& x) {
    return 100.0 * pow2(x(0) * x(0) - x(1)) + pow2(1.0 - x(0));
}

int main(){

    Eigen::VectorXd x0(2); x0 << -0.3, 0.5;
    Eigen::ArrayXd gexact = Rosenbrock_exact_gradient(x0(0), x0(1)); 
    std::cout << gexact << std::endl;
    
    Eigen::VectorXdual x0dual = x0.cast<autodiff::dual>();
    Eigen::VectorXd gnest = autodiff::forward::gradient(Rosenbrockvec, autodiff::wrt(x0dual), autodiff::forward::at(x0dual)); 
    Eigen::VectorXd gnonest = autodiff::forward::gradient(Rosenbrockvec1, autodiff::wrt(x0dual), autodiff::forward::at(x0dual));
    
    std::cout << gnest.array() - gexact.array() << " must be zero\n";
    std::cout << gnonest.array() - gexact.array() << " must be zero\n";
}

from autodiff.

ibell avatar ibell commented on June 9, 2024

Another point is that this seems to work ok in Release builds, but not in debug, which is even weirder.

from autodiff.

allanleal avatar allanleal commented on June 9, 2024

My guess is this has something to do with the pass by copy in auto Rosenbrock(T x0, T x1) and the return type of VectorXdual::operator[]. Checking here.

I'm getting a segmentation fault in release. I'll investigate this later today.

from autodiff.

ibell avatar ibell commented on June 9, 2024

from autodiff.

allanleal avatar allanleal commented on June 9, 2024

I'm still deciding what to do here. It seems that if we stick with expressions being created at compile-time, nesting template functions like your use case won't work. For example:

template <typename T>
T pow2(const T x0) { return x0 * x0; }

When T is double, x0 * x0 produces a double, fine.
When T is dual, x0 * x0 produces a BinaryExpr<MultOp, const dual&, const dual&>, which is not of type T, and thus a compilation failure.

If auto is used as the return type, a segmentation fault is happening; still not clear how to fix this (I had a busy day today and only checking this now after office hours).

I'll try to get this fixed today or tomorrow.

from autodiff.

supersega avatar supersega commented on June 9, 2024

Hi @ibell @allanleal, I took a look at this example and found that problem is this code in

template <typename T>
T Rosenbrock(const T x0, const T x1) {
    return 100.0 * pow2(pow2(x0) - x1) + pow2(1 - x0);
}

For example result of 1 - x0 is expression and we have next function after substitution of template arguments

Xpr pow2(const Xpr x0);

So, simpiest way change Rosenbrock to

template <typename T>
T Rosenbrock(const T x0, const T x1) {
    return 100.0 * pow2<T>(pow2(x0) - x1) + pow2<T>(1 - x0);
}

As we can see the problems related to that fact 'result of an operation is Xpr, not a dual'...
Using templates in such cases, not best practice =) And I see that this problem similar to Eigen https://eigen.tuxfamily.org/dox/classEigen_1_1DenseBase.html#aa73e57a2f0f7cfcb4ad4d55ea0b6414b. Maybe we can add eval() member function.

P.S. I found that we need '<<' operator for expression

from autodiff.

ibell avatar ibell commented on June 9, 2024

Any solution that doesn't involve changes to the guts of Rosenbrock? This is a simple example, real problems involve much more complicated interplay between functions.

from autodiff.

allanleal avatar allanleal commented on June 9, 2024

Hi @ibell ,

Any solution that doesn't involve changes to the guts of Rosenbrock? This is a simple example, real problems involve much more complicated interplay between functions.

It's a simple example that exposes a use case that complicates matters due to how mathematical operations on dual types are implemented (generating expression trees at compile-time).

@supersega 's suggestion to explicitly specify the return type of pow2 is a solution (which I would say should be something temporary, as I can only fix this issue next week when I'm back from a workshop abroad).

We currently know that:

T Rosenbrock(const T x0, const T x1);

fails because the return type is not the same as the type of arguments x0 and x1 when these are dual (more specifically, BinaryExpr<MulOp, const dual&, const dual&>).

I have detected that when we use auto as return type:

auto Rosenbrock(const T x0, const T x1);

a const BinaryExpr<MulOp, const dual&, const dual&>& (a const-ref) is produced and joined with the other expressions created with further math operations; this should not happen! This is causing a segmentation fault. Instead, a copy of that expression node (BinaryExpr<MulOp, const dual&, const dual&>) should be attached to the rest of the expression tree.

It is possible to fix this; but I can't promise a fix this week.

I found that we need '<<' operator for expression

@supersega I've created issue #65 to track this need.

from autodiff.

supersega avatar supersega commented on June 9, 2024

@allanleal Looks like we should avoid perfect forwarding when create expressions, since we need copy if we got lvalue reference(in some cases)

from autodiff.

allanleal avatar allanleal commented on June 9, 2024

Yes! I was too greed on perfect forwarding and this is now hurting with this use case of @ibell ! :)

We might need to check the type of L and R expressions: if Dual, we do perfect forwarding, otherwise we don't (for BinaryExpr, UnaryExpr types).

from autodiff.

supersega avatar supersega commented on June 9, 2024

We might need to check the type of L and R expressions: if Dual, we do perfect forwarding, otherwise we don't (for BinaryExpr, UnaryExpr types).

We also can have references to temporary duals (pass to function by value) and than expression will be invalid. Hope it’s clear what I mean))

from autodiff.

allanleal avatar allanleal commented on June 9, 2024

If temporary dual (as in dual(10) * dual(20)) then L&& and R&& will evaluate to dual&&; in this case we want to make sure the multiplication results in BinaryExpr<MulOp, dual, dual>, and not BinaryExpr<MulOp, dual&&, dual&&>; you're right.

from autodiff.

allanleal avatar allanleal commented on June 9, 2024

As discussed in the PR #68 , I agree with @ibell that that fix only partially solves this issue. However, this is mainly due to C++ inability to have a proper static checker for dangling references. If dual is not used within generic functions (template functions with generic number types), the issue discussed in this thread does not exist.

I'll provide a real type within the next days (or within a week or two) that does not face this issue. real numbers will allow higher-order derivatives, like dual. A type VectorXr will also be implemented (effectively, an Eigen vector of real numbers).

If you have a function VectorXr f(const VectorXr& x) and a direction vector VectorXr v, we'll be able to compute the following directional derivative:

VectorXr dfdv = derivative(f, along(v), at(x));

Derivatives along a given coordinate, or variable, for example, along x[0], just require a direction vector that is a unit vector along that direction. So, the following will (should) also be possible:

VectorXr dfdx0 = derivative(f, wrt(x[0]), at(x));

from autodiff.

ibell avatar ibell commented on June 9, 2024

I very much look forward to this real datatype. I suppose that will be by default a double underlying type, and otherwise you can specify the numerical type?

The directional derivative would be very slick!

from autodiff.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.