Comments (6)
This should be what you get when doing optx.minimise(..., solver=optx.Newton()). :) (EDIT: wrong way around, whoops, see below.)
On line searches: yup, we support a few of these, in particular backtracking.
from optimistix.
My understanding was that optx.Newton() was a root finder, not a minimiser (though I guess minimising = root-finding on the gradient function...). I'm looking for something like this - there are almost too many Newton algorithms*
A minimal example below showcases some of my confusion - it works with the BFGS solver but not with Newton (since the function has a minimum but no root).
import optimistix as optx
import jax.numpy as np

gamma = 10

def fn(x, args):
    return x[0]**2 + gamma * x[1]**2 + 5

# solver = optx.Newton(1e-3, 1e-3)  # fails: fn has a minimum but no root
solver = optx.BFGS(1e-3, 1e-3)
res = optx.minimise(fn, solver, y0=np.array([1.0, 1.0]), max_steps=int(1e4))
print(res.stats["num_steps"], res.value)
Though I now also see that this works:
import jax

solver = optx.Newton(1e-3, 1e-3)
res = optx.root_find(jax.grad(fn), solver, y0=np.array([1.0, 1.0]), max_steps=int(1e4))
print(res.stats["num_steps"], res.value)
*please eliminate three. P.S. I am not a crackpot
from optimistix.
We deliberately kept Newton as a root-finder and not a general-purpose minimisation algorithm. Newton's method for minimisation refers to the minimiser you get by applying Newton's method for root-finding to the gradient of the function, as you've done in your example. This means for non-convex problems Newton is happy to converge to any critical point, and not just minima. We've excluded it to deter people from using it without knowing this, and to encourage use of quasi-Newton methods instead. Quasi-Newton methods converge to non-minimal critical points less frequently.
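To make that failure mode concrete, here is a minimal pure-Python sketch (the function f(x, y) = x**2 - y**2 and all names here are illustrative, not optimistix API). Its only critical point is a saddle at the origin, and Newton's method on the gradient happily "converges" to it:

```python
# Newton's method applied to grad f, for f(x, y) = x**2 - y**2.
# The only critical point, (0, 0), is a saddle -- yet Newton converges to it.
def newton_step(x, y):
    gx, gy = 2 * x, -2 * y      # grad f
    dx, dy = gx / 2, gy / -2    # solve H d = g with H = diag(2, -2)
    return x - dx, y - dy

pt = (1.0, 1.0)
for _ in range(5):
    pt = newton_step(*pt)
print(pt)  # (0.0, 0.0): a saddle point, not a minimum
```

A quasi-Newton method like BFGS, by contrast, builds a positive-definite Hessian approximation, so its steps are always descent directions and it is much less likely to settle at such a point.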
If you want to apply Newton's method regardless, you can apply it to the gradient as you've done. Unfortunately this doesn't support line searches. You could implement a custom Newton solver with line search quite easily by subclassing AbstractMinimiser and providing the true Hessian in the function info. Take a look at the code for any of the existing minimisers for how to do this, and let me know if you run into trouble.
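If you do go the custom-solver route, the core loop is just Newton with a backtracking (Armijo) line search. Here is a framework-free sketch of that loop, using the quadratic from the example above with hand-written gradient and diagonal-Hessian solve (this is illustrative pure Python, not the optimistix AbstractMinimiser API):

```python
def newton_backtracking(f, grad, hess_solve, x, steps=10):
    """Newton's method with a backtracking (Armijo) line search.

    Assumes hess_solve(x, g) returns the Newton direction d with H(x) d = g,
    and that d is a descent direction (true for positive-definite Hessians).
    """
    for _ in range(steps):
        g = grad(x)
        d = hess_solve(x, g)
        m = sum(gi * di for gi, di in zip(g, d))  # directional derivative
        alpha = 1.0
        # Shrink the step until the sufficient-decrease condition holds.
        while f([xi - alpha * di for xi, di in zip(x, d)]) > f(x) - 1e-4 * alpha * m:
            alpha *= 0.5
        x = [xi - alpha * di for xi, di in zip(x, d)]
    return x

# The quadratic from the example above: minimum at (0, 0).
gamma = 10
f = lambda x: x[0]**2 + gamma * x[1]**2 + 5
grad = lambda x: [2 * x[0], 2 * gamma * x[1]]
hess_solve = lambda x, g: [g[0] / 2, g[1] / (2 * gamma)]  # Hessian is diag(2, 2*gamma)

print(newton_backtracking(f, grad, hess_solve, [1.0, 1.0]))  # [0.0, 0.0]
```

On a quadratic the full Newton step (alpha = 1) already lands on the critical point, so the line search accepts it immediately; on non-quadratic problems the backtracking loop does the work.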
from optimistix.
That all makes sense, thanks for clarifying!
Would Newton solvers that exploit structure (perhaps at the cost of some ease of use) be in scope too? Or maybe these are already out there and I'm missing them? E.g. a solver for a minimisation problem whose Hessian is an arrow matrix. In my mind this could involve reformatting the loss function from fn(x) to f(sparse_params, dense_params) and providing a function that can invert the Hessian of the sparse params using the structure.
from optimistix.
I think this should be doable fairly straightforwardly.
First of all, as a similar example of what is already supported, here's a root-find using Newton's method and a diagonal Jacobian (which takes the place of the Hessian when doing a root-find rather than a minimisation):
import lineax as lx
import optimistix as optx
import jax.numpy as jnp
from jaxtyping import Array, Float
def fn(x: Float[Array, "2"], _) -> Float[Array, "2"]:
    x0, x1 = x
    return jnp.stack([x0**2, (x1 - 3)**2])
solver = optx.Newton(rtol=1e-5, atol=1e-5, linear_solver=lx.Diagonal())
y0 = jnp.array([1., 2.])
sol = optx.root_find(fn, solver, y0, tags=frozenset({lx.diagonal_tag}))
print(sol.value)
Here you can see that we use Lineax to handle linear operators and linear solves.
So in order to support the example you have in mind, you would need to:
- Add an "arrow tag" to indicate the structure of your matrix: arrow_tag = object(). (Akin to the lx.diagonal_tag above.)
- Create an Arrow solver that knows how to solve matrices with that structure. (Akin to the lx.Diagonal above.)
Neither of these requires any changes to Lineax itself, as every built-in solver is defined in the exact same way as one that is defined by a user.
Finally, you can of course do your minimisation by doing a root-find on the gradient, as discussed above. (Which is still generally discouraged as @packquickly notes, but that's up to you. :) )
from optimistix.
Thanks, I'll be sure to have a closer look at Lineax too!
from optimistix.