anassinator / ilqr
Iterative Linear Quadratic Regulator with auto-differentiable dynamics models
License: GNU General Public License v3.0
Hi,
I have tried to implement the cartpole's dynamics in a vectorized way (in cartpole.py), but it failed, probably due to my limited knowledge of Theano.
Basically, instead of defining them this way:
f = T.stack([
    x + x_dot * dt,
    x_dot + x_dot_dot * dt,
    T.sin(next_theta),
    T.cos(next_theta),
    theta_dot + theta_dot_dot * dt,
])
define them somehow like this:
A = np.array([[1, 0, 0, 0, 0],
              [0, 1, 0, 0, 0],
              [0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0],
              [0, 0, 0, 0, 1]])
f = T.dot(A, [x, x_dot, sin_theta, cos_theta, theta_dot]) \
    + T.stack([0, 0, T.sin(next_theta), T.cos(next_theta), 0]) \
    + dt * T.stack([x_dot, x_dot_dot, 0, 0, theta_dot_dot])
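As a sanity check of the decomposition itself, here is a pure-numpy sketch of the same split into a constant linear map plus a nonlinear residual (the function name is mine; in Theano, the raw numpy array A would typically need wrapping, e.g. with T.constant, and the Python list passed to T.dot stacked into a tensor first — a plausible source of the error, though that can't be confirmed without the traceback):

```python
import numpy as np

def cartpole_step(x, x_dot, sin_theta, cos_theta, theta_dot,
                  x_dot_dot, theta_dot_dot, dt):
    """Split the discrete dynamics into a constant linear map A plus
    nonlinear terms, mirroring the Theano expression in the question.
    theta is recovered from the sin/cos encoding of the state."""
    theta = np.arctan2(sin_theta, cos_theta)
    next_theta = theta + theta_dot * dt
    # Linear part: carries x, x_dot and theta_dot through; zeroes the
    # sin/cos components, which are rebuilt nonlinearly below.
    A = np.array([[1, 0, 0, 0, 0],
                  [0, 1, 0, 0, 0],
                  [0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 1]], dtype=float)
    s = np.array([x, x_dot, sin_theta, cos_theta, theta_dot])
    nonlinear = np.array([0.0, 0.0,
                          np.sin(next_theta), np.cos(next_theta), 0.0])
    derivative = dt * np.array([x_dot, x_dot_dot, 0.0, 0.0,
                                theta_dot_dot])
    return A @ s + nonlinear + derivative
```

This reproduces exactly what the element-wise T.stack version computes, which makes it easy to verify the vectorized form before porting it back to Theano.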
I get the following error when running cartpole's notebook:
Thank you for your help!
The input constraints are applied after fit(). Once I take these constrained controls and put them back into the dynamical model of the truck, it no longer reaches the desired goal position at all, which is why I do not understand how the problem can be solved with constraints. My problem statement looks like:
def cost(self):
    state_size = 4   # [x, y, theta, alpha]
    action_size = 1  # [steer]

    # The coefficients weigh how much your state error is worth to you vs
    # the size of your controls. You can favor a solution that uses smaller
    # controls by increasing R's coefficient.
    Q = 0.5 * np.eye(state_size)
    R = 0.5 * np.eye(action_size)

    # This is optional if you want your cost to be computed differently at a
    # terminal state.
    Q_terminal = np.array([[100.0, 0.0, 0.0, 0.0],
                           [0.0, 100.0, 0.0, 0.0],
                           [0.0, 0.0, 100.0, 0.0],
                           [0.0, 0.0, 0.0, 100.0]])

    # Goal state: target position and orientation for the truck.
    x_goal = np.array([68.70061495018659,
                       24.387377335723006,
                       0.7667067876774367,
                       -0.4414311718278174])

    cost = QRCost(Q, R, Q_terminal=Q_terminal, x_goal=x_goal)
    return cost
def ilqr(self):
    # Initial state.
    x0 = np.array([40.00000000000024,
                   14.05,
                   1.5622809766054426,
                   0.008515350184557442])
    us_init = np.array([...])
    N = len(us_init)
    cost = self.cost()
    dynamics = self.dynamic_model()
    ilqr = iLQR(dynamics, cost, N)
    xs, us = ilqr.fit(x0, us_init, n_iterations=100)

    # Clamp the controls, then re-integrate the states so they stay
    # consistent with the saturated inputs.
    us = constrain(us, -0.55, 0.55)
    for i in range(1, len(xs)):
        xs[i][0] = xs[i - 1][0] + np.cos(xs[i - 1][2]) * self.dt
        xs[i][1] = xs[i - 1][1] + np.sin(xs[i - 1][2]) * self.dt
        xs[i][2] = xs[i - 1][2] + (1 / 3.6) * np.tan(us[i - 1]) * self.dt
        xs[i][3] = xs[i - 1][3] - ((1 / 12.086) * np.sin(xs[i - 1][3])
                                   + (1 / 3.6) * np.tan(us[i - 1])) * self.dt
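The clamp-and-reintegrate step in the snippet above can be sketched as a standalone helper; a minimal numpy version, where the function name rollout_clamped and the kinematic constants (wheelbase 3.6, trailer length 12.086, mirroring the 1/3.6 and 1/12.086 factors above) are placeholders for the real vehicle parameters:

```python
import numpy as np

def rollout_clamped(x0, us, dt, u_min=-0.55, u_max=0.55,
                    wheelbase=3.6, trailer=12.086):
    """Clamp the controls first, then re-integrate the kinematic truck
    model so the returned states are consistent with the saturated
    inputs. State is [x, y, theta, alpha]; controls are steering."""
    us = np.clip(np.asarray(us, dtype=float).reshape(-1), u_min, u_max)
    xs = np.zeros((len(us) + 1, 4))
    xs[0] = x0
    for i, u in enumerate(us):
        x, y, theta, alpha = xs[i]
        xs[i + 1, 0] = x + np.cos(theta) * dt
        xs[i + 1, 1] = y + np.sin(theta) * dt
        xs[i + 1, 2] = theta + (1 / wheelbase) * np.tan(u) * dt
        xs[i + 1, 3] = alpha - ((1 / trailer) * np.sin(alpha)
                                + (1 / wheelbase) * np.tan(u)) * dt
    return xs, us
```

Note that clamping after the fact means the optimizer never saw the bounds, which is consistent with the observation that the clamped trajectory no longer reaches the goal.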
Hi anassinator, great work and thanks for the contribution. I am wondering if this can handle problems with a one-step cost of L(X, u) = (X - X*)^T Q (X - X*) + (u - u*)^T R (u - u*), where X* and u* are my target trajectory? @anassinator
I have a reference trajectory that I want to follow, but I also want to reduce some input on my truck (let's say the steering of the vehicle, in order to smooth the original trajectory a little). The dynamics of my truck are non-linear, so at every time step I linearize them around my current trajectory and current controls. I have read that after linearization the state and control vectors are no longer x and u, but x* = x - x_linearized and u* = u - u_linearized. So my question is: if that is true, how can I adapt my cost function to follow the trajectory while reducing steering? Does something like this make sense:
J= np.dot(state_vector, state_vector)+np.dot(control_vector+current_control, control_vector+current_control)
Here current_control is the control obtained in the last iteration: since we linearize around it, u* + u_linearized recovers u, which is the quantity I actually want to penalize. Does that make sense?
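A per-step quadratic tracking cost of the form asked about above can be written directly against a reference pair; a minimal numpy sketch (the function name tracking_cost is mine, not the library's):

```python
import numpy as np

def tracking_cost(x, u, x_ref, u_ref, Q, R):
    """Quadratic tracking cost about a reference pair (x_ref, u_ref).
    Penalizing (u - u_ref) rather than u alone keeps the controls near
    the linearization point, which is the idea in the question."""
    dx = x - x_ref
    du = u - u_ref
    return float(dx @ Q @ dx + du @ R @ du)
```

Summing this over the horizon, with a per-index reference, gives the full trajectory-following objective.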
I am trying to create an MPC to follow a path, so the cost function is time-varying. I was reading through the code and you commented on the cost:
"If your cost or dynamics are time dependent, then you might need to shift their internal state accordingly."
I think I understand what you mean by this but I am unsure of how I would implement such a thing. Do you have any insight on how to do this?
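One common way to "shift the internal state" in a receding-horizon loop is to re-use the previous solution as a warm start, dropping the control that was just applied and padding the tail; a minimal sketch (the helper name and the repeat-last-control padding heuristic are my assumptions, not the library's API):

```python
import numpy as np

def shift_warm_start(us, pad="repeat"):
    """After applying the first control of an MPC solve, shift the
    control sequence one step forward and pad the tail, so the next
    solve starts from a nearby guess. Repeating the last control (or
    padding with zeros) is a common heuristic."""
    tail = us[-1:] if pad == "repeat" else np.zeros_like(us[-1:])
    return np.concatenate([us[1:], tail], axis=0)
```

The same shift applies to any time-indexed cost data: at each MPC step, the reference slice handed to the cost should start one step later along the path.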
I am working on a problem whose dynamics have a large delay. Can I use iLQR to solve it? How?
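A standard way to make a known, fixed input delay fit an iLQR-style solver is to augment the state with the last d controls so the delayed system becomes an ordinary Markov one; a minimal sketch for scalar controls (names and the scalar-control assumption are mine):

```python
import numpy as np

def delayed_step(z, u, f, d):
    """Handle a d-step input delay by augmenting the state with the
    last d controls: z = [x, u_{t-d}, ..., u_{t-1}]. The plant receives
    the oldest buffered control; the new control u enters the back of
    the buffer. f(x, u) is the undelayed discrete dynamics (placeholder),
    and controls are assumed scalar here for simplicity."""
    n = len(z) - d
    x, buf = z[:n], z[n:]
    x_next = f(x, buf[0])              # plant sees the d-step-old input
    buf_next = np.append(buf[1:], u)   # shift buffer, append new control
    return np.concatenate([x_next, buf_next])
```

The augmented dynamics are then a regular function of (z, u), so the solver needs no other changes; the price is a state dimension that grows with the delay length.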
Hi anassinator
First off, thanks a lot for a beautifully written library!
I have noticed that in the following line you multiply alpha with (k[i] + K[i]...), whereas in the paper (Eq. 12), as well as in a comment of yours in the lines above, it is mentioned that alpha should only multiply k[i].
Line 208 in 2184c48
Is this done on purpose?
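For reference, the line-searched update the question refers to can be sketched as follows (a minimal numpy version; per the cited Eq. 12, alpha scales only the feedforward term):

```python
import numpy as np

def control_update(u_hat, x, x_hat, k, K, alpha):
    """Line-searched iLQR control law about a nominal pair (x_hat, u_hat):
        u = u_hat + alpha * k + K (x - x_hat)
    Only the feedforward gain k is scaled by the line-search parameter
    alpha; the feedback term K (x - x_hat) corrects deviations from the
    nominal trajectory at full strength."""
    return u_hat + alpha * k + K @ (x - x_hat)
```

Scaling the feedback gain by alpha as well changes how strongly deviations from the nominal trajectory are corrected during the line search, which is the discrepancy being asked about.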
Hi!
I tried to run your CartPole Jupyter notebook example. However, I run into this problem:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-4-aa80e5b7db85> in <module>()
1 dt = 0.05
2 pole_length = 1.0
----> 3 dynamics = CartpoleDynamics(dt, l=pole_length)
6 frames
/usr/local/lib/python3.7/dist-packages/theano/compile/function/pfunc.py in _pfunc_param_to_in(param, strict, allow_downcast)
541 elif isinstance(param, In):
542 return param
--> 543 raise TypeError(f"Unknown parameter type: {type(param)}")
544
545
TypeError: Unknown parameter type: <class 'theano.tensor.var.TensorVariable'>
I ran it in Colab and installed your package with: !pip install -U git+https://github.com/anassinator/ilqr.git
I also tried force-installing Theano==1.0.4, but it didn't help.
Can you please help me figure this out?
Thanks,
Piotr
Hi,
Thanks a lot for the great repo! Clean implementation of iLQR and easy to understand, it's really useful.
I have a question though: would it be possible to constrain the values of the control input u returned by the optimization procedure? That is, to pick the optimal control trajectory subject to bounds on the outputted controls. I'm happy to implement it myself if you can point me to where/how it should be done, but I can't figure it out yet.
Let me know if you have any idea how to do that!
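One common way to impose bounds without a constrained solver is to optimize an unbounded variable and squash it through tanh before it enters the dynamics, so the plant only ever sees feasible controls; a minimal numpy sketch (the helper name squash is mine — the constrain helper used in an earlier snippet in this thread suggests the repo may already provide something similar):

```python
import numpy as np

def squash(u, u_min, u_max):
    """Map an unbounded optimization variable to [u_min, u_max] via
    tanh. The solver works in an unconstrained space; the dynamics are
    fed only squashed, feasible controls."""
    half_range = (u_max - u_min) / 2.0
    mid = (u_max + u_min) / 2.0
    return half_range * np.tanh(u) + mid
```

Unlike clamping the controls after fit(), baking the squashing into the dynamics lets the optimizer account for the bounds during the solve.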
Hello and thanks for the interesting package you've created!
I wanted to ask how I can use it for a dynamic model that depends explicitly on time, e.g. x_dot = (t - u) * x^2, where u is the input, t is time, and x is the state.
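One standard trick for explicitly time-dependent dynamics is to append time to the state so the system becomes autonomous; a minimal forward-Euler sketch of the example ODE above (the function name and the choice of integrator are my assumptions):

```python
import numpy as np

def augmented_step(z, u, dt):
    """Make time part of the state so the dynamics become autonomous:
    z = [x, t], with t advancing by dt each step. The example ODE
    x_dot = (t - u) * x**2 is integrated with one forward-Euler step."""
    x, t = z
    x_next = x + (t - u) * x**2 * dt
    return np.array([x_next, t + dt])
```

Since the augmented dynamics depend only on (z, u), any solver that assumes time-invariant dynamics can be applied unchanged.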
Moreover, can your package be used for Trajectory following?
I.e. minimize J = sum_{i=0}^{N} (x(i) - x_ref(i))^T Q (x(i) - x_ref(i)) + (u(i) - u_ref(i))^T R (u(i) - u_ref(i))
Thanks in advance!