
exploring_meta's Introduction

exploring_meta's People

Contributors

kostis-s-z

Stargazers

13 stargazers

Watchers

3 watchers

Forkers

laknath

exploring_meta's Issues

Fix entry point of scripts

Currently, the scripts do not run unless you launch the experiments through PyCharm and mark the repository root as a Sources Root (which puts it on sys.path).
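
A possible fix, sketched under the assumption that the entry-point scripts live one directory below the repository root: prepend the root to sys.path before any project imports.

    # Top of each entry-point script, before any project imports.
    import os
    import sys

    # Assumes this file sits one level below the repository root;
    # adjust the number of dirname() calls to match the real layout.
    PROJECT_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    if PROJECT_ROOT not in sys.path:
        sys.path.insert(0, PROJECT_ROOT)

Alternatively, running the scripts as modules from the repository root (python -m experiments.some_script, module name hypothetical) avoids modifying sys.path entirely.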

New Meta-World API breaks code

A quick fix is to pin a previous commit:

    pip install git+https://github.com/rlworkgroup/metaworld.git@58546ff25211883ca14d036b3516fe63382c6071#egg=metaworld
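
The same pin can also go into a requirements file so the environment stays reproducible (assuming the project uses pip requirements; the filename is the standard convention, not necessarily what the repo uses):

    # requirements.txt
    git+https://github.com/rlworkgroup/metaworld.git@58546ff25211883ca14d036b3516fe63382c6071#egg=metaworld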

MAML.adapt() problematic???

MAML-VPG and MAML-PPO do not seem to work well.

MAML-TRPO seems fine.

One difference between these implementations is that the first two use

    learner.adapt(loss)

whereas TRPO uses:

    # Compute the inner-loop gradients manually; second_order and anil
    # are booleans set elsewhere in the script.
    gradients = torch.autograd.grad(loss, learner.parameters(),
                                    retain_graph=second_order,
                                    create_graph=second_order,
                                    allow_unused=anil)

    # Apply a differentiable gradient step to the cloned module.
    learner = l2l.algorithms.maml.maml_update(learner, inner_lr, gradients)
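
For reference, learn2learn's MAML.adapt() performs essentially these same two steps internally; the sketch below is a paraphrase, not the library's verbatim code, so check the installed version's source for the exact defaults.

    # Paraphrased sketch of what learner.adapt(loss) does under the hood.
    import torch
    import learn2learn as l2l

    def adapt_sketch(learner, loss, lr, second_order=True, allow_unused=False):
        gradients = torch.autograd.grad(loss, learner.parameters(),
                                        retain_graph=second_order,
                                        create_graph=second_order,
                                        allow_unused=allow_unused)
        return l2l.algorithms.maml.maml_update(learner, lr, gradients)

If that paraphrase holds, the two call sites only match when adapt() is invoked with the same first_order/allow_unused settings as the manual TRPO path, so differing defaults are one place the VPG/PPO discrepancy could hide.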

Fix model building

Currently the model building is Python 3.7 dependent, since the model is assembled from a plain dict and relies on insertion order being preserved, which the language only guarantees from Python 3.7 onward; see the sketch below.
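
A minimal version-independent sketch, assuming the model is an nn.Sequential assembled from named layers (the names and sizes here are illustrative, not the repo's actual architecture):

    from collections import OrderedDict

    import torch.nn as nn

    # OrderedDict preserves insertion order on every supported Python
    # version, so layer order no longer depends on dict semantics.
    layers = OrderedDict()
    layers['linear1'] = nn.Linear(64, 128)
    layers['relu1'] = nn.ReLU()
    layers['linear2'] = nn.Linear(128, 4)

    model = nn.Sequential(layers)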

Possible bugs with PR #21

  1. weights[1:].add_(-1.0, dones[:-1])
    ->
    weights[1:].add_(dones[:-1], alpha=-1.0)

  2. p.data.add_(-stepsize, u.data)
    ->
    p.data.add_(u.data, alpha=-stepsize)

  (The deprecated two-argument add_(alpha, tensor) overload scales the tensor
  by alpha and accumulates in place, so the replacement must keep both the
  scaling and the accumulation; a plain reassignment would change the
  semantics. A sanity check follows at the end of this issue.)

        for train_episodes in train_replays:
            new_policy = fast_adapt_trpo_a2c(new_policy, train_episodes, baseline,
                                             fast_lr, gamma, tau, first_order=False, device=device)

->

        for train_episodes in train_replays:
            # Calculate the loss & fit the value function
            loss = trpo_a2c_loss(train_episodes, new_policy, baseline, params['gamma'], params['tau'], device)

            # Second-order derivatives (first_order=False)
            gradients = torch.autograd.grad(loss, new_policy.parameters(),
                                            retain_graph=True,
                                            create_graph=True)

            # Perform a MAML update of the policy parameters using the gradients above
            new_policy = l2l.algorithms.maml.maml_update(new_policy, params['inner_lr'], gradients)

  1. Remove the MAML module wrapper in maml_trpo and anil_trpo
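
A quick standalone check of the add_ rewrites above (plain PyTorch):

    import torch

    # Verify that add_(tensor, alpha=a) is an in-place x += a * y,
    # matching the deprecated add_(a, tensor) overload.
    x = torch.tensor([1.0, 2.0, 3.0])
    y = torch.tensor([0.0, 1.0, 1.0])

    expected = x + (-1.0) * y       # out-of-place reference
    x.add_(y, alpha=-1.0)           # modern in-place form

    assert torch.equal(x, expected)
    print(x)  # tensor([1., 1., 2.])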
