kostis-s-z / exploring_meta Goto Github PK

Experiments on Model-Agnostic Meta-Learning on Few-Shot Image Classification and Meta-RL (Meta-World)

License: MIT License

Python 100.00%

exploring_meta's Introduction

Hey there 🙃

Machine Learning Engineer with a Computer Science background, experience in Deep Learning research and building ML platforms. Concerns on privacy, sustainability, ethics 🦜

📚 Publications

🎯 Projects, Presentations, Blog Posts

🎓 M.Sc Thesis on Meta-Learning

exploring_meta's People

Contributors

Stargazers

Watchers

Forkers

laknath

exploring_meta's Issues

Fix not installing mujoco

MAML/ANIL - VPG is unstable. Move to its own branch

Fix entry point of scripts

Currently the scripts do not work unless you run the experiments through PyCharm and specify the root folder as Sources root

New Meta-World API breaks code

Quick fix is to install a previous build
pip install git+https://github.com/rlworkgroup/metaworld.git@58546ff25211883ca14d036b3516fe63382c6071#egg=metaworld

MAML.adapt() problematic???

MAML-VPG and MAML-PPO seem not to be working well.

MAML-TRPO seems fine.

A difference between these implementations is that the first two use

    learner.adapt(loss)

whereas TRPO uses:

    gradients = torch.autograd.grad(loss, learner.parameters(),
                                    retain_graph=second_order,
                                    create_graph=second_order,
                                    allow_unused=anil)


    learner = l2l.algorithms.maml.maml_update(learner, inner_lr, gradients)

Upgrade to latest L2L

Add download parameter to mini_imagenet

Maybe try Omniglot?
Is there a way to know its not just a hyperparameter tuning issue?
Make sure the "features" (the body of the network) does actually change (use CCA withing training)

Disable done signal and find another way to terminate episode in Runner

Complete unit testing for MAML & ANIL

Has been tested & works as expected

MAML-PPO
MAML-TRPO
ANIL-TRPO
ANIL-PPO

Make code available to repo to test.

weights[1:].add_(-1.0, dones[:-1])
->
weights[1:] = dones[:-1] - 1.0
p.data.add_(-stepsize, u.data)
->
p.data = u.data + (-stepsize)

        for train_episodes in train_replays:
            new_policy = fast_adapt_trpo_a2c(new_policy, train_episodes, baseline,
                                             fast_lr, gamma, tau, first_order=False, device=device)

        for train_episodes in train_replays:
            # Calculate loss & fit the value function
            loss = trpo_a2c_loss(train_episodes, new_policy, baseline, params['gamma'], params['tau'], device)

            # First or Second order derivatives
            gradients = torch.autograd.grad(loss, new_policy.parameters(),
                                            retain_graph=True,  # First order = False
                                            create_graph=True)

            # Perform a MAML update of all the parameters in the model variable using the gradients above
            new_policy = l2l.algorithms.maml.maml_update(new_policy, params['inner_lr'], gradients)

Remove MAML module wrapper to in maml_trpo and anil_trpo

Fix cherry-rl dependency

My fork of cherry-rl which enables saving success metrics is not compatible with Particles2D