

# Goal-Exploration in sparse reward environments

Daniela Stern-Gabsi

GitHub: dgabsi/Goal-Exploration in sparse reward environments

(Updates were made from danielaneuralx, which is my working GitHub account.)

Reinforcement learning has shown great success in training agents to operate independently in dense reward environments, but in sparse reward environments agents struggle because they receive too little feedback to guide improvement.

AMIGo is a novel algorithm, developed in 2021, in which two agents, a Teacher and a Student, learn to overcome the sparse reward problem through adversarial learning. The Teacher generates goals for the Student, which is rewarded for reaching them, while the Student's performance determines the Teacher's reward. Both are also rewarded by the environment. By acting adversarially, the Student learns to explore and succeed in the environment.
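
The reward scheme described above can be sketched roughly as follows. This is a minimal illustration based on my reading of the AMIGo paper, not the code in this repository: the function names, the `alpha` and `beta` values, and the treatment of an unreached goal as a Teacher penalty are assumptions made for the example.

```python
from typing import Optional


def student_reward(extrinsic: float, reached_goal: bool) -> float:
    """Student reward: environment reward plus +1 for reaching the Teacher's goal."""
    intrinsic = 1.0 if reached_goal else 0.0
    return extrinsic + intrinsic


def teacher_reward(
    steps_to_goal: Optional[int],
    threshold: int,
    alpha: float = 0.7,  # illustrative value (assumption)
    beta: float = 0.3,   # illustrative value (assumption)
) -> float:
    """Teacher reward: positive only for goals that are reachable but not too easy.

    steps_to_goal is None when the Student never reached the goal.
    """
    if steps_to_goal is None:
        # Assumption: an unreached goal is penalised like a too-easy one.
        return -beta
    if steps_to_goal >= threshold:
        return alpha
    return -beta


if __name__ == "__main__":
    print(teacher_reward(steps_to_goal=3, threshold=5))     # goal too easy
    print(teacher_reward(steps_to_goal=7, threshold=5))     # suitably hard goal
    print(teacher_reward(steps_to_goal=None, threshold=5))  # goal never reached
```

The Teacher is pushed towards goals that are neither trivial nor impossible, which is what drives the Student to explore further as it improves.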

This work proposes two improvements to the AMIGo framework. AMIGo-Concurrent aims to accelerate the learning process, and AMIGo-Multiple extends the framework to multiple goals. The results are tested on two challenging MiniGrid environments and show that while AMIGo-Concurrent did not outperform AMIGo, AMIGo-Multiple is superior in terms of speed of learning. This research hopes to contribute to more efficient learning in sparse reward environments.

AMIGo paper: Learning with AMIGo: Adversarially Motivated Intrinsic GOals (Campero et al., 2021) (https://arxiv.org/abs/2006.12122)

The code is based on the MiniGrid starter files and on the torch-ac framework, available at https://github.com/lcswillems/rl-starter-files and https://github.com/lcswillems/torch-ac. The framework was updated and changed to add AMIGo capabilities and to implement AMIGo-Concurrent and AMIGo-Multiple.

For reference, I used the original AMIGo implementation, available at: https://github.com/facebookresearch/adversarially-motivated-intrinsic-goals

To run experiments, please run the following:

For AMIGo experiment run:

  • For FourRooms run:
    • python3 train_goal.py --env MiniGrid-FourRooms-v0 --frames 5000000 --procs 40 --lr 0.0002 --lr-teacher 0.0004 --generator_threshold -0.27 --fix_seed
  • For DoorKey run:
    • python3 train_goal.py --env MiniGrid-DoorKey-8x8-v0 --frames 5000000 --procs 40 --lr 0.001 --lr-teacher 0.002 --generator_threshold -0.28 --with_amigo_nets

For AMIGo concurrent run:

  • For FourRooms run:
    • python3 train_goal.py --env MiniGrid-FourRooms-v0 --train_together --frames 5000000 --procs 40 --lr 0.0002 --lr-teacher 0.0004 --generator_threshold -0.27 --fix_seed
  • For DoorKey run:
    • python3 train_goal.py --env MiniGrid-DoorKey-8x8-v0 --train_together --frames 5000000 --procs 40 --lr 0.001 --lr-teacher 0.002 --generator_threshold -0.28 --with_amigo_nets

For AMIGo multiple run:

  • For FourRooms run:
    • python3 train_goal_multiple.py --env MiniGrid-FourRooms-v0 --frames 5000000 --procs 40 --lr 0.0002 --lr-teacher 0.0004 --fix_seed
  • For DoorKey run:
    • python3 train_goal_multiple.py --env MiniGrid-DoorKey-8x8-v0 --frames 5000000 --procs 40 --lr 0.001 --lr-teacher 0.002 --with_amigo_nets
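
The training commands above differ only in the script, the environment, and a few hyperparameters. As a convenience, here is a small sketch that assembles them from one table. It is not part of the repository: `build_command` and the variant names `"amigo"`, `"concurrent"`, and `"multiple"` are hypothetical helpers, while all flags and values are copied from the commands listed above.

```python
from typing import Dict, List

# Per-environment settings, copied from the training commands in this README.
ENV_SETTINGS: Dict[str, Dict[str, List[str]]] = {
    "MiniGrid-FourRooms-v0": {
        "lr": ["--lr", "0.0002", "--lr-teacher", "0.0004"],
        "threshold": ["--generator_threshold", "-0.27"],
        "extra": ["--fix_seed"],
    },
    "MiniGrid-DoorKey-8x8-v0": {
        "lr": ["--lr", "0.001", "--lr-teacher", "0.002"],
        "threshold": ["--generator_threshold", "-0.28"],
        "extra": ["--with_amigo_nets"],
    },
}


def build_command(variant: str, env: str) -> List[str]:
    """Return the argument list for one training run.

    variant is one of "amigo", "concurrent" or "multiple" (hypothetical names).
    """
    if variant not in ("amigo", "concurrent", "multiple"):
        raise ValueError(f"unknown variant: {variant}")
    settings = ENV_SETTINGS[env]
    script = "train_goal_multiple.py" if variant == "multiple" else "train_goal.py"
    cmd = ["python3", script, "--env", env]
    if variant == "concurrent":
        cmd.append("--train_together")
    cmd += ["--frames", "5000000", "--procs", "40"] + settings["lr"]
    if variant != "multiple":
        # The multiple-goal commands above do not pass a generator threshold.
        cmd += settings["threshold"]
    cmd += settings["extra"]
    return cmd


if __name__ == "__main__":
    for variant in ("amigo", "concurrent", "multiple"):
        for env in ENV_SETTINGS:
            print(" ".join(build_command(variant, env)))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the run.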

To evaluate AMIGo concurrent run:

  • For FourRooms run:
    • python3 evaluate_goal.py --env MiniGrid-FourRooms-v0 --model FOURROOMS_CONCURRENT
  • For DoorKey run:
    • python3 evaluate_goal.py --env MiniGrid-DoorKey-8x8-v0 --model DOORKEY_CONCURRENT --with_amigo_nets

To evaluate AMIGo multiple run:

  • For FourRooms run:
    • python3 evaluate_goal_multiple.py --env MiniGrid-FourRooms-v0 --model FOURROOMS_MULTIPLE
  • For DoorKey run:
    • python3 evaluate_goal_multiple.py --env MiniGrid-DoorKey-8x8-v0 --model DOORKEY_MULTIPLE --with_amigo_nets

For the PPO baseline (without AMIGo) run:

  • For FourRooms run:
    • python3 -m train --algo ppo --env MiniGrid-FourRooms-v0 --frames 5000000 --fix_seed
  • For DoorKey run:
    • python3 -m train --algo ppo --env MiniGrid-DoorKey-8x8-v0 --frames 5000000

Packages needed:

  • torch
  • numpy
  • pandas
  • gym-minigrid
  • tensorboardX
  • tensorboard
  • pickle (part of the Python standard library, no installation needed)
  • gym
