Continual Auxiliary Task Learning

This repo contains the code for reproducing the results in Continual Auxiliary Task Learning, published at NeurIPS 2021.

Authors

Abstract

Learning auxiliary tasks, such as multiple predictions about the world, can provide many benefits to reinforcement learning systems. A variety of off-policy learning algorithms have been developed to learn such predictions, but as yet there is little work on how to adapt the behavior to gather useful data for those off-policy predictions. In this work, we investigate a reinforcement learning system designed to learn a collection of auxiliary tasks, with a behavior policy learning to take actions to improve those auxiliary predictions. We highlight the inherent non-stationarity in this continual auxiliary task learning problem, for both prediction learners and the behavior learner. We develop an algorithm based on successor features that facilitates tracking under non-stationary rewards, and prove the separation into learning successor features and rewards provides convergence rate improvements. We conduct an in-depth study into the resulting multi-prediction learning system.

Detailed instructions:

To run the code as is, use the parallel/toml_parallel.jl file as the entry point. It parallelizes our sweeps and works with all the listed config files.

Downloading and installing Julia

Julia can be downloaded from https://julialang.org/downloads/. You can also find details on the language in the documentation. We only guarantee this code works for versions up to v1.5.x.

Reproducing results

  1. Install Julia version 1.6.x from https://julialang.org/downloads/
  2. Add Julia to your PATH
  3. cd to the curiosity directory
  4. Switch to the NeurIPS2021 branch
  5. Instantiate the project:
%> julia --project
julia> ]instantiate
  6. Run an experiment from its config:
julia --project parallel/toml_parallel.jl <<config_file.toml>>

This runs the experiment and places the results in the folder defined in the toml file. Running on a larger cluster is possible but requires additional setup; see Reproduce.jl for examples and details.
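
The steps above can also be chained into a small shell script. This is a minimal sketch and not part of the repository: the helper names (run_cmd, instantiate_project, run_config), the DRY_RUN switch, and any config path you pass in are hypothetical. It assumes julia is on the PATH and that the script is started from the curiosity directory on the NeurIPS2021 branch. The non-interactive Pkg.instantiate() call stands in for typing ]instantiate at the REPL.

```shell
#!/usr/bin/env bash
# Sketch: instantiate the project once, then run each config file given as an argument.
# Assumption: helper names and the DRY_RUN switch are illustrative, not from the repo.
set -eu

# Print the command when DRY_RUN=1, otherwise execute it.
run_cmd() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Non-interactive equivalent of `]instantiate` in the Julia REPL.
instantiate_project() {
    run_cmd julia --project -e 'using Pkg; Pkg.instantiate()'
}

# Run one experiment through the entry point named in this README.
run_config() {
    run_cmd julia --project parallel/toml_parallel.jl "$1"
}

# Only act when at least one config file was passed on the command line.
if [ "$#" -gt 0 ]; then
    instantiate_project
    for cfg in "$@"; do
        run_config "$cfg"
    done
fi
```

Saved under a name of your choosing (say, run_all.sh), the script can be invoked with DRY_RUN=1 and one or more config paths to print the commands without executing them, which is a cheap way to check the paths before launching a long sweep.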

Config Files Mapped to Experiments:

  • Figure 2:
  • Figure 3:
  • Figure 4:

Appendix empirical results:

  • Figure 5:
  • Figure 6:
  • Figure 7:
  • Figure 8:

Analyzing data with Reproduce.jl and ReproducePlotUtils.jl

You can see details on how to analyze data with Reproduce.jl and ReproducePlotUtils.jl in the plotting directory. The overall procedure is to:

  • Construct an ItemCollection from the experiment directory. This contains all the settings run in the experiment.
  • Search for the subset of settings you want using search.
  • Plot this subset.

There are various utilities to find the best parameter setting given a set of sweep arguments.

Acknowledgements
