This repository contains the code for reproducing the results published in Continual Auxiliary Task Learning. The paper was published at NeurIPS 2021.
- Matthew McLeod
- Chunlok Lo
- Andrew Jacobsen
- Matthew Schlegel
- Raksha Kumaraswamy
- Adam White
- Martha White
Learning auxiliary tasks, such as multiple predictions about the world, can provide many benefits to reinforcement learning systems. A variety of off-policy learning algorithms have been developed to learn such predictions, but as yet there is little work on how to adapt the behavior to gather useful data for those off-policy predictions. In this work, we investigate a reinforcement learning system designed to learn a collection of auxiliary tasks, with a behavior policy learning to take actions to improve those auxiliary predictions. We highlight the inherent non-stationarity in this continual auxiliary task learning problem, for both prediction learners and the behavior learner. We develop an algorithm based on successor features that facilitates tracking under non-stationary rewards, and prove the separation into learning successor features and rewards provides convergence rate improvements. We conduct an in-depth study into the resulting multi-prediction learning system.
The entry point for running the code is parallel/toml_parallel.jl. This script parallelizes our sweeps and works with all the listed config files.
Julia can be downloaded from https://julialang.org/downloads/, and details on the language can be found in its documentation. We only guarantee this code works for versions up to v1.5.x.
- Install Julia version 1.6.x from https://julialang.org/downloads/
- Add `julia` to your path
- `cd` to the `curiosity` directory
- Change branch to NeurIPS2021
- Instantiate the project:

```
%> julia --project
julia> ]instantiate
```
- To run an experiment from its config:

```
julia --project parallel/toml_parallel.jl <<config_file.toml>>
```
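As a rough sketch, a config might look something like the following. The key names here are assumptions based on typical Reproduce.jl-style configs, not copied from this repo; consult the repo's own config files for the real schema.

```toml
# Hypothetical sketch of an experiment config (key names are assumptions).
[config]
save_dir = "results/my_experiment"   # folder where results are placed

[sweep_args]
# parameter settings swept over in parallel
alpha = [0.1, 0.01]
seed = "1:5"
```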
This will run the experiment and place the results in the folder defined in the TOML file. Running on a larger cluster is possible, but requires more setup; see Reproduce.jl for an example and details.
- Figure 2:
- Figure 3:
- Figure 4:
- Figure 5:
- Figure 6:
- Figure 7:
- Figure 8:
Analyzing data with Reproduce.jl and ReproducePlotUtils.jl.
You can see details on how to analyze data with Reproduce.jl and ReproducePlotUtils.jl in the plotting directory. The overall procedure is to:
- Construct an `ItemCollection` from the experiment directory; this contains all the settings run in the experiment.
- Search for the subset of settings you want using `search`.
- Plot this subset.
There are various utilities to find the best parameter setting given a set of sweep arguments.
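The workflow above can be sketched as follows. This is a minimal, hedged sketch: the function names follow the Reproduce.jl API as described above, but exact signatures and the results path are assumptions that may differ between versions.

```julia
using Reproduce
using ReproducePlotUtils

# Load all settings run in the experiment from its results directory
# ("results/my_experiment" is a placeholder path).
ic = ItemCollection("results/my_experiment")

# Search for the subset of settings of interest
# (the filter keys here are hypothetical examples).
sub_ic = search(ic, Dict("alpha" => 0.1))

# ...then plot this subset using the utilities in the plotting directory.
```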