
leco's Introduction

KakaoBrain conf

LECO: Learnable Episodic Count (NeurIPS 2022)

This is the official implementation of LECO: Learnable Episodic Count for Task-Specific Intrinsic Reward [arxiv].

This repo extends the original Sample Factory by Aleksei Petrenko et al.

Requirements

  • torch==1.9.0
  • gym-minigrid==1.0.3
  • deepmind-lab @ file:///tmp/dmlab_pkg/deepmind_lab-1.0-py3-none-any.whl
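The issue reported further down this page was resolved by installing the pinned torch version, so it can help to confirm the environment before launching a run. The following is a minimal sketch, not part of the repo: `check_requirements` and `installed_version` are hypothetical helper names, and the check uses only the standard-library `importlib.metadata`. It assumes that a local build suffix such as `+cu111` on the torch version is acceptable, and it only checks that `deepmind-lab` is present, since that package is installed from a local wheel.

```python
from importlib import metadata

# Pins copied from the Requirements list. deepmind-lab comes from a local
# wheel, so only its presence is checked here (assumption), not its version.
PINNED = {"torch": "1.9.0", "gym-minigrid": "1.0.3", "deepmind-lab": None}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_requirements(pinned=PINNED, lookup=installed_version):
    """Return a list of problems; an empty list means every pin is satisfied."""
    problems = []
    for package, wanted in pinned.items():
        found = lookup(package)
        if found is None:
            problems.append(f"{package}: not installed")
        # Ignore a local build suffix such as "+cu111" when comparing.
        elif wanted is not None and found.split("+")[0] != wanted:
            problems.append(f"{package}: found {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    for line in check_requirements() or ["all pinned requirements satisfied"]:
        print(line)
```

Run it inside the environment you plan to train in; any line it prints other than the success message names a package to fix first.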

Training Scripts

Example command for DMLab task

Two nodes with 4 V100 GPUs on each node

python -m dist.launch --nnodes=2 --node_rank=0 --nproc_per_node=4 --master_addr=$MASTER_ADDR -m sample_factory.algorithms.appo.train_appo --cfg=lstm_dmlab_single_leco --train_dir=/your/train/directory --experiment=your_experiment_name
python -m dist.launch --nnodes=2 --node_rank=1 --nproc_per_node=4 --master_addr=$MASTER_ADDR -m sample_factory.algorithms.appo.train_appo --cfg=lstm_dmlab_single_leco --train_dir=/your/train/directory --experiment=your_experiment_name

Example command for MiniGrid task

Single node with 2 V100 GPUs

python -m dist.launch --nnodes=1 --node_rank=0 --nproc_per_node=2 --master_addr=$MASTER_ADDR -m sample_factory.algorithms.appo.train_appo --cfg=lstm_MiniGrid-ObstructedMaze-Full_leco --train_dir=/your/train/directory --experiment=your_experiment_name
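In the multi-node example, the two commands differ only in `--node_rank`. A small helper can generate one line per node so the remaining flags stay identical across machines. This is a hedged sketch: `launch_command` and `commands_for_all_nodes` are hypothetical names, not part of the repo, and the helper only reassembles the flags shown in the commands above. `$MASTER_ADDR` is left unquoted so the shell expands it when the line is run.

```python
def launch_command(node_rank, nnodes, nproc_per_node, cfg,
                   train_dir="/your/train/directory",
                   experiment="your_experiment_name",
                   master_addr="$MASTER_ADDR"):
    """Return the launch line for one node, using only flags shown in the README."""
    parts = [
        "python -m dist.launch",
        f"--nnodes={nnodes}",
        f"--node_rank={node_rank}",
        f"--nproc_per_node={nproc_per_node}",
        f"--master_addr={master_addr}",
        "-m sample_factory.algorithms.appo.train_appo",
        f"--cfg={cfg}",
        f"--train_dir={train_dir}",
        f"--experiment={experiment}",
    ]
    return " ".join(parts)


def commands_for_all_nodes(nnodes, nproc_per_node, cfg, **kwargs):
    """One command per node; only --node_rank differs between them."""
    return [launch_command(rank, nnodes, nproc_per_node, cfg, **kwargs)
            for rank in range(nnodes)]


if __name__ == "__main__":
    # Reproduces the two-node DMLab example: run line i on node i.
    for cmd in commands_for_all_nodes(2, 4, "lstm_dmlab_single_leco"):
        print(cmd)
```

Calling `commands_for_all_nodes(1, 2, "lstm_MiniGrid-ObstructedMaze-Full_leco")` yields the single-node MiniGrid command in the same way.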

Citation

@inproceedings{jo2022leco,
 author = {Jo, Daejin and Kim, Sungwoong and Nam, Daniel and Kwon, Taehwan and Rho, Seungeun and Kim, Jongmin and Lee, Donghoon},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {S. Koyejo and S. Mohamed and A. Agarwal and D. Belgrave and K. Cho and A. Oh},
 pages = {30432--30445},
 publisher = {Curran Associates, Inc.},
 title = {LECO: Learnable Episodic Count for Task-Specific Intrinsic Reward},
 url = {https://proceedings.neurips.cc/paper_files/paper/2022/file/c43b2989b1ba055aa713a4abbe4a8b05-Paper-Conference.pdf},
 volume = {35},
 year = {2022}
}

Contact

Daejin Jo, [email protected]
Daniel Wontae Nam, [email protected]

leco's People

Contributors

twidddj


leco's Issues

Follow up the code

Hi, this is a very interesting work. May I ask when the release of the code is scheduled? Thanks a lot.

Debug help

  1. Resolved with the right version of torch (1.9.0)

[Resolved]
Hi @twidddj, I was trying to reproduce the paper's results for the MiniGrid envs. I followed the instructions in the README, ran the command below, and got the error that follows.

python -m dist.launch --nnodes=1 --node_rank=0 --nproc_per_node=2 --master_addr=$MASTER_ADDR -m sample_factory.algorithms.appo.train_appo --cfg=lstm_MiniGrid-ObstructedMaze-Full_leco --train_dir=/your/train/directory --experiment=your_experiment_name

Kindly help debug the same!

Exception in thread Thread-6:
Traceback (most recent call last):
  File "/home/user/.conda/envs/py38/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/user/.conda/envs/py38/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/user/working/leco/sample_factory/algorithms/appo/learner.py", line 1965, in _train_loop
    self._process_training_data(data, wait_stats)
  File "/home/user/working/leco/sample_factory/algorithms/appo/learner.py", line 1890, in _process_training_data
    train_stats = self._train(buffer, batch_size, experience_size)
  File "/home/user/working/leco/sample_factory/algorithms/appo/learner.py", line 1209, in _train
    loss.backward(retain_graph=True)
  File "/home/user/.conda/envs/py38/lib/python3.8/site-packages/torch/_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "/home/user/.conda/envs/py38/lib/python3.8/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [768, 64, 6, 6]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

Thanks!
