inoryy / reaver
Reaver: Modular Deep Reinforcement Learning Framework. Focused on StarCraft II. Supports Gym, Atari, and MuJoCo.
License: MIT License
First, if I run `python -m reaver.run --env MoveToBeacon --agent a2c --n_envs 4 2> stderr.log`
I get: UnimplementedError (see above for traceback): Generic conv implementation only supports NHWC tensor format for now.
So I changed line 67 in run.py to `if not int(args.gpu)`.
After that, this problem seems to be solved, but I get another error whenever the game finishes loading: ValueError: Argument is out of range for 12/Attack_screen (3/queued [2]; 0/screen [0, 0]), got: [[1], [8, 40]]
The argument that is out of range is not the same each time. Is there something I overlooked? Thanks.
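For reference, the fix described above amounts to choosing the conv data format based on the device. A minimal sketch (the function name is hypothetical, not reaver's actual API):

```python
def conv_data_format(gpu_flag: int) -> str:
    """Pick the conv data format from the device.

    NCHW ('channels_first') is the fast path on GPU via cuDNN, but
    TF 1.x CPU conv/bias kernels only implement NHWC, which is why
    forcing NCHW on a CPU-only run raises UnimplementedError.
    """
    return "channels_first" if int(gpu_flag) else "channels_last"

print(conv_data_format(0))  # channels_last on CPU
```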
Most likely due to faulty masking of unavailable actions. Maybe clipping probs to 1e-6 is a bad idea?
Related to #7
Dear Inorry,
Thanks your sharing, I can learn a lot. Now I have trained four minigames and they are consistent with your results. But the other three minigames can not run. the error is id 1/id 17 unknown. I use gtx1080 ti , ubuntu16.04. I wonder if it has something to do with it?
Hey,
I wanted to ask about the calculation in the sample function:
return tf.argmax(tf.log(u) / probs, axis=1)
It divides by probs. Does that mean that lower probabilities have a better chance of being picked? Better exploration?
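To answer empirically: no, lower probabilities are picked less often. Since log(u) is negative for u ~ Uniform(0, 1), argmax(log(u)/p_i) equals argmin(-log(u)/p_i), and -log(u)/p_i is an Exponential(rate p_i) draw, so index i wins with probability p_i (an "exponential race", from the same family as the Gumbel-max trick). A NumPy sketch of the same computation:

```python
import numpy as np

rng = np.random.default_rng(0)
probs = np.array([0.1, 0.3, 0.6])

# Same computation as tf.argmax(tf.log(u) / probs, axis=1),
# vectorized over 200k trials.
u = rng.random((200_000, 3))
samples = np.argmax(np.log(u) / probs, axis=1)

freqs = np.bincount(samples, minlength=3) / len(samples)
print(freqs)  # approximately [0.1, 0.3, 0.6]
```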
The current default is essentially /dev/null, which is probably not the expected behavior for people trying to run reaver from inside their own codebase.
Hello, I'm trying to run the script with these flags, which essentially specify both screen and feature observations:
```python
parser = argparse.ArgumentParser()
parser.add_argument("--gpu", type=int, default=0)
parser.add_argument("--sz", type=int, default=32)
parser.add_argument("--feature_screen_size", type=int, default=84)
parser.add_argument("--feature_minimap_size", type=int, default=64)
# action space: features 1, rgb 2 (needed if both rgb and features are on)
parser.add_argument("--action_space", type=str, default='features')
parser.add_argument("--rgb_screen_size", type=str, default="120")
parser.add_argument("--rgb_minimap_size", type=str, default="64")
parser.add_argument("--envs", type=int, default=32)
parser.add_argument("--render", type=int, default=1)
parser.add_argument("--steps", type=int, default=16)
parser.add_argument("--updates", type=int, default=1000000)
parser.add_argument('--lr', type=float, default=7e-4)
parser.add_argument('--vf_coef', type=float, default=0.25)
parser.add_argument('--ent_coef', type=float, default=1e-3)
parser.add_argument('--discount', type=float, default=0.99)
parser.add_argument('--clip_grads', type=float, default=1.)
parser.add_argument("--run_id", type=int, default=-1)
parser.add_argument("--map", type=str, default='MoveToBeacon')
parser.add_argument("--cfg_path", type=str, default='config.json.dist')
parser.add_argument("--test", type=bool, nargs='?', const=True, default=False)
parser.add_argument("--restore", type=bool, nargs='?', const=True, default=False)
parser.add_argument('--save_replay', type=bool, nargs='?', const=True, default=False)
```
but I got this error:
```
    return [self._preprocess(obs, _type) for _type in ['screen', 'minimap'] + self.feats['non_spatial']]
  File "/home/dstefanidis/starcraft_codes/pysc2-rl-agent/common/config.py", line 106, in _preprocess
    spatial = [[ob[_type][f.index] for f in self._feats(_type)] for ob in obs]
  File "/home/dstefanidis/starcraft_codes/pysc2-rl-agent/common/config.py", line 106, in <listcomp>
    spatial = [[ob[_type][f.index] for f in self._feats(_type)] for ob in obs]
  File "/home/dstefanidis/starcraft_codes/pysc2-rl-agent/common/config.py", line 106, in <listcomp>
    spatial = [[ob[_type][f.index] for f in self._feats(_type)] for ob in obs]
KeyError: 'screen'
```
Hello,
Can you please elaborate further on how you recorded the full-graphics replay? I am currently using SC2 Linux version 4.1.2 and have been trying to watch the full-graphics replay on Windows by logging into Battle.net, but I keep failing, presumably because of the version difference. You briefly mentioned it in the README, but could you explain in a little more detail how you did it?
Thank you.
There's no os.fork() on Windows, so it seems that when I launch a new worker it re-creates the ProcEnv object, which no longer has access to the MultiProcEnv shared memory reference. Need to either rewrite how I pass the reference or temporarily implement message-based communication instead for Windows.
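A minimal sketch of the message-based alternative, using a multiprocessing.Pipe so the child constructs its own env instead of inheriting shared state. Names and the toy env here are hypothetical, not reaver's actual implementation:

```python
import multiprocessing as mp

def env_worker(conn, env_fn):
    """Worker loop: build the env inside the child, then serve
    (cmd, data) messages. Works under Windows 'spawn', where children
    cannot inherit fork-shared state."""
    env = env_fn()
    while True:
        cmd, data = conn.recv()
        if cmd == "reset":
            conn.send(env.reset())
        elif cmd == "step":
            conn.send(env.step(data))
        elif cmd == "close":
            conn.close()
            break

class ToyEnv:
    """Stand-in env for illustration."""
    def reset(self):
        self.t = 0
        return self.t
    def step(self, a):
        self.t += a
        return self.t

if __name__ == "__main__":
    parent, child = mp.Pipe()
    p = mp.Process(target=env_worker, args=(child, ToyEnv))
    p.start()
    parent.send(("reset", None)); print(parent.recv())  # 0
    parent.send(("step", 3)); print(parent.recv())      # 3
    parent.send(("close", None)); p.join()
```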
What does the parameter "--sz" mean in main.py?
Thank you for always kindly answering.
The computer suddenly stopped during training, so I tried to use the restore function to resume from the results so far. However, I see in the log file that loading no longer progresses and stops at one point.
In this case, which part is the problem?
From Dohyeong
I received some errors when running the code.
InvalidArgumentError (see above for traceback): CPU BiasOp only supports NHWC. [[Node: Conv/BiasAdd = BiasAdd[T=DT_FLOAT, data_format="NCHW", _device="/job:localhost/replica:0/task:0/cpu:0"](Conv/convolution, Conv/biases/read)]]
I want to use 2 GPUs, so I modified args.gpu to 2.
Hey, I created a conda env to test reaver, and when I tried the command for MoveToBeacon I hit the logger issue. I followed your hotfix, but I end up with weird errors depending on the agent I specify (A2C/PPO).
For PPO, for instance, I get this error:
```
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\tf\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "C:\ProgramData\Anaconda3\envs\tf\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "C:\ProgramData\Anaconda3\envs\tf\lib\site-packages\reaver\envs\base\msg_multiproc.py", line 48, in _run
    obs = self._env.reset()
  File "C:\ProgramData\Anaconda3\envs\tf\lib\site-packages\reaver\envs\sc2.py", line 73, in reset
    obs, reward, done = self.obs_wrapper(self._env.reset())
  File "C:\ProgramData\Anaconda3\envs\tf\lib\site-packages\reaver\envs\sc2.py", line 130, in __call__
    obs['feature_screen'][self.feature_masks['screen']],
  File "C:\ProgramData\Anaconda3\envs\tf\lib\site-packages\pysc2\lib\named_array.py", line 145, in __getitem__
    index = _get_index(obj, index)
  File "C:\ProgramData\Anaconda3\envs\tf\lib\site-packages\pysc2\lib\named_array.py", line 207, in _get_index
    "Can't index by type: %s; only int, string or slice" % type(index))
TypeError: Can't index by type: <class 'list'>; only int, string or slice
```
I love your work anyway and am looking forward to a fix, thanks.
Need to investigate whether clipping or scaling rewards improves performance.
Does it even make sense if I'm already clipping grads?
How will the agent know that one action is better than another if both get reward = 1?
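A sketch of what reward scaling (as opposed to clipping) could look like: dividing by a running standard deviation, so relative magnitudes survive, unlike clip-to-[-1, 1]. This is an illustration using Welford's online variance, not reaver's actual implementation:

```python
class RewardScaler:
    """Scale rewards by a running std (Welford's online algorithm)."""
    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def __call__(self, r):
        self.n += 1
        d = r - self.mean
        self.mean += d / self.n
        self.m2 += d * (r - self.mean)
        std = (self.m2 / self.n) ** 0.5 if self.n > 1 else 1.0
        return r / max(std, 1e-8)

scaler = RewardScaler()
print([round(scaler(r), 3) for r in [0.0, 2.0, 4.0]])
```

Unlike clipping, a reward of 4 still comes out larger than a reward of 2 here, which addresses the "both get reward = 1" concern.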
Hi there! Thanks for your great work!
But I've hit an unexpected problem; my environment is Windows 10.
My code is as follows:

```python
import reaver as rvr
from multiprocessing import Process

if __name__ == '__main__':
    p = Process()
    p.start()
    env = rvr.envs.SC2Env(map_name='MoveToBeacon')
    agent = rvr.agents.A2C(env.obs_spec(), env.act_spec(),
                           rvr.models.build_fully_conv, rvr.models.SC2MultiPolicy, n_envs=1)
    agent.run(env)
```
But I get the traceback below, and the Marine just won't move anywhere:
```
Process Process-2:
Traceback (most recent call last):
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Saber\Anaconda3\lib\site-packages\reaver\envs\base\multiproc.py", line 52, in _run
    obs = self._env.reset()
  File "C:\Users\Saber\Anaconda3\lib\site-packages\reaver\envs\sc2.py", line 69, in reset
    obs, reward, done = self.obs_wrapper(self._env.reset())
  File "C:\Users\Saber\Anaconda3\lib\site-packages\reaver\envs\sc2.py", line 126, in __call__
    obs['feature_screen'][self.feature_masks['screen']],
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\lib\named_array.py", line 145, in __getitem__
    index = _get_index(obj, index)
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\lib\named_array.py", line 207, in _get_index
    "Can't index by type: %s; only int, string or slice" % type(index))
TypeError: Can't index by type: <class 'list'>; only int, string or slice
```
And I also get stuck on 'CartPole-v0': nothing is shown even after waiting quite a while. My code is:

```python
import reaver as rvr
from multiprocessing import Process

if __name__ == '__main__':
    p = Process()
    p.start()
    env = rvr.envs.GymEnv('CartPole-v0')
    agent = rvr.agents.A2C(env.obs_spec(), env.act_spec())
    agent.run(env)
```
Any ideas about this? Thanks!
Hi, I just wanted to reproduce the reported results by downloading the zip files from the releases.
I ran into two issues:
Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint.
Thanks again!
Hi,
Thank you very much for the nice open-source project! After installation, I get a `tf.summary.FileWriter is not compatible with eager execution` error when I try `agent = rvr.agents.A2C(env.obs_spec(), env.act_spec(), rvr.models.build_fully_conv, rvr.models.SC2MultiPolicy, n_envs=4)`. I think this is a TensorFlow version issue. I wonder how you handled it!
Thanks
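One possible workaround sketch for this error, assuming a TF 2.x install: reaver targets TF 1.x graph mode, so eager execution can be disabled through the compat API before the agent is constructed. This is a generic TF workaround, not a documented reaver fix:

```python
import tensorflow.compat.v1 as tf

# Must run before any graphs, sessions, or summary writers are built.
tf.disable_eager_execution()
```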
First, I followed the install instructions to install both reaver and pysc2 from source.
The error has changed: I realized I did not have TensorFlow Probability installed. However, I'm still receiving an error.
When running `import reaver` I receive the following error:
```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/hf/.local/lib/python3.6/site-packages/reaver/__init__.py", line 1, in <module>
    import reaver.envs
  File "/home/hf/.local/lib/python3.6/site-packages/reaver/envs/__init__.py", line 6, in <module>
    from .gym import GymEnv
  File "/home/hf/.local/lib/python3.6/site-packages/reaver/envs/gym.py", line 3, in <module>
    from reaver.envs.atari import AtariPreprocessing
  File "/home/hf/.local/lib/python3.6/site-packages/reaver/envs/atari.py", line 29, in <module>
    import gin.tf
  File "/home/hf/.local/lib/python3.6/site-packages/gin/tf/__init__.py", line 20, in <module>
    from gin.tf.utils import GinConfigSaverHook
  File "/home/hf/.local/lib/python3.6/site-packages/gin/tf/utils.py", line 34, in <module>
    config.register_file_reader(tf.io.gfile.GFile, tf.io.gfile.exists)
AttributeError: module 'tensorflow._api.v1.io' has no attribute 'gfile'
```
====================
OLD ERROR:
When running `import reaver` I receive the following error:

```
  File "<stdin>", line 1, in <module>
  File "/home/hf/.local/lib/python3.6/site-packages/reaver/__init__.py", line 1, in <module>
    import reaver.envs
  File "/home/hf/.local/lib/python3.6/site-packages/reaver/envs/__init__.py", line 2, in <module>
    from .sc2 import SC2Env
  File "/home/hf/.local/lib/python3.6/site-packages/reaver/envs/sc2.py", line 5, in <module>
    from pysc2.lib import actions
ModuleNotFoundError: No module named 'pysc2.lib'
```
However I have no problems with 'import pysc2'
Need to try adding a max-pooling layer to the model.
Intuitively the agent might benefit from spatial translation invariance on some maps like DefeatRoaches.
Why doesn't DeepMind use it?
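For intuition: 2x2 max-pooling keeps only the strongest activation per block, so a feature's exact pixel position matters less — that is the translation-invariance argument. A plain-NumPy sketch of the operation (not the model code):

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max-pool over an NHWC tensor; H and W must be even."""
    n, h, w, c = x.shape
    return x.reshape(n, h // 2, 2, w // 2, 2, c).max(axis=(2, 4))

x = np.arange(16, dtype=np.float32).reshape(1, 4, 4, 1)
print(max_pool_2x2(x)[0, :, :, 0])  # [[ 5.  7.] [13. 15.]]
```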
Thank you for the great code. When I tried new maps, I found a problem in runner.py. When there is more than one env, one env finishes before the others and then restarts its game. By the time all envs are done, the calculated reward covers many episodes, which gives a much bigger number. If you understand what I am describing, please tell me whether this is a real problem.
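The accounting issue described above — summing rewards across auto-restarted episodes — can be avoided with per-env bookkeeping: accumulate each env's reward separately and record it the moment that env's episode ends. A hypothetical sketch, not the actual runner.py logic:

```python
import numpy as np

class EpisodeTracker:
    """Track per-env episode returns across vectorized envs."""
    def __init__(self, n_envs):
        self.running = np.zeros(n_envs)
        self.finished = []  # one entry per completed episode

    def update(self, rewards, dones):
        self.running += rewards
        for i, done in enumerate(dones):
            if done:
                self.finished.append(self.running[i])
                self.running[i] = 0.0  # env auto-restarts a fresh episode
```

With this, averaging over `finished` gives a per-episode mean regardless of how many times each env restarted.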
Thank you for the great release. I am trying to train an agent on CollectMineralShards but cannot reproduce the reported performance. I made several attempts but only reach reward = 75 at 100k steps. Are there any config parameters I should change? Thanks~
Hi, inoryy.
How did you set your random seed, as shown in your learning curves?
When I run the demo code shown on the readme page, an error occurs as below:
RuntimeError: v1.summary.FileWriter is not compatible with eager execution. Use tf.summary.create_file_writer, or a `with v1.Graph().as_default():` context.
env.yml for conda

Hello, thank you for sharing good code.
I am trying to solve the DefeatRoaches minigame using a Relational Network.
I found an example of a Transformer for MNIST classification and modified the fully_conv.py file based on it. Unlike the original code, I only use the screen feature, without the minimap feature. But the result is still not good.
Could you give me a recommendation on how to modify it to reach DeepMind's performance?
Thank you.
From Dohyeong
Implementation code : https://github.com/kimbring2/pysc2_transformer/blob/master/graph_network.py
Seems using a single, separate variable for the (log?) standard deviation is more popular than making it part of the network, e.g. (Schulman et al., 2015). Should probably use that approach instead of the current implementation, at least while comparing algorithms against baselines.
Can't use tf.get_variable() though; this goes away in 2.0.
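A sketch of the state-independent variant: the network outputs only the mean, while log_std is a single free parameter shared across states. NumPy illustration under that assumption, not the current reaver code:

```python
import numpy as np

class GaussianHead:
    """Diagonal Gaussian policy head with state-independent log std."""
    def __init__(self, act_dim):
        # A single trainable vector, NOT produced by the network.
        self.log_std = np.zeros(act_dim)

    def sample(self, mean, rng):
        return mean + np.exp(self.log_std) * rng.standard_normal(mean.shape)

head = GaussianHead(act_dim=2)
rng = np.random.default_rng(0)
a = head.sample(np.zeros((4, 2)), rng)
print(a.shape)  # (4, 2)
```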
Need to update the code for PySC2 v2.0. Maybe TensorFlow too?
Related to #6
After carefully inspecting the code, I don't see a way to specify the replay directory or flag. However, in previous versions it seems this functionality was included.
I have received some errors when running the code. But I don't know why this happens.
```
Process Process-1:
Traceback (most recent call last):
  File "C:\Anaconda3\lib\multiprocessing\process.py", line 252, in _bootstrap
    self.run()
  File "C:\Anaconda3\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "E:\liupenghui\pysc2-rl-agent-master\common\env.py", line 22, in worker
    env = env_fn_wrapper.x()
  File "E:\liupenghui\pysc2-rl-agent-master\common\env.py", line 14, in _thunk
    env = sc2_env.SC2Env(**params)
  File "C:\Anaconda3\lib\site-packages\pysc2\env\sc2_env.py", line 132, in __init__
    self._setup((agent_race, bot_race, difficulty), **kwargs)
  File "C:\Anaconda3\lib\site-packages\pysc2\env\sc2_env.py", line 173, in _setup
    self.run_config = run_configs.get()
  File "C:\Anaconda3\lib\site-packages\pysc2\run_configs\__init__.py", line 38, in get
    if FLAGS.sc2_run_config is None:  # Find the highest priority as default.
  File "C:\Anaconda3\lib\site-packages\absl\flags\_flagvalues.py", line 488, in __getattr__
    raise _exceptions.UnparsedFlagAccessError(error_message)
absl.flags._exceptions.UnparsedFlagAccessError: Trying to access flag --sc2_run_config before flags were parsed.
```
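The usual workaround sketch for this UnparsedFlagAccessError: pysc2 reads absl flags when the env is created, so parse them first when driving it from your own script. This is generic absl usage, not a reaver-specific fix:

```python
import sys
from absl import flags

# Parse (possibly empty) argv before calling sc2_env.SC2Env(...).
flags.FLAGS(sys.argv)
```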
Hi @inoryy, in addition to the results on these minigames, I notice there aren't any results on BuildMarines. May I ask if there is an update or a planned follow-up?
By the way, awesome repo!
At the last step, it shows:

ERROR: ld.so: object '/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.

My code:

```python
import reaver as rvr

env = rvr.envs.SC2Env(map_name='MoveToBeacon')
agent = rvr.agents.A2C(env.obs_spec(), env.act_spec(), rvr.models.build_fully_conv, rvr.models.SC2MultiPolicy, n_envs=1)
agent.run(env)
```

1st error: ../pysc2/lib/features.py:737: FutureWarning: arrays to stack must be passed as a "sequence" type such as list or tuple. Support for non-sequence iterables such as generators is deprecated as of NumPy 1.16 and will raise an error in the future.

2nd error:

```
.../pysc2/lib/named_array.py", line 208, in _get_index
    "Can't index by type: %s; only int, string or slice" % type(index))
```
I received some errors when running the code:
OOM when allocating tensor with shape [512, 1850, 32, 32]
So I would like to ask how much memory is needed to run this code?
Thank you.
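For scale, the failing tensor alone is already several GiB at float32 (4 bytes per element), before counting activations, gradients, and optimizer state:

```python
# [512, 1850, 32, 32] tensor at float32
elems = 512 * 1850 * 32 * 32
gib = elems * 4 / 2**30
print(round(gib, 2))  # → 3.61 GiB for this single tensor
```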
Hi, thank you for the great reaver.
I tested run.py with --env MoveToBeacon --agent ppo --n_envs 1 on macOS without a GPU, but get the following error:
```
Traceback (most recent call last):
  File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/xx/sc2/venv/lib/python3.8/site-packages/reaver/envs/base/shm_multiproc.py", line 48, in _run
    obs, rew, done = self._env.step(data)
  File "/Users/xx/sc2/venv/lib/python3.8/site-packages/reaver/envs/sc2.py", line 87, in step
    obs, reward, done = self.obs_wrapper(self._env.step(self.act_wrapper(action)))
  File "/Users/xx/sc2/venv/lib/python3.8/site-packages/pysc2/lib/stopwatch.py", line 212, in _stopwatch
    return func(*args, **kwargs)
  File "/Users/xx/sc2/venv/lib/python3.8/site-packages/pysc2/env/sc2_env.py", line 537, in step
    actions = [[f.transform_action(o.observation, a, skip_available=skip)
  File "/Users/xx/sc2/venv/lib/python3.8/site-packages/pysc2/env/sc2_env.py", line 537, in <listcomp>
    actions = [[f.transform_action(o.observation, a, skip_available=skip)
  File "/Users/xx/sc2/venv/lib/python3.8/site-packages/pysc2/env/sc2_env.py", line 537, in <listcomp>
    actions = [[f.transform_action(o.observation, a, skip_available=skip)
  File "/Users/xx/sc2/venv/lib/python3.8/site-packages/pysc2/lib/stopwatch.py", line 212, in _stopwatch
    return func(*args, **kwargs)
  File "/Users/xx/sc2/venv/lib/python3.8/site-packages/pysc2/lib/features.py", line 1608, in transform_action
    raise ValueError("Function %s/%s is currently not available" % (
ValueError: Function 331/Move_screen is currently not available
```
The action id in the error is not the same every time.
Then I ran `python3 -m pysc2.bin.agent --map MoveToBeacon`, with this result:
I1023 10:11:01.933034 4632368576 sc2_env.py:506] Starting episode 1: [terran] on MoveToBeacon
0/no_op ()
1/move_camera (1/minimap [64, 64])
2/select_point (6/select_point_act [4]; 0/screen [84, 84])
3/select_rect (7/select_add [2]; 0/screen [84, 84]; 2/screen2 [84, 84])
4/select_control_group (4/control_group_act [5]; 5/control_group_id [10])
7/select_army (7/select_add [2])
453/Stop_quick (3/queued [2])
451/Smart_screen (3/queued [2]; 0/screen [84, 84])
452/Smart_minimap (3/queued [2]; 1/minimap [64, 64])
331/Move_screen (3/queued [2]; 0/screen [84, 84])
332/Move_minimap (3/queued [2]; 1/minimap [64, 64])
333/Patrol_screen (3/queued [2]; 0/screen [84, 84])
334/Patrol_minimap (3/queued [2]; 1/minimap [64, 64])
12/Attack_screen (3/queued [2]; 0/screen [84, 84])
13/Attack_minimap (3/queued [2]; 1/minimap [64, 64])
274/HoldPosition_quick (3/queued [2])
My pysc2 version is 3.0.0 and my reaver version is 2.1.9.
I set ensure_available_actions=False and it works, but I don't think it's a good idea.
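A sketch of what proper masking could look like instead of skipping the availability check: zero out probabilities of actions pysc2 reports as unavailable and renormalize before sampling. Purely illustrative NumPy, not reaver's code:

```python
import numpy as np

def masked_sample(probs, available_ids, rng):
    """Sample an action id, restricted to the currently available set."""
    mask = np.zeros_like(probs)
    mask[available_ids] = 1.0
    masked = probs * mask
    masked = masked / masked.sum()  # assumes at least one action is available
    return rng.choice(len(probs), p=masked)

rng = np.random.default_rng(0)
probs = np.array([0.4, 0.1, 0.3, 0.2])
picks = {masked_sample(probs, [0, 2], rng) for _ in range(100)}
print(picks)  # only available ids (0 and 2) are ever sampled
```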
Hi @inoryy,
I am trying to figure out how to use the plot.py file in your utils folder; I want more data on the experiment I am running.
Sorry if this is basic, but could you explain how to use this util file?
Do I call it from the command line or from a Python file?
Hello,
I am trying to apply Relational Network in DefeatRoaches environment using the code you uploaded.
The size of the screen feature is 16; is that too small, hurting performance?
I would like to know whether the performance graph shown on the web page was produced with this screen feature size.
From Dohyeong Kim
Hello!
I am new to Reinforcement Learning and really wanted to implement a model that can play the game by itself, and I found your awesome project!
Is there a way to make it play itself, and to speed it up?