devsisters / dqn-tensorflow
Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning
License: MIT License
or how did you plot all of your figures
I ran this code on the CPU and this error occurred:
'TimeLimit' object has no attribute 'ale'. Can anyone show me how to solve this?
Thank you!
Traceback (most recent call last):
File "main.py", line 69, in
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "main.py", line 56, in main
raise Exception("use_gpu flag is true when no GPUs are available")
Exception: use_gpu flag is true when no GPUs are available
The above is the error message.
Could you help me out??
The function clipped_error is not written correctly. Indeed, the try and except parts are the same!
Currently, it is written as:
def clipped_error(x):
    try:
        return tf.select(tf.abs(x) < 1.0, 0.5 * tf.square(x), tf.abs(x) - 0.5)
    except:
        return tf.where(tf.abs(x) < 1.0, 0.5 * tf.square(x), tf.abs(x) - 0.5)
The function should be
def clipped_error(x):
    return tf.where(tf.abs(x) < 1.0, 0.5 * tf.square(x), tf.abs(x) - 0.5)
For a run on the order of 100M iterations, how much RAM did you need? A 16GB EC2 GPU instance with a 2GB GPU apparently does not have enough memory and locks up.
Hi. I am trying to understand the code and I came across what I think is a bug in:
Line 32 in c7b1f10
self.env.new_random_game()
and afterwards the history is filled with the new random state via self.history.add(screen), which is needed because the agent always chooses its actions taking that history as input via action = self.predict(self.history.get()).
When a terminal state is reached, a new random game is created, but this time the new random state is not added to the history. As a result, the agent uses the terminal state of the last episode to decide which action to take in the first state of the new episode, which I think is wrong.
A way to fix it would be to add
for _ in range(self.history_length):
    self.history.add(screen)
after this line.
I don't know if fixing this would have any positive impact on performance, since it only affects the first self.history_length steps of each episode, but I wanted to share it anyway.
Thanks in advance.
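The proposed fix can be sketched in isolation. This is only an illustration, not the repository's code: the History class and the reset_history helper are hypothetical stand-ins, and frames are represented by plain strings.

```python
from collections import deque


class History:
    """Hypothetical stand-in for the agent's frame history."""

    def __init__(self, history_length):
        # A bounded deque drops the oldest frame when a new one is added.
        self.frames = deque(maxlen=history_length)

    def add(self, screen):
        self.frames.append(screen)

    def get(self):
        return list(self.frames)


def reset_history(history, screen, history_length):
    # Overwrite every slot with the first screen of the new episode,
    # so no frame of the finished episode is left in the buffer.
    for _ in range(history_length):
        history.add(screen)


history = History(history_length=4)
for frame in ["old-1", "old-2", "old-3", "old-terminal"]:
    history.add(frame)

# A terminal state was reached: start a new game and refill the history.
reset_history(history, "new-0", history_length=4)
print(history.get())
```

Without the reset, the first prediction of the new episode would still see "old-terminal" and the frames before it.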
In dqn/agent.py, line 59:
if terminal:
    screen, reward, action, terminal = self.env.new_random_game()
a new game is started because a terminal state was reached.
Why don't we need to reset self.history here?
Without a reset, the stale history affects the next iteration:
# 1. predict
action = self.predict(self.history.get())
# 2. act
screen, reward, terminal = self.env.act(action, is_training=True)
# 3. observe
self.observe(screen, reward, action, terminal)
The action predicted from self.history.get() does not depend on the current game's screens; it is predicted from the screens of the previous game, which has already ended.
Am I missing anything?
Thank you very much.
Hi, can you share a configuration that reproduces the results shown in the figure?
I ran the default M1 configuration and only get an average episodic reward of around 3.
I tried changing the configuration, such as setting action_repeat = 4, changing learning_rate, and adding double_q and duel_q, but it does not change much.
Many thanks!
"Segmentation fault (core dumped)" while trying to run it.
I have no GPU configured with TensorFlow, and I suspect that's the reason. Is there any way to make it work with just the CPU?
Tried a couple of flags, but they didn't work.
python main.py --env_name=Breakout-v0 --is_train=True --display=True --cpu=True
Hi!
First of all, thank you very much for sharing this!
I set up everything needed to run it.
I was able to train a model for Breakout
(python main.py --env_name=Breakout-v0 --is_train=True --display=True)
However, when I tried to run it in test mode and record
(python main.py --is_train=False --display=True)
I got this error:
InvalidArgumentError (see above for traceback): CPU BiasOp only supports NHWC.
[[Node: prediction/l1/BiasAdd = BiasAdd[T=DT_FLOAT, data_format="NCHW", _device="/job:localhost/replica:0/task:0/cpu:0"](prediction/l1/Conv2D, prediction/l1/biases/read)]]
Could you please help me out?
Is it possible to download the trained network and use it, to see how it plays the game?
This is the log of the events. What should I do? Thanks!
iMac:DQN-tensorflow shyamalsuhanachandra$ python main.py --env_name=Alien-v0 --is_train=True --display=True
[*] GPU : 1.0000
[2016-09-27 17:28:27,334] Making new env: Alien-v0
{'_save_step': 500000,
'_test_step': 50000,
'action_repeat': 4,
'backend': 'tf',
'batch_size': 32,
'cnn_format': 'NCHW',
'discount': 0.99,
'display': True,
'double_q': False,
'dueling': False,
'env_name': 'Alien-v0',
'env_type': 'detail',
'ep_end': 0.1,
'ep_end_t': 1000000,
'ep_start': 1.0,
'history_length': 4,
'learn_start': 50000.0,
'learning_rate': 0.00025,
'learning_rate_decay': 0.96,
'learning_rate_decay_step': 50000,
'learning_rate_minimum': 0.00025,
'max_delta': 1,
'max_reward': 1.0,
'max_step': 50000000,
'memory_size': 1000000,
'min_delta': -1,
'min_reward': -1.0,
'model': 'm1',
'random_start': 30,
'scale': 10000,
'screen_height': 84,
'screen_width': 84,
'target_q_update_step': 10000,
'train_frequency': 4}
[*] Loading checkpoints...
[!] Load FAILED: checkpoints/Alien-v0/min_delta--1/max_delta-1/history_length-4/train_frequency-4/target_q_update_step-10000/double_q-False/memory_size-1000000/action_repeat-4/ep_end_t-1000000/dueling-False/min_reward--1.0/backend-tf/random_start-30/scale-10000/env_type-detail/learning_rate_decay_step-50000/ep_start-1.0/screen_width-84/learn_start-50000.0/cnn_format-NCHW/learning_rate-0.00025/batch_size-32/discount-0.99/max_step-50000000/max_reward-1.0/learning_rate_decay-0.96/learning_rate_minimum-0.00025/env_name-Alien-v0/ep_end-0.1/model-m1/screen_height-84/
2016-09-27 17:28:28.996 Python[26135:3913383] ApplePersistenceIgnoreState: Existing state will not be touched. New state will be written to /var/folders/m1/b_t9_2151y30ryvtr_2gznch0000gp/T/org.python.python.savedState
0%| | 49999/50000000 [14:17<235:26:28, 58.93it/s]E tensorflow/core/common_runtime/executor.cc:334] Executor failed to create kernel. Invalid argument: CPU BiasOp only supports NHWC.
[[Node: target/target_l1/BiasAdd = BiasAdd[T=DT_FLOAT, data_format="NCHW", _device="/job:localhost/replica:0/task:0/cpu:0"](target/target_l1/Conv2D, target/target_l1/biases/read)]]
Traceback (most recent call last):
File "main.py", line 66, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv))
File "main.py", line 61, in main
agent.train()
File "/Users/shyamalsuhanachandra/DQN-tensorflow/dqn/agent.py", line 56, in train
self.observe(screen, reward, action, terminal)
File "/Users/shyamalsuhanachandra/DQN-tensorflow/dqn/agent.py", line 135, in observe
self.q_learning_mini_batch()
File "/Users/shyamalsuhanachandra/DQN-tensorflow/dqn/agent.py", line 157, in q_learning_mini_batch
q_t_plus_1 = self.target_q.eval({self.target_s_t: s_t_plus_1})
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 559, in eval
return _eval_using_default_session(self, feed_dict, self.graph, session)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3656, in _eval_using_default_session
return session.run(tensors, feed_dict)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 710, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 908, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 958, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 978, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: CPU BiasOp only supports NHWC.
[[Node: target/target_l1/BiasAdd = BiasAdd[T=DT_FLOAT, data_format="NCHW", _device="/job:localhost/replica:0/task:0/cpu:0"](target/target_l1/Conv2D, target/target_l1/biases/read)]]
Caused by op u'target/target_l1/BiasAdd', defined at:
File "main.py", line 66, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv))
File "main.py", line 58, in main
agent = Agent(config, env, sess)
File "/Users/shyamalsuhanachandra/DQN-tensorflow/dqn/agent.py", line 29, in __init__
self.build_dqn()
File "/Users/shyamalsuhanachandra/DQN-tensorflow/dqn/agent.py", line 240, in build_dqn
32, [8, 8], [4, 4], initializer, activation_fn, self.cnn_format, name='target_l1')
File "/Users/shyamalsuhanachandra/DQN-tensorflow/dqn/ops.py", line 25, in conv2d
out = tf.nn.bias_add(conv, b, data_format)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 391, in bias_add
return gen_nn_ops._bias_add(value, bias, data_format=data_format, name=name)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 279, in _bias_add
data_format=data_format, name=name)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 703, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2317, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1239, in __init__
self._traceback = _extract_stack()
When I clone the repository, I get a message saying that the cloning succeeded but that the checkout failed:
$ git clone https://github.com/devsisters/DQN-tensorflow.git
Cloning into 'DQN-tensorflow'...
remote: Counting objects: 717, done.
remote: Total 717 (delta 0), reused 0 (delta 0), pack-reused 717
Receiving objects: 100% (717/717), 29.57 MiB | 1.26 MiB/s, done.
Resolving deltas: 100% (446/446), done.
fatal: cannot create directory at 'checkpoints/Breakout-v0/min_delta--1/max_delta-1/history_length-4/train_frequency-4/target_q_update_step-10000/memory_size-1000000/action_repeat-4/ep_end_t-1000000/backend-tf/random_start-30/scale-10000/env_type-simple/min_reward--1.0': Filename too long
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry the checkout with 'git checkout -f HEAD'
Apparently the path length becomes too long. In Windows, there is by default a maximum path length of 260 characters. It should be possible to turn this limitation off, but doing so doesn't seem to work for me (I even noticed that LongPathsEnabled was already 1 from earlier).
Perhaps not surprisingly, git checkout -f HEAD doesn't work either, and results in a very similar error message:
$ cd DQN-tensorflow/
$ git checkout -f HEAD
fatal: cannot create directory at 'checkpoints/Breakout-v0/min_delta--1/max_delta-1/history_length-4/train_frequency-4/target_q_update_step-10000/memory_size-1000000/action_repeat-4/ep_end_t-1000000/backend-tf/random_start-30/scale-10000': Filename too long
When I run the program with Python 3.6, I get this error:
Traceback (most recent call last):
File "main.py", line 70, in
tf.app.run()
File "/home/tanggy/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "main.py", line 62, in main
agent = Agent(config, env, sess)
File "/home/tanggy/Downloads/DQN-tensorflow-master/dqn/agent.py", line 30, in init
self.build_dqn()
File "/home/tanggy/Downloads/DQN-tensorflow-master/dqn/agent.py", line 328, in build_dqn
self._saver = tf.train.Saver(self.w.values() + [self.step_op], max_to_keep=30)
TypeError: unsupported operand type(s) for +: 'dict_values' and 'list'
Code requires:
from functools import reduce #py2.7 reduce > py3.x functools.reduce
s/xrange/range #py2.7 xrange > py3.x range
Issue after those are fixed:
Traceback (most recent call last):
File "main.py", line 70, in <module>
tf.app.run()
File "/usr/local/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "main.py", line 62, in main
agent = Agent(config, env, sess)
File "/Users/emmanuel.mwangi/open_source/laml/DQN-tensorflow/dqn/agent.py", line 31, in __init__
self.build_dqn()
File "/Users/emmanuel.mwangi/open_source/laml/DQN-tensorflow/dqn/agent.py", line 329, in build_dqn
self._saver = tf.train.Saver(self.w.values() + [self.step_op], max_to_keep=30)
TypeError: unsupported operand type(s) for +: 'dict_values' and 'list'
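One possible way past this TypeError, assuming self.w is an ordinary dict: in Python 3, dict.values() returns a view object rather than a list, so it has to be converted before it can be concatenated with a list. In the sketch below, weights and step_op are placeholders for self.w and self.step_op.

```python
weights = {"l1_w": 1.0, "l1_b": 2.0}  # placeholder for self.w
step_op = "step"                      # placeholder for self.step_op

# Python 3: a dict view cannot be added to a list.
try:
    weights.values() + [step_op]
except TypeError as err:
    print("TypeError:", err)

# Converting the view to a list first works on both Python 2 and 3.
var_list = list(weights.values()) + [step_op]
print(var_list)
```

Applied to the failing line, that would presumably read tf.train.Saver(list(self.w.values()) + [self.step_op], max_to_keep=30).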
This suggests that the maximum of the minimum learning rate and the exponentially decayed rate is used. But in the configuration file, the learning rate and the minimum learning rate are given the same value. As a result, the learning rate never changes as training proceeds.
Or is this intended specifically for the case with no learning rate updates?
Thanks.
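The effect can be checked with plain arithmetic. The sketch below assumes the rate is computed as max(minimum, initial * decay ** (step / decay_step)), as the issue describes; the function name is made up, and the default values are copied from the configuration dumps on this page.

```python
def decayed_learning_rate(step,
                          learning_rate=0.00025,
                          learning_rate_minimum=0.00025,
                          learning_rate_decay=0.96,
                          learning_rate_decay_step=50000):
    # Exponentially decayed rate, floored at the configured minimum.
    decayed = learning_rate * learning_rate_decay ** (step / learning_rate_decay_step)
    return max(learning_rate_minimum, decayed)


# With the default config the floor equals the starting value,
# so the rate is constant for every step.
for step in (0, 50000, 5000000):
    print(step, decayed_learning_rate(step))

# With a larger starting value the decay becomes visible until the floor is hit.
for step in (0, 50000, 5000000):
    print(step, decayed_learning_rate(step, learning_rate=0.001))
```

Whether the decay is applied continuously or in a staircase fashion does not change the conclusion: the decayed value can never exceed the starting value, which is already the floor.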
Hi, I found that this implementation is slower than deep_q_rl, which is implemented in Theano.
Is it because this repo uses OpenAI Gym rather than ROM files?
Or is it a performance difference between TensorFlow and Theano? Or some other detail?
deep_q_rl runs 100-200 steps during the learning process.
But DQN-tensorflow only runs 70-90 steps during the learning process. This makes training slow, and it cannot run 200M in 10 days as in the DQN Nature paper.
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.7465
pciBusID: 0000:01:00.0
totalMemory: 8.00GiB freeMemory: 6.63GiB
2018-04-09 16:56:08.725121: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1423] Adding visible gpu devices: 0
2018-04-09 16:56:09.233091: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-04-09 16:56:09.236868: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:917] 0
2018-04-09 16:56:09.239291: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:930] 0: N
2018-04-09 16:56:09.242389: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8192 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-04-09 16:56:09.249227: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 8.00G (8589934592 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-04-09 16:56:09.253673: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 7.20G (7730940928 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
Traceback (most recent call last):
File "main.py", line 75, in
tf.app.run()
File "D:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
_sys.exit(main(argv))
File "main.py", line 53, in main
config = get_config(FLAGS) or FLAGS
File "D:\tools\dqq\DQN-tensorflow-master\config.py", line 58, in get_config
I am running Windows 10 on a 2017 MacBook Pro using BootCamp, so as far as I know I can't use the GPU.
When I run python main.py --env_name=Breakout-v0 --is_train=True --display=True --use_gpu=False
I get this output:
[*] GPU : 1.0000
2018-01-06 17:07:58.156417: I C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
{'_save_step': 500000,
'_test_step': 50000,
'action_repeat': 4,
'backend': 'tf',
'batch_size': 32,
'cnn_format': 'NHWC',
'discount': 0.99,
'display': True,
'double_q': False,
'dueling': False,
'env_name': 'Breakout-v0',
'env_type': 'detail',
'ep_end': 0.1,
'ep_end_t': 1000000,
'ep_start': 1.0,
'history_length': 4,
'learn_start': 50000.0,
'learning_rate': 0.00025,
'learning_rate_decay': 0.96,
'learning_rate_decay_step': 50000,
'learning_rate_minimum': 0.00025,
'max_delta': 1,
'max_reward': 1.0,
'max_step': 50000000,
'memory_size': 1000000,
'min_delta': -1,
'min_reward': -1.0,
'model': 'm1',
'random_start': 30,
'scale': 10000,
'screen_height': 84,
'screen_width': 84,
'target_q_update_step': 10000,
'train_frequency': 4}
Traceback (most recent call last):
File "main.py", line 70, in <module>
tf.app.run()
File "C:\Users\Unicoranium\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "main.py", line 62, in main
agent = Agent(config, env, sess)
File "C:\Users\Unicoranium\Desktop\MachineLearning\DQN-tensorflow\DQN-tensorflow-master\dqn\agent.py", line 30, in __init__
self.build_dqn()
File "C:\Users\Unicoranium\Desktop\MachineLearning\DQN-tensorflow\DQN-tensorflow-master\dqn\agent.py", line 201, in build_dqn
self.l3_flat = tf.reshape(self.l3, [-1, reduce(lambda x, y: x * y, shape[1:])])
NameError: name 'reduce' is not defined
How can I fix it?
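In Python 3, reduce is no longer a builtin and has to be imported from functools. A minimal sketch of the failing expression with that import added; the shape list is a made-up example, not the real output shape of the network.

```python
from functools import reduce  # builtin in Python 2, moved to functools in Python 3

# Hypothetical layer shape: batch, height, width, channels.
shape = [None, 7, 7, 64]

# Same expression as in the traceback: product of all non-batch dimensions.
flat_size = reduce(lambda x, y: x * y, shape[1:])
print(flat_size)
```

Adding the import at the top of dqn/agent.py should be enough for this particular NameError; the same import also works on Python 2.6 and later.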
Hi,
I want to make my own ALE models for DQN reinforcement learning using my own video files as input.
In the video images, there are class labels in the corner for supervised training.
How can I make my own ALE model?
Thank you in advance.
Line 62 in replay_memory.py:
index = random.randint(self.history_length, self.count - 1)
should be changed to
index = random.randint(self.history_length, self.count)
Is that correct?
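One fact that bears on this question: if random here is Python's standard-library module, randint(a, b) includes both endpoints, unlike numpy.random.randint, which excludes the upper bound. The sketch below uses made-up values for history_length and count to show which indices the current line can produce.

```python
import random

random.seed(0)
history_length, count = 4, 10  # made-up values for illustration

# Standard-library randint is inclusive on both ends.
samples = {random.randint(history_length, count - 1) for _ in range(10000)}
print(sorted(samples))
print("largest index drawn:", max(samples))
```

So with the standard-library function, count - 1 is already reachable, and passing count as the upper bound would allow the index count itself to be drawn. Whether that index is valid depends on how the memory arrays are laid out.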
I searched for preprocess and couldn't find anything. Am I missing something? Or is there a reason for that? I noticed that the implementation of ReplayMemory is different from the original description in the article.
How do I run the training for longer?
When I run:
python main.py --is_train=False --display=True --use_gpu=False
I get:
[*] GPU : 1.0000
[2018-05-23 17:17:55,692] Making new env: Breakout-v0
{'_save_step': 500000, '_test_step': 50000, 'action_repeat': 4, 'backend': 'tf', 'batch_size': 32, 'cnn_format': 'NHWC', 'discount': 0.99, 'display': True, 'double_q': False, 'dueling': False, 'env_name': 'Breakout-v0', 'env_type': 'detail', 'ep_end': 0.1, 'ep_end_t': 1000000, 'ep_start': 1.0, 'history_length': 4, 'learn_start': 50000.0, 'learning_rate': 0.00025, 'learning_rate_decay': 0.96, 'learning_rate_decay_step': 50000, 'learning_rate_minimum': 0.00025, 'max_delta': 1, 'max_reward': 1.0, 'max_step': 50000000, 'memory_size': 1000000, 'min_delta': -1, 'min_reward': -1.0, 'model': 'm1', 'random_start': 30, 'scale': 10000, 'screen_height': 84, 'screen_width': 84, 'target_q_update_step': 10000, 'train_frequency': 4}
Traceback (most recent call last):
File "main.py", line 70, in
tf.app.run()
File "/Tuto_DQN/env/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 43, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "main.py", line 62, in main
agent = Agent(config, env, sess)
File "/Tuto_DQN/tuto_dqn/DQN-tensorflow/dqn/agent.py", line 23, in init
self.memory = ReplayMemory(self.config, self.model_dir)
File "/Tuto_DQN/tuto_dqn/DQN-tensorflow/dqn/replay_memory.py", line 18, in init
self.screens = np.empty((self.memory_size, config.screen_height, config.screen_width), dtype = np.float16)
MemoryError
I installed all the dependencies according to issue #44 (Add requirements.txt or alternative).
I am running it on my laptop, a Samsung Series 7 Ultra notebook.
Could someone advise me on how to overcome this issue? Any comment would be highly appreciated.
Thanks a lot!!
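A rough size estimate for the array in the last traceback frame may explain the MemoryError. The numbers below come from the configuration dump above (memory_size, screen_height, screen_width) and from the float16 dtype in the failing line; other allocations made by the replay memory are ignored.

```python
memory_size = 1000000
screen_height = 84
screen_width = 84
bytes_per_value = 2  # np.float16 uses 2 bytes per element

# Size of the self.screens array alone.
total_bytes = memory_size * screen_height * screen_width * bytes_per_value
print("%d bytes = %.1f GiB" % (total_bytes, total_bytes / 1024.0 ** 3))
```

That is far more than a typical laptop has free. The requirement scales linearly with memory_size, so a smaller value for that setting would presumably shrink the allocation in proportion.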
Please add trained Snapshots of one or two atari roms.
I tried to run from the windows command line using the command in Readme but it says "Import Error: No module named tensorflow." How do I run the program on Windows? I am using Anaconda.
I have a Titan X and have been running the Breakout simulation for over two days now, and it's only 7% through training; nvidia-smi is showing that it's only using 4-5%. The README.md says that it only took 30 hours on a 980. That doesn't seem right. According to main.py, it should be using 100% by default if I don't give the flag. Is anyone else having this issue or is it just me?
Edit:
nvidia-smi -i 0 -q -d MEMORY,UTILIZATION,POWER,CLOCK,COMPUTE shows that FB Memory Usage is 11423 MiB / 12185 MiB. Does that look correct if using the default GPU setting for Breakout?
Hello, I spotted what I believe might be a bug in the DQN implementation on line 291 here:
https://github.com/devsisters/DQN-tensorflow/blob/master/dqn/agent.py#L291
The code tries to clip self.delta with tf.clip_by_value, I assume with the intention of being robust when the discrepancy in Q is above a threshold:
self.delta = self.target_q_t - q_acted
self.clipped_delta = tf.clip_by_value(self.delta, self.min_delta, self.max_delta, name='clipped_delta')
self.global_step = tf.Variable(0, trainable=False)
self.loss = tf.reduce_mean(tf.square(self.clipped_delta), name='loss')
However, the local gradient of the clip_by_value function outside the [min_delta, max_delta] range is zero. Therefore, with the current code, whenever the discrepancy falls outside min/max delta, the gradient becomes exactly zero in backprop. This might not be what you intend, and is certainly not standard, I believe.
I think you probably want to clip the gradient here, not the raw Q. In that case you would have to use the Huber loss:
def clipped_error(x):
    return tf.select(tf.abs(x) < 1.0, 0.5 * tf.square(x), tf.abs(x) - 0.5)  # condition, true, false
and use this on self.delta instead of tf.square. This would have the desired effect of increased robustness to outliers.
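The difference between the two losses can be seen from their derivatives with respect to delta. The sketch below writes those derivatives by hand in plain Python instead of using TensorFlow; both function names are made up for illustration, and the thresholds match the min_delta = -1 and max_delta = 1 defaults.

```python
def grad_clipped_square(delta, min_delta=-1.0, max_delta=1.0):
    """Derivative of clip(delta, min_delta, max_delta) ** 2 with respect to delta."""
    if delta < min_delta or delta > max_delta:
        # The clipped value is constant here, so no gradient flows back.
        return 0.0
    return 2.0 * delta


def grad_huber(delta):
    """Derivative of the Huber loss: 0.5 * delta**2 inside (-1, 1), |delta| - 0.5 outside."""
    if abs(delta) < 1.0:
        return delta
    return 1.0 if delta > 0 else -1.0


for delta in (0.5, 1.5, -3.0):
    print(delta, grad_clipped_square(delta), grad_huber(delta))
```

For large errors the clipped square gives no learning signal at all, while the Huber loss keeps a gradient of magnitude 1, which is the gradient clipping the issue argues for.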
Hi, I downloaded the code and then tested it as described here.
However, I got the error shown below.
I think all requirements are installed except opencv2, and OpenAI Gym was tested.
I would appreciate it if someone could find the cause and the solution.
Traceback (most recent call last):
File "/DQN-tensorflow-master/main.py", line 69, in
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 43, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/DQN-tensorflow-master/main.py", line 64, in main
agent.train()
File "/DQN-tensorflow-master/dqn/agent.py", line 40, in train
screen, reward, action, terminal = self.env.new_random_game()
File "/DQN-tensorflow-master/dqn/environment.py", line 28, in new_random_game
self.new_game(True)
File "/DQN-tensorflow-master/dqn/environment.py", line 21, in new_game
if self.lives == 0:
File "/DQN-tensorflow-master/dqn/environment.py", line 52, in lives
return self.env.ale.lives()
AttributeError: 'TimeLimit' object has no attribute 'ale'
I was reading through the code and couldn't figure out where Agent.learning_rate is being set. It's used here:
https://github.com/devsisters/DQN-tensorflow/blob/master/dqn/agent.py#L299
But it's only set on the Config object, not the Agent.
The best reward I get is 30, that's all. But when Gym is replaced by the ROM directly, the result is very different: a very stable reward of around 300~400.
I don't know exactly what's wrong with Gym.
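In detailed mode, an episode ends as soon as one life is lost, so its score differs from simple mode by a factor of five or more.
The problem is that in detailed mode, setting terminal=True the moment a life is lost immediately starts a new random game.
So although the graph makes simple mode look better than detailed mode, the two are in fact running under conditions that should not be compared. In addition, M2 (purple) disappears from around step=1M. I think the graph leaves room for misinterpretation.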
Hi, I want to run your code to train an agent on other games.
Is there any code or hyperparameter that needs to be changed in order to train a good agent?
Following link in the README points to an empty page.
Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning.
Maybe you could upload the PDF in this repo?
Where are m2 and m3?
In gym, terminal is True only when lives is 0. But in the act() function of the GymEnvironment class, it seems that terminal is True whenever a life is lost. This influences num_game and the episode reward.
When running DQN with the --display option, I get the following error:
Traceback (most recent call last):
File "main.py", line 66, in
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 43, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "main.py", line 61, in main
agent.train()
File "/home/savvai/Documents/DQN-tensorflow/dqn/agent.py", line 40, in train
screen, reward, action, terminal = self.env.new_random_game()
File "/home/savvai/Documents/DQN-tensorflow/dqn/environment.py", line 28, in new_random_game
self.new_game(True)
File "/home/savvai/Documents/DQN-tensorflow/dqn/environment.py", line 24, in new_game
self.render()
File "/home/savvai/Documents/DQN-tensorflow/dqn/environment.py", line 60, in render
self.env.render()
File "/usr/local/lib/python2.7/dist-packages/gym/core.py", line 174, in render
return self._render(mode=mode, close=close)
File "/usr/local/lib/python2.7/dist-packages/gym/envs/atari/atari_env.py", line 119, in _render
from gym.envs.classic_control import rendering
File "/usr/local/lib/python2.7/dist-packages/gym/envs/classic_control/rendering.py", line 23, in
from pyglet.gl import *
File "/usr/local/lib/python2.7/dist-packages/pyglet/gl/init.py", line 236, in
import pyglet.window
File "/usr/local/lib/python2.7/dist-packages/pyglet/window/init.py", line 1817, in
gl._create_shadow_window()
File "/usr/local/lib/python2.7/dist-packages/pyglet/gl/init.py", line 205, in _create_shadow_window
_shadow_window = Window(width=1, height=1, visible=False)
File "/usr/local/lib/python2.7/dist-packages/pyglet/window/xlib/init.py", line 163, in init
super(XlibWindow, self).init(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/pyglet/window/init.py", line 505, in init
config = screen.get_best_config(template_config)
File "/usr/local/lib/python2.7/dist-packages/pyglet/canvas/base.py", line 161, in get_best_config
configs = self.get_matching_configs(template)
File "/usr/local/lib/python2.7/dist-packages/pyglet/canvas/xlib.py", line 179, in get_matching_configs
configs = template.match(canvas)
File "/usr/local/lib/python2.7/dist-packages/pyglet/gl/xlib.py", line 29, in match
have_13 = info.have_version(1, 3)
File "/usr/local/lib/python2.7/dist-packages/pyglet/gl/glx_info.py", line 89, in have_version
client = [int(i) for i in client_version.split('.')]
ValueError: invalid literal for int() with base 10: 'None'
I can only find M1 in config.py. Can you push M2? Many thanks!
Hi, I ran this code with the initial parameters: model='m1' with the game Breakout-v0. After about 8 days of GPU training, when the program finished, I evaluated the model and got an average reward of 22.0, which is very different from your screenshot (score=300+).
In another experiment with only the switches duel=True and double_q=True turned on, the model's average reward was 5.4, even worse than the original DQN.
Is there any trick I missed? Thanks for your reply!
After following the install instructions and running
python main.py --env_name=Breakout-v0 --is_train=True
I receive the following error
Traceback (most recent call last):
File "main.py", line 70, in <module>
tf.app.run()
File "/home/jdmartin86/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "main.py", line 49, in main
config = get_config(FLAGS) or FLAGS
File "/home/jdmartin86/sandbox/test-qlearn/DQN-tensorflow/config.py", line 58, in get_config
for k, v in FLAGS.__dict__['__flags'].items():
KeyError: '__flags'
Hi!
Currently, I am trying to get this project running. However, the README does not describe exactly which requirements are necessary, and issues like #29 arise. It would be useful for me and others if there were a requirements.txt or maybe even a Dockerfile in order to get this running on any platform.
Cheers
René
EDIT:
For me, the following requirements.txt works:
atari-py==0.0.21
Box2D-kengz==2.3.3
certifi==2017.7.27.1
chardet==3.0.4
funcsigs==1.0.2
gym==0.7.0
idna==2.6
imageio==2.2.0
Keras==2.0.8
mock==2.0.0
mujoco-py==0.5.7
numpy==1.13.3
olefile==0.44
pachi-py==0.0.21
pbr==3.1.1
Pillow==4.3.0
protobuf==3.1.0
pyglet==1.2.4
PyOpenGL==3.1.0
PyYAML==3.12
requests==2.18.4
scipy==1.0.0
six==1.11.0
tensorflow==0.12.0
Theano==0.9.0
tqdm==4.19.4
urllib3==1.22
In addition, I had to run the following commands:
brew install swig
brew install cmake
When I set is_train=False and display=True in the code, the frames shown on screen are of rather low quality and have no color. Why? Could someone help me?
When I run training with python main.py --env_name=Breakout-v0 --is_train=True --display=True --cpu=True, I get this output after a couple of training episodes:
python main.py --env_name=Breakout-v0 --is_train=True --display=True --cpu=True
[*] GPU : 0.5000
[2016-05-20 17:00:38,585] Making new env: Breakout-v0
{'_save_step': 50000,
'_test_step': 10000,
'action_repeat': 4,
'backend': 'tf',
'batch_size': 32,
'cnn_format': 'NHWC',
'discount': 0.99,
'display': True,
'env_name': 'Breakout-v0',
'env_type': 'simple',
'ep_end': 0.1,
'ep_end_t': 1000000,
'ep_start': 1.0,
'history_length': 4,
'learn_start': 50000.0,
'learning_rate': 0.00025,
'max_delta': 1,
'max_reward': 1.0,
'max_step': 50000000,
'memory_size': 1000000,
'min_delta': -1,
'min_reward': -1.0,
'model': 'm2',
'random_start': 30,
'scale': 10000,
'screen_height': 84,
'screen_width': 84,
'target_q_update_step': 10000,
'train_frequency': 4}
[*] Loading checkpoints...
[!] Load FAILED: checkpoints/Breakout-v0/min_delta--1/max_delta-1/history_length-4/train_frequency-4/target_q_update_step-10000/memory_size-1000000/action_repeat-4/ep_end_t-1000000/backend-tf/random_start-30/scale-10000/env_type-simple/min_reward--1.0/ep_start-1.0/screen_width-84/learn_start-50000.0/cnn_format-NHWC/learning_rate-0.00025/batch_size-32/discount-0.99/max_reward-1.0/max_step-50000000/env_name-Breakout-v0/ep_end-0.1/model-m2/screen_height-84/
2016-05-20 17:00:40.195 Python[25567:405995] ApplePersistenceIgnoreState: Existing state will not be touched. New state will be written to /var/folders/t0/tw1pt8nn5xv2ykn_4tmnxg5m0000gn/T/org.python.python.savedState
0%| | 49978/50000000 [02:47<39:09:30, 354.33it/s]
Traceback (most recent call last):
File "main.py", line 63, in
tf.app.run()
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv))
File "main.py", line 58, in main
agent.train()
File "/Users/x0r/Documents/codes/DQN-tensorflow/dqn/agent.py", line 110, in train
if max_avg_ep_reward >= avg_ep_reward * 0.9:
UnboundLocalError: local variable 'avg_ep_reward' referenced before assignment
What's the difference between the simple game and the detail game?
Hi,
I run into a problem when I train Breakout-v0 on a GPU on a Linux machine and then want to load the model on my local Mac. Although both use the same settings (except GPU vs. CPU; I also make sure that the CNN format is the same as on the GPU when I load it on my Mac), loading the model on my Mac fails with the following error:
[*] Loading checkpoints...
INFO:tensorflow:Restoring parameters from checkpoints/Breakout-v0/min_delta--1/max_delta-1/history_length-4/train_frequency-4/target_q_update_step-10000/double_q-False/memory_size-1000000/action_repeat-4/ep_end_t-1000000/dueling-False/min_reward--1.0/backend-tf/random_start-30/scale-10000/env_type-detail/learning_rate_decay_step-50000/ep_start-1.0/screen_width-84/learn_start-50000.0/cnn_format-NCHW/learning_rate-0.00025/batch_size-32/discount-0.99/max_step-50000000/max_reward-1.0/learning_rate_decay-0.96/learning_rate_minimum-0.00025/env_name-Breakout-v0/ep_end-0.1/model-m1/screen_height-84/-3250000
[2017-09-24 19:36:32,049] Restoring parameters from checkpoints/Breakout-v0/min_delta--1/max_delta-1/history_length-4/train_frequency-4/target_q_update_step-10000/double_q-False/memory_size-1000000/action_repeat-4/ep_end_t-1000000/dueling-False/min_reward--1.0/backend-tf/random_start-30/scale-10000/env_type-detail/learning_rate_decay_step-50000/ep_start-1.0/screen_width-84/learn_start-50000.0/cnn_format-NCHW/learning_rate-0.00025/batch_size-32/discount-0.99/max_step-50000000/max_reward-1.0/learning_rate_decay-0.96/learning_rate_minimum-0.00025/env_name-Breakout-v0/ep_end-0.1/model-m1/screen_height-84/-3250000
InvalidArgumentError Traceback (most recent call last)
<ipython-input-4-6320008d113d> in <module>()
17 config.cnn_format = 'NHWC'
18
---> 19 agent = Agent(config, env, sess)
20
21 if FLAGS.is_train:
/Users/tailin/Dropbox (Personal)/project/meta_learning/dqn/agent.pyc in __init__(self, config, environment, sess)
31 self.step_assign_op = self.step_op.assign(self.step_input)
32
---> 33 self.build_dqn()
34
35 def train(self):
/Users/tailin/Dropbox (Personal)/project/meta_learning/dqn/agent.pyc in build_dqn(self)
340 self._saver = tf.train.Saver(self.w.values() + [self.step_op], max_to_keep=30)
341
--> 342 self.load_model()
343 self.update_target_q_network()
344
/Users/tailin/Dropbox (Personal)/project/meta_learning/dqn/base.pyc in load_model(self)
44 ckpt_name = os.path.basename(ckpt.model_checkpoint_path)
45 fname = os.path.join(self.checkpoint_dir, ckpt_name)
---> 46 self.saver.restore(self.sess, fname)
47 print(" [*] Load SUCCESS: %s" % fname)
48 return True
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/training/saver.pyc in restore(self, sess, save_path)
1558 logging.info("Restoring parameters from %s", save_path)
1559 sess.run(self.saver_def.restore_op_name,
-> 1560 {self.saver_def.filename_tensor_name: save_path})
1561
1562 @staticmethod
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in run(self, fetches, feed_dict, options, run_metadata)
893 try:
894 result = self._run(None, fetches, feed_dict, options_ptr,
--> 895 run_metadata_ptr)
896 if run_metadata:
897 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _run(self, handle, fetches, feed_dict, options, run_metadata)
1122 if final_fetches or final_targets or (handle and feed_dict_tensor):
1123 results = self._do_run(handle, final_targets, final_fetches,
-> 1124 feed_dict_tensor, options, run_metadata)
1125 else:
1126 results = []
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
1319 if handle is None:
1320 return self._do_call(_run_fn, self._session, feeds, fetches, targets,
-> 1321 options, run_metadata)
1322 else:
1323 return self._do_call(_prun_fn, self._session, handle, feeds, fetches)
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _do_call(self, fn, *args)
1338 except KeyError:
1339 pass
-> 1340 raise type(e)(node_def, op, message)
1341
1342 def _extend_graph(self):
InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [6] rhs shape= [4]
[[Node: save/Assign_9 = Assign[T=DT_FLOAT, _class=["loc:@prediction/q/bias"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](prediction/q/bias, save/RestoreV2_9)]]
Caused by op u'save/Assign_9', defined at:
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ipykernel_launcher.py", line 16, in <module>
app.launch_new_instance()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/traitlets/config/application.py", line 658, in launch_instance
app.start()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ipykernel/kernelapp.py", line 477, in start
ioloop.IOLoop.instance().start()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/zmq/eventloop/ioloop.py", line 177, in start
super(ZMQIOLoop, self).start()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/ioloop.py", line 888, in start
handler_func(fd_obj, events)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events
self._handle_recv()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv
self._run_callback(callback, msg)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback
callback(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ipykernel/kernelbase.py", line 283, in dispatcher
return self.dispatch_shell(stream, msg)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ipykernel/kernelbase.py", line 235, in dispatch_shell
handler(stream, idents, msg)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ipykernel/kernelbase.py", line 399, in execute_request
user_expressions, allow_stdin)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ipykernel/ipkernel.py", line 196, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ipykernel/zmqshell.py", line 533, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2718, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2822, in run_ast_nodes
if self.run_code(code, result):
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2882, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-4-6320008d113d>", line 19, in <module>
agent = Agent(config, env, sess)
File "dqn/agent.py", line 33, in __init__
self.build_dqn()
File "dqn/agent.py", line 340, in build_dqn
self._saver = tf.train.Saver(self.w.values() + [self.step_op], max_to_keep=30)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1140, in __init__
self.build()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1172, in build
filename=self._filename)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 688, in build
restore_sequentially, reshape)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 419, in _AddRestoreOps
assign_ops.append(saveable.restore(tensors, shapes))
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 155, in restore
self.op.get_shape().is_fully_defined())
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/state_ops.py", line 274, in assign
validate_shape=validate_shape)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 43, in assign
use_locking=use_locking, name=name)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [6] rhs shape= [4]
[[Node: save/Assign_9 = Assign[T=DT_FLOAT, _class=["loc:@prediction/q/bias"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](prediction/q/bias, save/RestoreV2_9)]]
What is the problem here? Ideally I should be able to load the model no matter which system, or GPU/CPU I use.
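One reading of the error above, offered as a guess and not a confirmed diagnosis: `prediction/q/bias` presumably has one entry per action, so the graph built on the Mac expects 6 actions while the checkpoint stores 4. That would point at the environment's action set rather than at GPU vs. CPU. The sketch below compares two name-to-shape mappings and reports where they disagree. The helper name `find_shape_mismatches` and the `prediction/l1/w` entry are hypothetical; only the two bias shapes come from the log. With TensorFlow, the checkpoint side could probably be filled from `tf.train.list_variables(checkpoint_dir)` on versions that provide it.

```python
def find_shape_mismatches(graph_shapes, checkpoint_shapes):
    """Return {name: (graph_shape, checkpoint_shape)} for variables whose shapes differ."""
    mismatches = {}
    for name, graph_shape in graph_shapes.items():
        ckpt_shape = checkpoint_shapes.get(name)
        # Variables missing from the checkpoint are ignored; only real conflicts are reported.
        if ckpt_shape is not None and list(ckpt_shape) != list(graph_shape):
            mismatches[name] = (list(graph_shape), list(ckpt_shape))
    return mismatches


# The bias shapes are taken from the error message; the conv entry is illustrative only.
graph_shapes = {"prediction/q/bias": [6], "prediction/l1/w": [8, 8, 4, 32]}
checkpoint_shapes = {"prediction/q/bias": [4], "prediction/l1/w": [8, 8, 4, 32]}

mismatches = find_shape_mismatches(graph_shapes, checkpoint_shapes)
for name, (in_graph, in_ckpt) in sorted(mismatches.items()):
    print("%s: graph %s vs checkpoint %s" % (name, in_graph, in_ckpt))
```

If the only mismatches are in the final Q layer, checking the number of actions the environment reports on each machine would be a reasonable next step.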
I need to be able to resume training Breakout-v0 after stopping it. I would also like to be able to move a checkpoint directory to another machine and resume training there.
When I train on my laptop, which runs Ubuntu 14.04, I can resume after stopping. On the faster machine that I actually want to use, I cannot resume after stopping. That machine runs Ubuntu 16.04, FWIW.
Both machines use TensorFlow 1.3.0. The working laptop uses Python 3.6 and the non-working machine uses Python 3.5.2. OpenAI Gym is version 0.9.4 on both machines, as installed by pip. Neither machine uses a GPU, and both use NHWC.
On both machines, I cloned the devsisters/DQN-tensorflow repository and manually fixed the bugs that prevent it from working with Python 3.x.
`~/DQN-tensorflow$ python main.py --env_name=Breakout-v0 --is_train=True --display=False
[*] GPU : 1.0000
{'_save_step': 500000,
'_test_step': 50000,
'action_repeat': 4,
'backend': 'tf',
'batch_size': 32,
'cnn_format': 'NHWC',
'discount': 0.99,
'display': False,
'double_q': False,
'dueling': False,
'env_name': 'Breakout-v0',
'env_type': 'detail',
'ep_end': 0.1,
'ep_end_t': 1000000,
'ep_start': 1.0,
'history_length': 4,
'learn_start': 50000.0,
'learning_rate': 0.00025,
'learning_rate_decay': 0.96,
'learning_rate_decay_step': 50000,
'learning_rate_minimum': 0.00025,
'max_delta': 1,
'max_reward': 1.0,
'max_step': 50000000,
'memory_size': 1000000,
'min_delta': -1,
'min_reward': -1.0,
'model': 'm1',
'random_start': 30,
'scale': 10000,
'screen_height': 84,
'screen_width': 84,
'target_q_update_step': 10000,
'train_frequency': 4}
WARNING:tensorflow:From /home/mjc/DQN-tensorflow/dqn/agent.py:224: calling argmax (from tensorflow.python.ops.math_ops) with dimension is deprecated and will be removed in a future version.
Instructions for updating:
Use the axis argument instead
WARNING:tensorflow:From /opt/anaconda/miniconda3/envs/tfbuild/lib/python3.5/site-packages/tensorflow/python/util/tf_should_use.py:107: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use tf.global_variables_initializer instead.
[*] Loading checkpoints...
[!] Load FAILED: checkpoints/Breakout-v0/backend-tf/ep_end-0.1/model-m1/screen_width-84/env_type-detail/learning_rate-0.00025/learning_rate_minimum-0.00025/memory_size-1000000/env_name-Breakout-v0/dueling-False/learning_rate_decay-0.96/batch_size-32/min_delta--1/max_reward-1.0/learn_start-50000.0/double_q-False/max_delta-1/scale-10000/random_start-30/cnn_format-NHWC/discount-0.99/min_reward--1.0/action_repeat-4/learning_rate_decay_step-50000/ep_start-1.0/history_length-4/target_q_update_step-10000/ep_end_t-1000000/train_frequency-4/max_step-50000000/screen_height-84/
`
How can this problem be fixed?
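A possible cause, stated as an assumption: the checkpoint path in the log looks like it is assembled by iterating over a dictionary of settings. Before Python 3.7 the language did not guarantee dictionary order. CPython 3.6 happens to preserve insertion order, while 3.5 can order string keys differently from one process to the next, so the same settings could map to a different directory on every start. That would match the 3.6 machine resuming and the 3.5.2 machine failing. The sketch below builds the path from sorted keys so that it no longer depends on iteration order. The function name `build_model_dir` and its arguments are hypothetical and are not taken from the repository.

```python
def build_model_dir(env_name, attrs, skip=("display",)):
    """Join settings into a directory path using a fixed (sorted) key order."""
    parts = [env_name]
    for key in sorted(attrs):
        # Assumption: private keys and display-only flags stay out of the path,
        # which is what the logged path appears to do.
        if key.startswith("_") or key in skip:
            continue
        value = attrs[key]
        if isinstance(value, list):
            value = ",".join(str(item) for item in value)
        parts.append("%s-%s" % (key, value))
    return "/".join(parts) + "/"


# Same settings, different insertion order.
settings_a = {"cnn_format": "NHWC", "batch_size": 32, "_save_step": 500000}
settings_b = {"_save_step": 500000, "batch_size": 32, "cnn_format": "NHWC"}

print(build_model_dir("Breakout-v0", settings_a))
print(build_model_dir("Breakout-v0", settings_a) == build_model_dir("Breakout-v0", settings_b))
```

Checkpoints already saved under an unsorted path would have to be moved into the directory the sorted version produces before they can be found.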
Hi, I was trying to run the DQN code. After 50000 steps, the error [!] Load FAILED occurred. According to the error information, the CPU only supports the "NHWC" data format, but the code was run on the GPU with the "NCHW" data format. So I would like to know how to run on the GPU with "NCHW" while saving and loading on the CPU with "NHWC" to avoid this error. Thanks!
My modified code follows (it does not work):
def save_model(self, step=None):
    print(" [*] Saving checkpoints...")
    self.config.cnn_format = "NHWC"  # added by me
    print("******** save begin data_format %s" % self.config.cnn_format)
    model_name = type(self).__name__
    if not os.path.exists(self.checkpoint_dir):
        os.makedirs(self.checkpoint_dir)
    self.saver.save(self.sess, self.checkpoint_dir, global_step=step)
    self.config.cnn_format = "NCHW"  # added by me
    print("******** save end data_format %s" % self.config.cnn_format)

def load_model(self):
    print(" [*] Loading checkpoints...")
    self.config.cnn_format = "NHWC"  # added by me
    print("******** load begin data_format %s" % self.config.cnn_format)
    ckpt = tf.train.get_checkpoint_state(self.checkpoint_dir)
    if ckpt and ckpt.model_checkpoint_path:
        ckpt_name = os.path.basename(ckpt.model_checkpoint_path)
        fname = os.path.join(self.checkpoint_dir, ckpt_name)
        self.saver.restore(self.sess, fname)
        print(" [*] Load SUCCESS: %s" % fname)
        self.config.cnn_format = "NCHW"  # added by me
        print("******** load end data_format %s" % self.config.cnn_format)
        return True
    else:
        print(" [!] Load FAILED: %s" % self.checkpoint_dir)
        self.config.cnn_format = "NCHW"  # added by me
        print("******** load end data_format %s" % self.config.cnn_format)
        return False
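Assigning to `self.config.cnn_format` inside `save_model` or `load_model` most likely has no effect on the graph, which was already built with the original format by the time these methods run. A separate point, offered as a general observation and not verified against this repository: if the network flattens the last convolution output before a fully connected layer, the flattened order differs between "NCHW" and "NHWC", so that layer's weight rows would need to be reordered when switching layouts. The sketch below computes that reordering in plain Python; all function names are hypothetical.

```python
def nchw_flat_index(c, h, w, channels, height, width):
    """Position of feature (c, h, w) after flattening a [C, H, W] tensor."""
    return (c * height + h) * width + w


def nhwc_flat_index(c, h, w, channels, height, width):
    """Position of feature (c, h, w) after flattening an [H, W, C] tensor."""
    return (h * width + w) * channels + c


def nhwc_from_nchw_permutation(channels, height, width):
    """perm[j] is the NCHW flat index of the feature found at NHWC flat index j."""
    perm = [0] * (channels * height * width)
    for c in range(channels):
        for h in range(height):
            for w in range(width):
                j = nhwc_flat_index(c, h, w, channels, height, width)
                perm[j] = nchw_flat_index(c, h, w, channels, height, width)
    return perm


def reorder_rows(weight_rows, perm):
    """Reorder rows trained against NCHW-flattened input for NHWC-flattened input."""
    return [weight_rows[i] for i in perm]


perm = nhwc_from_nchw_permutation(channels=2, height=1, width=2)
print(perm)
print(reorder_rows(["row0", "row1", "row2", "row3"], perm))
```

Convolution filters themselves are stored the same way for both formats in TensorFlow, so under these assumptions only the first layer after the flatten would be affected.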
When I try to run
python main.py --env_name=Breakout-v0 --is_train=True
I get this error:
Traceback (most recent call last):
File "main.py", line 4, in <module>
from dqn.agent import Agent
File "/home/dahandla/DQN-tensorflow-master/dqn/agent.py", line 10, in <module>
from .replay_memory import ReplayMemory
File "/home/dahandla/DQN-tensorflow-master/dqn/replay_memory.py", line 8, in <module>
from utils import save_npy, load_npy
ImportError: No module named 'utils'
Any help would be appreciated.
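The likely cause is that Python 3 no longer resolves `from utils import ...` relative to the importing module's package, so a module inside the `dqn` package would need `from .utils import save_npy, load_npy` instead. The self-contained sketch below builds a throwaway package to show the explicit relative import working. The package name `demo_dqn` and the stub `save_npy` are hypothetical stand-ins, not the repository's code.

```python
import importlib
import os
import sys
import tempfile


def build_demo_package(root):
    """Write a tiny package whose replay_memory module uses an explicit relative import."""
    pkg = os.path.join(root, "demo_dqn")
    os.makedirs(pkg)
    files = {
        "__init__.py": "",
        "utils.py": "def save_npy(obj, path):\n    return 'saved %s' % path\n",
        # The leading dot makes the import relative to the demo_dqn package.
        "replay_memory.py": "from .utils import save_npy\n",
    }
    for name, body in files.items():
        with open(os.path.join(pkg, name), "w") as handle:
            handle.write(body)


root = tempfile.mkdtemp()
build_demo_package(root)
sys.path.insert(0, root)
importlib.invalidate_caches()

replay_memory = importlib.import_module("demo_dqn.replay_memory")
print(replay_memory.save_npy(None, "memory.npy"))
```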
It seems that in assets/tensorboard_160516.png, TensorBoard displays metrics such as average loss. I'm unable to see them in TensorFlow 0.11rc. Could that be because of the TensorFlow version? Which version of TensorFlow was this run on?