The pcl_keras from rarilurelo

A problem with optimization equation

In the equation

C = K.sum(-v_s_t + self.gamma ** self.rollout_d * v_s_t_d + \
                K.sum(self.R, axis=1) - self.entropy_tau * K.sum(self.discount * \
                K.sum(K.log(self.pi+K.epsilon()) * self.action, axis=2), axis=1), axis=0)

self.rollout_d is a constant. However, when the portion of the episode passed in through feed_dict is shorter than self.rollout_d, the last state occurs fewer than self.rollout_d steps after the first one, so the exponent applied to self.gamma is too large. The equation should therefore be modified as follows:

C = K.sum(-v_s_t + self.gamma ** tf.cast(tf.shape(self.state)[1], tf.float32) * v_s_t_d + \
                K.sum(self.R, axis=1) - self.entropy_tau * K.sum(self.discount * \
                K.sum(K.log(self.pi+K.epsilon()) * self.action, axis=2), axis=1), axis=0)

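The cast is required because tf.shape returns an integer (int32) tensor; without it, TensorFlow raises an error when that tensor is used as the exponent of the float self.gamma.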
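To illustrate why the exponent matters, here is a minimal pure-Python sketch of the consistency error for a single sub-trajectory. It is not the repository's code: path_consistency and its arguments are hypothetical names, and it assumes that the rewards are discounted inside the function and that the slice holds exactly d transitions, which may not match how the repository slices episodes.

```python
def path_consistency(v_first, v_last, rewards, log_probs, gamma, tau):
    """Consistency error for one sub-trajectory (hypothetical helper).

    rewards and log_probs hold one entry per step actually taken, so their
    length may be smaller than the configured rollout length.
    """
    d = len(rewards)  # actual number of steps in this slice
    discounts = [gamma ** j for j in range(d)]
    discounted_return = sum(g * r for g, r in zip(discounts, rewards))
    discounted_log_pi = sum(g * lp for g, lp in zip(discounts, log_probs))
    # The bootstrap value is discounted by the actual length d, not by a constant.
    return -v_first + gamma ** d * v_last + discounted_return - tau * discounted_log_pi


# A two-step slice taken near the end of an episode.
error = path_consistency(v_first=0.0, v_last=4.0, rewards=[1.0, 1.0],
                         log_probs=[0.0, 0.0], gamma=0.5, tau=0.1)
print(error)
```

With these numbers the bootstrap term is 0.5 ** 2 * 4.0 = 1.0. A fixed exponent of 5 would give 0.5 ** 5 * 4.0 = 0.125 instead, which is the kind of mismatch the change above is meant to avoid.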

Different net (adding Conv2D) and input_shape

Hi there!

Thanks for the repo, it's helping me understand the paper and how to use it with different environments.

While working through it, I tried to use a Snake game environment whose state has shape (board_size, board_size). I could just flatten() the state, but I'd like to use Conv2D layers to receive the input and keep the fully connected layers only in the pi/value models. If I set up the code like this:

self.state = tf.placeholder(tf.float32, shape=[None, None, 100], name='state')
self.R = tf.placeholder(tf.float32, shape=[None, None], name='R')
self.action = tf.placeholder(tf.float32, shape=[None, None, 5], name='action')
self.discount = tf.placeholder(tf.float32, shape=[None], name='discount')

v_s_t = v_model(self.state[:, 0, :])
v_s_t_d = v_model(self.state[:, -1, :])
self.pi = pi_model(self.state)

It works as intended (the state is flattened). But if I instead use:

class Net(object):
    def __init__(self, board_size):
        model = Sequential()
        model.add(Conv2D(32, (1, 1), input_shape = (1, 10, 10)))
        model.add(Conv2D(64, (1, 3)))
        model.add(Flatten())
        # model.add(Dense(50, activation='relu', input_dim=(board_size**2)))
        # model.add(Dense(50, activation='relu'))
        self.pi_model = Sequential([model])
        self.pi_model.add(Dense(50, activation='relu'))
        self.pi_model.add(Dense(5, activation='softmax'))
        self.v_model = Sequential([model])
        self.v_model.add(Dense(50, activation='relu'))
        self.v_model.add(Dense(1))

self.state = tf.placeholder(tf.float32, shape=[None, None, 10, 10], name='state')
self.R = tf.placeholder(tf.float32, shape=[None, None], name='R')
self.action = tf.placeholder(tf.float32, shape=[None, None, 5], name='action')
self.discount = tf.placeholder(tf.float32, shape=[None], name='discount')

v_s_t = v_model(self.state[:, 0, :])
v_s_t_d = v_model(self.state[:, -1, :])
self.pi = pi_model(self.state)

I receive this output:

Traceback (most recent call last):
  File "C:\Users\victo\Anaconda3\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 670, in merge_with
    self.assert_same_rank(other)
  File "C:\Users\victo\Anaconda3\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 715, in assert_same_rank
    other))
ValueError: Shapes (1, 1, 10, 32) and (?, ?, ?, ?, ?) must have the same rank

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\victo\Anaconda3\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 745, in with_rank
    return self.merge_with(unknown_shape(ndims=rank))
  File "C:\Users\victo\Anaconda3\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 676, in merge_with
    raise ValueError("Shapes %s and %s are not compatible" % (self, other))
ValueError: Shapes (1, 1, 10, 32) and (?, ?, ?, ?, ?) are not compatible

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Arquivos\GitHub\SnakeAI\models\pcl.py", line 196, in <module>
    agent.train()
  File "D:\Arquivos\GitHub\SnakeAI\models\pcl.py", line 143, in train
    episode = self.rollout()
  File "D:\Arquivos\GitHub\SnakeAI\models\pcl.py", line 114, in rollout
    a, agent_info = self.get_action(s)
  File "D:\Arquivos\GitHub\SnakeAI\models\pcl.py", line 134, in get_action
    self.build()
  File "D:\Arquivos\GitHub\SnakeAI\models\pcl.py", line 70, in build
    self.pi = pi_model(self.state)
  File "C:\Users\victo\Anaconda3\lib\site-packages\keras\engine\base_layer.py", line 457, in __call__
    output = self.call(inputs, **kwargs)
  File "C:\Users\victo\Anaconda3\lib\site-packages\keras\engine\network.py", line 570, in call
    output_tensors, _, _ = self.run_internal_graph(inputs, masks)
  File "C:\Users\victo\Anaconda3\lib\site-packages\keras\engine\network.py", line 724, in run_internal_graph
    output_tensors = to_list(layer.call(computed_tensor, **kwargs))
  File "C:\Users\victo\Anaconda3\lib\site-packages\keras\engine\network.py", line 570, in call
    output_tensors, _, _ = self.run_internal_graph(inputs, masks)
  File "C:\Users\victo\Anaconda3\lib\site-packages\keras\engine\network.py", line 724, in run_internal_graph
    output_tensors = to_list(layer.call(computed_tensor, **kwargs))
  File "C:\Users\victo\Anaconda3\lib\site-packages\keras\layers\convolutional.py", line 168, in call
    dilation_rate=self.dilation_rate)
  File "C:\Users\victo\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py", line 3565, in conv2d
    data_format=tf_data_format)
  File "C:\Users\victo\Anaconda3\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 779, in convolution
    data_format=data_format)
  File "C:\Users\victo\Anaconda3\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 856, in __init__
    data_format=data_format)
  File "C:\Users\victo\Anaconda3\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 439, in __init__
    self.call = build_op(num_spatial_dims, padding)
  File "C:\Users\victo\Anaconda3\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 865, in _build_op
    name=self.name)
  File "C:\Users\victo\Anaconda3\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 136, in __init__
    filter_shape = filter_shape.with_rank(input_shape.ndims)
  File "C:\Users\victo\Anaconda3\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 747, in with_rank
    raise ValueError("Shape %s must have rank %d" % (self, rank))
ValueError: Shape (1, 1, 10, 32) must have rank 5

Could you help me understand how to make this change?

Regards,
Victor.
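One way to read the traceback, offered as a sketch rather than a tested fix: Conv2D expects a rank-4 input (batch, height, width, channels) under Keras's default channels_last data format, and the filter shape (1, 1, 10, 32) in the error suggests that input_shape = (1, 10, 10) was read as height 1, width 10, and 10 channels. The pure-Python helpers below only do the shape arithmetic; conv2d_shapes and fold_time are hypothetical names, and stride 1 with 'valid' padding is assumed.

```python
def conv2d_shapes(input_shape, filters, kernel_size):
    """Kernel and output shapes of a channels_last, 'valid'-padded, stride-1 2D convolution."""
    height, width, channels = input_shape
    k_h, k_w = kernel_size
    kernel_shape = (k_h, k_w, channels, filters)
    output_shape = (height - k_h + 1, width - k_w + 1, filters)
    return kernel_shape, output_shape


def fold_time(state_shape):
    """Map (batch, time, height, width) to the rank-4 shape a 2D convolution expects."""
    batch, time, height, width = state_shape
    return (batch * time, height, width, 1)  # one channel added last


# input_shape=(1, 10, 10) is read as height=1, width=10, channels=10.
print(conv2d_shapes((1, 10, 10), 32, (1, 1)))
# input_shape=(10, 10, 1) treats the board as a 10x10 single-channel image.
print(conv2d_shapes((10, 10, 1), 32, (1, 1)))
print(fold_time((4, 20, 10, 10)))
```

The first call reproduces the (1, 1, 10, 32) filter shape from the error message. In Keras itself, the fold could be done by reshaping the state before and after the convolutional stack, or by wrapping that stack in the TimeDistributed layer; neither option has been tested against this repository.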
