laszukdawid / ai-traineree Goto Github PK


25.0 3.0 6.0 1.34 MB

PyTorch agents and tools for (Deep) Reinforcement Learning

Home Page: http://ai-traineree.readthedocs.io/

License: Apache License 2.0

Python 100.00%

ppo reinforcement-learning artificial-intelligence-algorithms agents ddpg rainbow dqn-pytorch pytorch multi-agents deep

ai-traineree's People

Contributors

Stargazers

Watchers

Forkers

fokx johnnydankseed francoisgergaud peaky2222 luisfmcuriel standardgalactic

ai-traineree's Issues

'int' object has no attribute 'to_feature' in DQN examples

Hi,

While following the documentation at https://ai-traineree.readthedocs.io/en/latest/examples.html I got the error 'int' object has no attribute 'to_feature' when running the DQN example on CartPole. I believe it is because I am passing the wrong task attributes (not sure about this). Changing agent = DQNAgent(task.obs_size, task.action_size, n_steps=5) to agent = DQNAgent(task.obs_space, task.action_space, n_steps=5) makes it work perfectly.

Regards!

AgentFactory: Rainbow agent creation

Update the AgentFactory so that it can create a Rainbow agent from provided agent_state.

Serialize agents

PPO on MultiEnvRunner doesn't work properly

Problem

Example lunar_lander_ppo_multi.py doesn't seem to converge and its results are sub-optimal.

Expected

Expected to have super-duper performance. The more agents the better everything, right?

bugfix examples/multi_agent/prison_iql.py

Hi, great framework!

It seems to me that the following lines should be changed in prison_iql.py:

#8: (version 2 is deprecated)

from pettingzoo.butterfly import prison_v3 as prison

#14: to get 1 instead of (1,)

obs_size = ma_task.obs_size[0]

Test: Reloaded DDPG agent doesn't differ after data feeding

Serialize networks

Unity Environment

Hey, I am glad I ran into your repository from the mlagents threads.

How could I use your project to import my unity environment?
I see you do

task = GymTask('CartPole-v1')

I want to test different RL algorithms on my unity env.

Dummy agent

What

Create an agent which returns random values according to specifications.

Why

Mainly for testing and debugging. When running experiments, plenty of things can be broken. With an agent that returns specific values, we can narrow down where the problem is, i.e. whether it is the agent, the environment, the life cycle management...
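A minimal sketch of what such an agent could look like. ActionSpec and DummyAgent are hypothetical names used only for illustration and are not part of ai-traineree; the step argument order follows the (state, action, reward, done, next_state) tuple mentioned elsewhere on this page.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ActionSpec:
    """Hypothetical description of the action an agent must return."""
    size: int           # number of values per action
    low: float = -1.0   # lower bound of each value
    high: float = 1.0   # upper bound of each value


class DummyAgent:
    """Returns random actions within `spec`; it never learns anything."""

    def __init__(self, spec: ActionSpec, seed: Optional[int] = None):
        self.spec = spec
        self._rng = random.Random(seed)  # seeded so failures are reproducible
        self.steps = 0                   # counts transitions delivered by the runner

    def act(self, state) -> List[float]:
        # The state is ignored on purpose: the output depends only on the spec.
        return [
            self._rng.uniform(self.spec.low, self.spec.high)
            for _ in range(self.spec.size)
        ]

    def step(self, state, action, reward, done, next_state) -> None:
        # No learning; only record that a transition arrived.
        self.steps += 1
```

If a run fails with this agent plugged in, the problem is more likely in the environment or the life cycle management than in the learning code.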

Serialize buffers

Unify epsilon-greedy

What

Each agent has a built-in epsilon-greedy mechanism. There's likely little need to have the same code everywhere, so it should be moved to a single place.

Consideration

As it is right now, and as intended, each agent needs to touch the state (via agent.act) to provide an action. This touch can involve incrementing an internal counter or producing additional values, e.g. entropy. Only a touched state (the full data tuple) is used in step to learn something.
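One way the shared mechanism could look, as a sketch only. EpsilonGreedy, its parameters and the decay schedule are assumptions for illustration, not existing ai-traineree code. The helper takes an already computed greedy action, so the agent still touches the state in agent.act even when the random branch is taken.

```python
import random
from typing import Optional


class EpsilonGreedy:
    """Hypothetical shared helper: explore with probability epsilon, else exploit."""

    def __init__(self, epsilon: float = 1.0, decay: float = 0.995,
                 min_epsilon: float = 0.01, seed: Optional[int] = None):
        self.epsilon = epsilon
        self.decay = decay
        self.min_epsilon = min_epsilon
        self._rng = random.Random(seed)

    def select(self, greedy_action: int, n_actions: int) -> int:
        # With probability epsilon return a uniformly random action,
        # otherwise return the action the agent already computed.
        if self._rng.random() < self.epsilon:
            return self._rng.randrange(n_actions)
        return greedy_action

    def decay_step(self) -> None:
        # Multiplicative decay, never going below min_epsilon.
        self.epsilon = max(self.min_epsilon, self.epsilon * self.decay)
```

An agent would call select at the end of act and decay_step once per learning step, which keeps the exploration schedule in one place.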

What is the best way to reshape the observation before feeding it to the network?

Hi Dawid,

I am trying to build an example for Sneks. However, the default network (FcNet) only accepts a one-dimensional input shape. In the game, the observation shape is 16x16x3. I want to flatten the input before feeding the observation to the network, but I also want to keep the spatial information of the state. Therefore I am using ConvNet with a Flatten layer before feeding it to FcNet.

def network_fn(state_dim, output_dim, device):
    conv_net = ConvNet(state_dim, hidden_layers=(10,10), device=device)
    return NetChainer(
        net_classes=[
            conv_net,
            nn.Flatten(),
            FcNet(conv_net.output_size, output_dim, hidden_layers=(100, 100, 50), device=device),
        ]
    )

First, I want to ask: is it necessary to reshape the observation from 16x16x3 to 1x3x16x16 every time I compute the logits? If so, what is the best way to do this in ai-traineree? Creating another Net for this?

class ReshapeNet(NetworkType):
    def __init__(self, shape) -> None:
        super(ReshapeNet, self).__init__()
        self.shape = shape

    def forward(self, x):
        return torch.reshape(x, self.shape)

Thanks.
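A note on the reshape itself, as a sketch under assumptions: the environment returns channel-last observations (16x16x3 as stated above) and the convolutional layers expect channel-first input, as PyTorch's nn.Conv2d does. In that case a plain reshape produces the right shape but mixes values from different channels; the axes have to be reordered instead. The example uses NumPy on a tiny 2x2x3 array so the effect is easy to inspect; the analogous PyTorch calls are permute(2, 0, 1) followed by unsqueeze(0).

```python
import numpy as np

# Tiny stand-in for a 16x16x3 observation: height 2, width 2, 3 channels (H, W, C).
obs = np.arange(2 * 2 * 3).reshape(2, 2, 3)

# Reorder axes to (C, H, W), then add a leading batch axis: (1, C, H, W).
nchw = np.transpose(obs, (2, 0, 1))[np.newaxis, ...]

# A plain reshape gives the same shape but does not keep channels together.
reshaped = obs.reshape(1, 3, 2, 2)

print(nchw.shape)                                    # (1, 3, 2, 2)
print(np.array_equal(nchw[0, 0], obs[:, :, 0]))      # True: channel 0 preserved
print(np.array_equal(reshaped[0, 0], obs[:, :, 0]))  # False: channels mixed
```

So if a ReshapeNet-style layer is used, it would likely need to permute the axes rather than call torch.reshape.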

Curiosity in PPO

What

Add curiosity-driven exploration to PPO.

Why

It's been shown [citation needed] that Curiosity improves agents' performance on sparse reward environments.

Assertion failed for multi-agent environment

Hi,
I am working with the example code for training in a multi-agent environment. However, when I create each agent, the network expects a one-dimensional input, assert len(in_features) == 1 (see here for FcNet), but my observation space has 3 dimensions: DataSpace(dtype='int8', shape=(6, 7, 2), low=0, high=1). How can I make this work? Is this network not available for this environment? Should I flatten the input?

Thanks.
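If flattening is the route taken, a minimal sketch of the bookkeeping follows. flat_size and flatten are hypothetical helpers, not ai-traineree functions; they only show that a (6, 7, 2) observation becomes a vector of 84 features, which would satisfy a one-dimensional input check.

```python
import math
from typing import Any, List, Sequence


def flat_size(shape: Sequence[int]) -> int:
    """Number of features after flattening an observation of `shape`."""
    return math.prod(shape)


def flatten(obs: Any) -> List[Any]:
    """Recursively flatten a nested list observation into a flat list."""
    if not isinstance(obs, (list, tuple)):
        return [obs]
    out: List[Any] = []
    for item in obs:
        out.extend(flatten(item))
    return out


shape = (6, 7, 2)
# Nested-list observation of zeros with the shape from the issue.
obs = [[[0 for _ in range(shape[2])] for _ in range(shape[1])] for _ in range(shape[0])]

print(flat_size(shape))   # 84
print(len(flatten(obs)))  # 84
```

Flattening discards the board layout, so a convolutional front end may still be worth considering when spatial structure matters.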

Unify input data

What

Pass data to agents using a dictionary, or dataclass.

Why

Currently all agents expect (state, action, reward, done, next_state) tuple for stepping. This is likely all that's needed in most cases, however, in some there isn't a need for next_state or done, and in others we might want to provide additional values, e.g. entropy.
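A sketch of the dataclass option, assuming hypothetical names (Experience, extras, from_tuple) that are not taken from the ai-traineree code base. Fields that some agents do not need are optional, and additional values such as entropy go into a free-form dictionary.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Experience:
    """Hypothetical container for one transition passed to agent.step."""
    state: Any
    action: Any
    reward: float
    done: Optional[bool] = None       # not every agent needs it
    next_state: Optional[Any] = None  # not every agent needs it
    extras: Dict[str, Any] = field(default_factory=dict)  # e.g. entropy

    @classmethod
    def from_tuple(cls, data: tuple) -> "Experience":
        # Accepts the current (state, action, reward, done, next_state) order.
        state, action, reward, done, next_state = data
        return cls(state, action, reward, done, next_state)


exp = Experience(state=[0.1, 0.2], action=1, reward=0.5, extras={"entropy": 0.7})
legacy = Experience.from_tuple(([0.1], 0, 1.0, False, [0.2]))
```

The from_tuple constructor would let existing call sites keep working while agents migrate to the new argument.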
