Investigate why Pac-Man isn't learning with basic Q-learning (multiagent-rl issue, closed)

matheusportela commented on June 3, 2024

Investigate why Pac-Man isn't learning with basic Q-learning

from multiagent-rl.

Comments (5)

matheusportela commented on June 3, 2024

Apparently the results are inconsistent even in the cart-pole scenario: test games score much lower than learning games.

Some possible causes:

Wrong implementation of Q-learning
State space too large
Insufficient information in state
Misleading reward function
Improper learning rate and discount rate values
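
To check the first and last causes on this list, it can help to compare the implementation against a bare tabular update. The following is only a minimal sketch of the standard Q-learning rule, not the code in this repository; the function name `q_update`, the `(state, action)` key layout, and the action names are assumptions made for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9, done=False):
    """Apply one tabular Q-learning update in place and return the new value.

    q is a mapping from (state, action) to a value estimate (hypothetical layout).
    alpha is the learning rate, gamma the discount rate.
    """
    # Terminal transitions must not bootstrap from the next state.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: two updates on an empty table.
q = defaultdict(float)
actions = ["North", "South", "East", "West"]
first = q_update(q, (1, 1), "East", 10.0, (2, 1), actions, alpha=0.5, gamma=0.9)
second = q_update(q, (0, 1), "East", 0.0, (1, 1), actions, alpha=0.5, gamma=0.9)
```

Common mistakes this sketch makes easy to spot: bootstrapping from a terminal state, taking the maximum over the current state instead of the next one, and updating with the action that was selected after exploration rather than the one actually taken.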

Skalwalker commented on June 3, 2024

Could this be happening to ghost agents too?

matheusportela commented on June 3, 2024

@Skalwalker yes, it could, although my guess is that the state space is too large for the scenario where Pac-Man is alone on the board (which is the one I used to run this test). When testing with the cart-pole scenario, the agent only started learning something after I drastically reduced the state space to a couple hundred possible states.
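
One way to get a cart-pole state space of that size is to bucket each continuous observation into a few bins. The sketch below is an assumption about how such a reduction could look, not the discretization actually used here: the bin edges and the observation order (cart position, cart velocity, pole angle, pole angular velocity) are hypothetical.

```python
from bisect import bisect_right

# Hypothetical bin edges, one tuple per observation component.
BIN_EDGES = (
    (-0.8, 0.8),                    # cart position: 3 bins
    (-0.5, 0.5),                    # cart velocity: 3 bins
    (-0.1, -0.02, 0.0, 0.02, 0.1),  # pole angle in radians: 6 bins
    (-0.87, 0.87),                  # pole angular velocity: 3 bins
)


def discretize(observation):
    """Map a continuous observation to a tuple of bin indices."""
    return tuple(
        bisect_right(edges, value)
        for edges, value in zip(BIN_EDGES, observation)
    )


def state_count():
    """Number of distinct discrete states the bin edges can produce."""
    total = 1
    for edges in BIN_EDGES:
        total *= len(edges) + 1
    return total


example_state = discretize((0.0, 0.0, 0.01, 0.0))
total_states = state_count()  # 3 * 3 * 6 * 3
```

With these edges the table has 162 states, which is in the "couple hundred" range mentioned above.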

matheusportela commented on June 3, 2024

Small update on this task: Q-learning is working quite well with the cart-pole experiment. After about 500 simulations, the agent learns to control the inverted pendulum for ~10 seconds and, 500 simulations later, it takes minutes until the pole falls.

The Pac-Man scenario with simple Q-learning doesn't show the same progress though. I've tried to reduce the state space by generating states that incorporate only three aspects:

X coordinate
Y coordinate
Whether this is the first time the agent is visiting this cell
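
A state built from these three aspects could be encoded roughly as follows. This is a sketch under stated assumptions: `encode_state`, `state_space_size`, and the 20 x 11 board used in the example are hypothetical, and the `visited` set is assumed to be reset at the start of every game.

```python
def encode_state(position, visited):
    """Return (x, y, first_visit) and record the cell as visited.

    position is an (x, y) tuple; visited is a set of cells seen this game.
    """
    x, y = position
    first_visit = position not in visited
    visited.add(position)
    return (x, y, first_visit)


def state_space_size(width, height):
    """Upper bound on distinct states: every cell, visited or not."""
    return width * height * 2


# Hypothetical usage on one game.
visited = set()
s1 = encode_state((3, 4), visited)
s2 = encode_state((3, 4), visited)
upper_bound = state_space_size(20, 11)
```

Note that the flag only describes the current cell, so two games with very different remaining food can map to the same state; that may be one reason the reward looks noisy to the learner.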

I've just run 500 simulations. The agent did seem to make some progress after about 300 games, but then it suddenly went back to its usual position.

Based on this information, I'll review the reward function and run the Pac-Man simulation with different parameters. If that doesn't give better results, I'll try adding some ghost information to the state (aware that this might actually slow learning down, since the state space will grow).

matheusportela commented on June 3, 2024

One more thing to test: simply selecting behaviors instead of actions.
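
Selecting behaviors instead of actions could be sketched like this: the learner chooses among a few named behaviors, and each behavior translates into a primitive move. Everything here is hypothetical (the behavior names `eat` and `wander`, the function signatures, and the convention that "North" increases y); it only illustrates the idea.

```python
import random

MOVES = ["North", "South", "East", "West"]


def eat(position, food, rng):
    """Step one cell toward the nearest food pellet (Manhattan distance)."""
    if not food:
        return "Stop"
    x, y = position
    fx, fy = min(food, key=lambda f: abs(f[0] - x) + abs(f[1] - y))
    if fx != x:
        return "East" if fx > x else "West"
    if fy != y:
        return "North" if fy > y else "South"
    return "Stop"


def wander(position, food, rng):
    """Pick a random move, ignoring the board."""
    return rng.choice(MOVES)


# The learner's action set becomes the keys of this table.
BEHAVIORS = {"eat": eat, "wander": wander}


def behavior_to_action(name, position, food, rng):
    """Translate the behavior chosen by the learner into a primitive move."""
    return BEHAVIORS[name](position, food, rng)


rng = random.Random(0)
move = behavior_to_action("eat", (1, 1), [(3, 1), (1, 5)], rng)
```

The Q-table is then indexed by (state, behavior) pairs, so the learner has two choices per state instead of four or five, at the cost of only being as good as the hand-written behaviors.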
