Comments (5)
Apparently, even in the cart-pole scenario, the results aren't consistent: test games score much lower than learning games.
Some possible causes:
- Wrong implementation of Q-learning
- State space too large
- Insufficient information in state
- Misleading reward function
- Improper learning rate and discount rate values
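To help rule out the first and last causes, here is a minimal sketch of tabular Q-learning to compare against. It is not the code from this repository: the class name `QLearner`, its method names, and the default `alpha`, `gamma`, and `epsilon` values are all assumptions made for illustration. It also separates learning games (epsilon-greedy) from test games (purely greedy), since a mismatch between the two is one place where a score gap can come from.

```python
import random
from collections import defaultdict


class QLearner:
    """Tabular Q-learning sketch; names and defaults are illustrative only."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=None):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount rate
        self.epsilon = epsilon  # exploration probability while learning
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.rng = random.Random(seed)

    def best_action(self, state):
        # Greedy choice; ties resolve to the first action in self.actions.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def select_action(self, state, learning=True):
        # Explore only while learning; test games use the greedy policy.
        if learning and self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best_action(state)

    def update(self, state, action, reward, next_state, done=False):
        # Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
        # Terminal transitions have no future value.
        future = 0.0 if done else max(
            self.q[(next_state, a)] for a in self.actions
        )
        target = reward + self.gamma * future
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

Two things worth checking against a sketch like this: that the bootstrap term is dropped on terminal transitions, and that test games call the greedy path rather than the exploring one.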
from multiagent-rl.
Could this be happening to ghost agents too?
@Skalwalker yes, it could, although my guess is that the state space is too large for the scenario where Pac-Man is alone in the field (which I used to run this test). When testing with the cart-pole scenario, the agent only started learning something after I drastically reduced the state space to a couple hundred possible states.
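For reference, one way to shrink a continuous cart-pole observation to a couple hundred states is to bin each component. The sketch below is an assumption about how that could look, not the discretization actually used in the experiment: the bin edges in `BIN_EDGES` and the function names are hypothetical, and the observation is assumed to be (cart position, cart velocity, pole angle, pole angular velocity).

```python
import bisect

# Hypothetical bin edges for each observation component, in order:
# cart position, cart velocity, pole angle (radians), pole angular velocity.
BIN_EDGES = (
    (-0.8, 0.8),                    # 3 bins
    (-0.5, 0.5),                    # 3 bins
    (-0.1, -0.02, 0.0, 0.02, 0.1),  # 6 bins, finer near upright
    (-0.5, 0.5),                    # 3 bins
)


def discretize(observation):
    """Map a continuous observation to a tuple of bin indices."""
    return tuple(
        bisect.bisect_right(edges, value)
        for edges, value in zip(BIN_EDGES, observation)
    )


def state_count():
    """Number of distinct discrete states produced by discretize()."""
    total = 1
    for edges in BIN_EDGES:
        total *= len(edges) + 1
    return total
```

With these example edges the table has 3 × 3 × 6 × 3 = 162 states, which is in the "couple hundred" range mentioned above.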
Small update on this task: Q-learning is working quite well with the cart-pole experiment. After about 500 simulations, the agent learns to control the inverted pendulum for ~10 seconds and, 500 simulations later, it takes minutes until the pole falls.
The Pac-Man scenario with simple Q-learning doesn't show the same progress, though. I've tried to reduce the state space by generating states that incorporate only three aspects:
- X coordinate
- Y coordinate
- Whether this is the first time the agent is visiting this cell
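The three aspects listed above could be encoded roughly as follows. This is only a sketch under assumptions: the class name `VisitTrackingState` and its methods are hypothetical, and it assumes the set of visited cells is cleared at the start of every game.

```python
class VisitTrackingState:
    """Builds (x, y, first_visit) states; illustrative, not the repo's code."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.visited = set()  # cells seen so far in the current game

    def reset(self):
        # Assumed to be called at the start of every game.
        self.visited.clear()

    def observe(self, x, y):
        # The flag is True only the first time the agent enters the cell.
        first_visit = (x, y) not in self.visited
        self.visited.add((x, y))
        return (x, y, first_visit)

    def state_count(self):
        # Upper bound: every cell, visited for the first time or not.
        return self.width * self.height * 2
```

With this encoding the state space grows only linearly with the number of cells, so a 20 × 11 layout (an arbitrary example size) would give at most 440 states.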
I've just run 500 simulations and the agent did seem to make some progress after about 300 games, but then it suddenly went back to the usual position.
Based on this information, I'll review the reward function and run the Pac-Man simulation with different parameters. If the results don't improve, I'll try adding some ghost information to the state (aware that this might actually slow down learning, since the state space is going to grow).
One more thing to test: simply selecting behaviors instead of actions.
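A rough sketch of what selecting behaviors could mean: the learner picks among a few hand-written behaviors, and each behavior translates the current situation into a primitive move. Everything here is hypothetical, including the behavior names `seek_food` and `flee`, the direction strings, and the assumption that positions are `(x, y)` tuples with y increasing to the north.

```python
def _step(dx, dy):
    # Move along the axis with the larger offset; ties prefer horizontal.
    # A zero offset on both axes falls through to "West".
    if abs(dx) >= abs(dy):
        return "East" if dx > 0 else "West"
    return "North" if dy > 0 else "South"


def seek_food(position, food, ghost):
    """Step toward the given food cell."""
    return _step(food[0] - position[0], food[1] - position[1])


def flee(position, food, ghost):
    """Step directly away from the given ghost."""
    return _step(position[0] - ghost[0], position[1] - ghost[1])


# The learner's action set becomes these names instead of raw moves.
BEHAVIORS = {"seek_food": seek_food, "flee": flee}


def act(behavior_name, position, food, ghost):
    """Translate the chosen behavior into a primitive move."""
    return BEHAVIORS[behavior_name](position, food, ghost)
```

The Q-table would then be indexed by (state, behavior name), so the learner has two choices per state here instead of one per primitive move.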