Comments (4)
The first approach is faster on CPU in practice: it needs no second pass, and batching brings no performance gain there.
However, the second one can be faster for GPU training. I implemented the second approach here:
https://github.com/ikostrikov/pytorch-a2c-ppo-acktr
from pytorch-a3c.
Thanks, that seems reasonable. I am currently using a MacBook Pro with a 4-core i7 processor and no GPU, which is why I am experimenting with A3C. I will try to implement the first approach. Also, is it possible to use the PPO algorithm in this asynchronous setting? I haven't read much about PPO and just wonder whether it is possible. Do you think PyTorch is a good fit for implementing CPU-based reinforcement learning algorithms, or are there better options? I like how PyTorch works and how easy it is to use.
4 cores is fine for MuJoCo but not enough for Atari (unless you are ready to wait several days for results).
Synchronous methods are much easier to debug, and they usually perform just as well as asynchronous methods. See the OpenAI blog post about A2C/ACKTR, where they mention that.
As for frameworks, the difference between them is marginal nowadays in terms of the time you need to implement exactly the same thing. It is usually a matter of 1-2 hours either way between TensorFlow and PyTorch, depending on the specific task. But PyTorch tends to run slightly faster than TensorFlow, especially on CPU. So if performance is the major concern, I would recommend PyTorch.
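To make the synchronous setup a bit more concrete, here is a minimal sketch of stepping several environment copies in lockstep so that every time step yields one batch. `CoinFlipEnv`, `collect_batch`, and the constant policy in the usage note are hypothetical placeholders, not code from either repository; a real A2C agent would replace the per-observation policy call with a single batched forward pass.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: reward 1.0 when the action matches a coin flip."""

    def __init__(self, horizon=5, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # the observation is just the step counter

    def step(self, action):
        reward = 1.0 if action == self.rng.randint(0, 1) else 0.0
        self.t += 1
        done = self.t >= self.horizon
        return self.t, reward, done


def collect_batch(envs, policy, num_steps):
    """Step every environment once per time step and record the batched results."""
    obs = [env.reset() for env in envs]
    batch = []  # one (observations, actions, rewards, dones) tuple per time step
    for _ in range(num_steps):
        # In a real agent this would be one batched forward pass.
        actions = [policy(o) for o in obs]
        results = [env.step(a) for env, a in zip(envs, actions)]
        next_obs = [r[0] for r in results]
        rewards = [r[1] for r in results]
        dones = [r[2] for r in results]
        batch.append((obs, actions, rewards, dones))
        # Restart finished episodes so all environments stay in lockstep.
        obs = [env.reset() if d else o for env, o, d in zip(envs, next_obs, dones)]
    return batch
```

For example, `collect_batch([CoinFlipEnv(seed=i) for i in range(4)], lambda o: 0, 5)` returns five time steps, each holding four observations, actions, rewards, and done flags. Because everything runs in one process and in a fixed order, a run is easy to reproduce and inspect.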
Thanks for your time, it helps a lot. I will use neither Atari nor MuJoCo. I am trying to implement my own environment based on a discrete grid and experiment with it, so hopefully my processor is enough.
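As a rough illustration of what such a discrete grid environment could look like, here is a minimal sketch with a Gym-style `reset`/`step` interface. The class name `GridWorld`, the reward values, and the action encoding are all assumptions made for this example, not part of pytorch-a3c.

```python
class GridWorld:
    """Hypothetical discrete grid: start at the top-left, goal at the bottom-right."""

    # Action encoding (an assumption): 0 = up, 1 = right, 2 = down, 3 = left
    MOVES = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1)}

    def __init__(self, size=4, max_steps=50):
        self.size = size
        self.max_steps = max_steps
        self.goal = (size - 1, size - 1)
        self.pos = (0, 0)
        self.steps = 0

    def reset(self):
        """Put the agent back at the start and return the initial position."""
        self.pos = (0, 0)
        self.steps = 0
        return self.pos

    def step(self, action):
        """Apply one move; moves into a wall leave the agent where it is."""
        d_row, d_col = self.MOVES[action]
        row = min(max(self.pos[0] + d_row, 0), self.size - 1)
        col = min(max(self.pos[1] + d_col, 0), self.size - 1)
        self.pos = (row, col)
        self.steps += 1
        reached = self.pos == self.goal
        # Small step penalty so that shorter paths score higher.
        reward = 1.0 if reached else -0.01
        done = reached or self.steps >= self.max_steps
        return self.pos, reward, done
```

An environment this small is cheap to step, so a handful of CPU workers should be enough to experiment with it.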