
Comments (6)

ikostrikov avatar ikostrikov commented on August 28, 2024

Do you mean this commit?

pytorch/examples@5c41070#diff-054e0153cf9e86773f4d272224dd1d13

from pytorch-a3c.

edbeeching avatar edbeeching commented on August 28, 2024

Yes, but I see that you call the ensure_shared_grads function after backprop but before the optimizer step. Is this still required? To be honest, I am still a bit surprised (and impressed) that asynchronous updates are so straightforward in PyTorch. I am implementing an A3C RL agent and want to be sure that all processes are updating the global model. I was following the Hogwild example and your code, and I noticed this difference.
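For readers following along, here is a minimal single-process sketch of the ordering described above: backprop on a local model, hand the gradients to the shared model, then step an optimizer that owns the shared parameters. The helper `copy_grads_to_shared` is a hypothetical, simplified stand-in and not the repository's `ensure_shared_grads`; the tiny `nn.Linear` model, the loss, and the learning rate are assumptions chosen only to make the example runnable.

```python
import torch
import torch.nn as nn


def copy_grads_to_shared(local_model, shared_model):
    """Hypothetical helper: copy local gradients onto the shared parameters."""
    for local_p, shared_p in zip(local_model.parameters(),
                                 shared_model.parameters()):
        if local_p.grad is not None:
            shared_p.grad = local_p.grad.clone()


torch.manual_seed(0)

# In a multi-process setup, shared_model.share_memory() would be called before
# the workers start; it is omitted here because this sketch runs in one process.
shared_model = nn.Linear(4, 2)
local_model = nn.Linear(4, 2)

# The optimizer updates the shared parameters, not the local copy.
optimizer = torch.optim.SGD(shared_model.parameters(), lr=0.1)

# Sync the local copy with the shared weights before computing anything.
local_model.load_state_dict(shared_model.state_dict())
before = shared_model.weight.detach().clone()

# Placeholder loss; a real agent would use its policy and value losses here.
loss = local_model(torch.randn(8, 4)).pow(2).mean()

local_model.zero_grad()
loss.backward()                                  # 1. backprop on the local model
copy_grads_to_shared(local_model, shared_model)  # 2. hand gradients to the shared model
optimizer.step()                                 # 3. update the shared parameters

shared_changed = not torch.equal(before, shared_model.weight.detach())
local_unchanged = torch.equal(before, local_model.weight.detach())
```

In this sketch the shared weights move while the local copy stays at its old values until the next `load_state_dict` call. Without step 2, the shared parameters would have no gradients and the optimizer step would leave them untouched.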

Cheers


ikostrikov avatar ikostrikov commented on August 28, 2024

When I remove these lines, it stops working for me. Probably something else needs to change as well.


edbeeching avatar edbeeching commented on August 28, 2024

I think these lines in train are no longer required:
line 24: model = ActorCritic(env.observation_space.shape[0], env.action_space)
line 38: model.load_state_dict(shared_model.state_dict())

All instances of model would also need to be changed to shared_model.

At least that is how it appears to work for the Hogwild example.
It is tricky because this would mean that during an episode the model can potentially change its parameters due to updates from the other processes.

Perhaps the original approach is better, since the changes have both advantages and disadvantages (assuming they would work at all).

Sorry for the hassle!


ikostrikov avatar ikostrikov commented on August 28, 2024

These lines are important because we need to keep the parameters fixed during rollouts.
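A small sketch of what keeping the parameters fixed means in practice, assuming a toy `nn.Linear` in place of the actual ActorCritic model: a local copy loaded with `load_state_dict` is a snapshot, so an update to the shared model made by another worker (simulated here by adding a constant to the shared parameters) does not change the local model's outputs in the middle of a rollout.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

shared_model = nn.Linear(4, 2)   # toy stand-in for the shared model
local_model = nn.Linear(4, 2)    # toy stand-in for the per-worker model

# Snapshot the shared weights at the start of the rollout.
local_model.load_state_dict(shared_model.state_dict())

obs = torch.randn(1, 4)
with torch.no_grad():
    out_before = local_model(obs)

    # Simulate another worker updating the shared model mid-rollout.
    for p in shared_model.parameters():
        p.add_(0.5)

    out_local_after = local_model(obs)
    out_shared_after = shared_model(obs)

# The snapshot still gives the same output; the shared model does not.
snapshot_stable = torch.equal(out_before, out_local_after)
shared_moved = not torch.equal(out_before, out_shared_after)
```

Acting directly with `shared_model`, as in the Hogwild-style variant discussed earlier in the thread, would correspond to `out_shared_after`: the policy can shift between steps of the same rollout.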

Probably, I will have to spend more time trying to figure out what they changed.

In the meantime, I would highly recommend using A2C/PPO/ACKTR. In my experience, they work just as well (except in some cases).


edbeeching avatar edbeeching commented on August 28, 2024

OK, thank you. I will defer to your expertise and try A2C for the moment.
I have a more general question, if you have the time: why do you think it is better to keep the parameters the same during the rollout? Is it because of the GAE?

