Comments (9)

mwydmuch commented on July 17, 2024

Thanks a lot for such a quick response. So if another ATTACK is called while the previous ATTACK's effect has not yet been reflected, the new ATTACK is simply discarded and not put in some queue, right?

Yes, that's how the Doom game works. If you press ATTACK again before the animation of the previous one ends, it will be ignored.

I pass skip_frame for tics, which has a value of 5. This has the effect of applying the given action (let's say ATTACK) for the 1st tic, while for the remaining 4 tics no action is taken. Am I correct? Please correct me if I am wrong.

No, the action is applied for all 5 tics. It corresponds to the player pressing the same buttons for all 5 frames. So if you do ATTACK and set skip_frame to 30, you will notice that the ammo count is reduced by 2 in the next available state, meaning two shots were fired during that time, since one pistol shot takes ~14-15 frames.

So basically:

make_action([1, 0, 0], tics=30)

has the same effect as:

for _ in range(30):
    make_action([1, 0, 0], tics=1)

except that in the first case the screen buffer is rendered only once, while in the second case all 29 intermediate states are also rendered and made available to the agent, so the first option is more efficient.
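
To see this yourself, here is a minimal sketch (assuming the bundled basic.cfg scenario, with the buttons reordered so that [1, 0, 0] means ATTACK, and with AMMO2 holding the pistol ammo):

import os
import vizdoom as vzd

game = vzd.DoomGame()
game.load_config(os.path.join(vzd.scenarios_path, "basic.cfg"))
# Reorder the buttons so that [1, 0, 0] corresponds to ATTACK.
game.set_available_buttons(
    [vzd.Button.ATTACK, vzd.Button.MOVE_LEFT, vzd.Button.MOVE_RIGHT]
)
game.init()

game.new_episode()
ammo_before = game.get_game_variable(vzd.GameVariable.AMMO2)

# Hold ATTACK for 30 tics in a single call; only the state after all
# 30 tics have passed is rendered.
game.make_action([1, 0, 0], 30)

ammo_after = game.get_game_variable(vzd.GameVariable.AMMO2)
# One pistol shot takes ~14-15 tics, so two shots fit into 30 tics.
print("shots fired:", ammo_before - ammo_after)  # typically 2
game.close()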

There are some small exceptions to this rule: the following buttons are "pressed" only once, in the first tic:

  • SELECT_NEXT_WEAPON
  • SELECT_PREV_WEAPON
  • DROP_SELECTED_WEAPON
  • ACTIVATE_SELECTED_ITEM
  • SELECT_NEXT_ITEM
  • SELECT_PREV_ITEM
  • DROP_SELECTED_ITEM

This behaviour made more sense for some use cases when we were creating ViZDoom. However, these buttons are not used in any of the default scenarios.
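
For example, in a small sketch (a hypothetical button setup on top of basic.cfg), holding SELECT_NEXT_WEAPON for 10 tics should still advance the selected weapon by only one slot:

import os
import vizdoom as vzd

game = vzd.DoomGame()
game.load_config(os.path.join(vzd.scenarios_path, "basic.cfg"))
game.set_available_buttons([vzd.Button.SELECT_NEXT_WEAPON])
game.init()

game.new_episode()
before = game.get_game_variable(vzd.GameVariable.SELECTED_WEAPON)
# "Held" for 10 tics, but registered as a single press in the first tic.
game.make_action([1], 10)
after = game.get_game_variable(vzd.GameVariable.SELECTED_WEAPON)
print(before, "->", after)  # advances one slot, not ten
game.close()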

Also, thank you for your questions; I will try to improve that part of the documentation to make it more clear!

Acejoy commented on July 17, 2024

Yes, for doom_skill=1 to 4, I noticed this behaviour (simply sprinting towards the vest with no shooting), while the agent struggled at doom_skill=5.

mwydmuch commented on July 17, 2024

I'm closing this issue as I believe that, for the moment, I have answered all your questions and there is no issue with the library. Good luck with training your models!

mwydmuch commented on July 17, 2024

Hi @Acejoy! I haven't yet found time to check your code carefully. But I see that you are using skip_frame = 5, while the attack animation of the pistol (the weapon available in this scenario) is around 14-15 frames. So, if two attacks come close to each other, the second won't have any effect, as the animation of the previous attack is still playing. I think this is what I'm observing in the log you included.

Acejoy commented on July 17, 2024

Thanks a lot for such a quick response. So if another ATTACK is called while the previous ATTACK's effect has not yet been reflected, the new ATTACK is simply discarded and not put in some queue, right?

Also, I need a small clarification regarding the make_action() method (from the documentation):

make_action(self: vizdoom.DoomGame, action: object, tics: int = 1) → float

In the above snippet, I pass skip_frame for tics, which has a value of 5. This has the effect of applying the given action (let's say ATTACK) for the 1st tic, while for the remaining 4 tics no action is taken. Am I correct? Please correct me if I am wrong.

Thanks again.

Acejoy commented on July 17, 2024

So, for training an RL agent, would it make more sense to use tics=15 in general for the deadly_corridor scenario, because all actions other than ATTACK require fewer tics for their effect to be reflected? Otherwise, the reward for make_action wouldn't be representative of the action taken.

Also, can you point me to the page where such information is available (regarding how many tics each action requires)?

Also, do you have any tips for training an agent in this scenario? I tried stable_baselines3's algorithms along with curriculum learning (using the doom_skill variable in the cfg).

Thanks for the reply

mwydmuch commented on July 17, 2024

So, for training an RL agent, would it make more sense to use tics=15 in general for the deadly_corridor scenario, because all actions other than ATTACK require fewer tics for their effect to be reflected? Otherwise, the reward for make_action wouldn't be representative of the action taken.

No, when training the agent, we want to find a good tradeoff between being responsive to the environment and reducing the number of states to improve learning speed. Doom has 35 tics per 1 second of in-game time; with 15 tics, the agent only observes and changes its actions about 2 times per second, which is not very responsive. The effect of ATTACK taking more than one state is not a problem; a core idea of RL is learning the delayed consequences of actions. In ViZDoom, we recommend using a frame skip of 3 or 4; this already speeds up training a lot compared to learning with no frame skip and is responsive enough for most cases.
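
As an illustration, here is a minimal random-agent loop using the recommended setting (a sketch, assuming the bundled deadly_corridor.cfg):

import os
import random
import vizdoom as vzd

FRAME_SKIP = 4  # recommended tradeoff: responsive, yet ~4x fewer states

game = vzd.DoomGame()
game.load_config(os.path.join(vzd.scenarios_path, "deadly_corridor.cfg"))
game.init()

# One-hot actions over the scenario's available buttons.
n = game.get_available_buttons_size()
actions = [[int(i == j) for j in range(n)] for i in range(n)]

game.new_episode()
while not game.is_episode_finished():
    # The chosen buttons stay pressed for all FRAME_SKIP tics.
    reward = game.make_action(random.choice(actions), FRAME_SKIP)
game.close()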

Also, can you point me to the page where such information is available (regarding how many tics each action requires)?

We don't have such a page, sorry. Different weapons have different animation/attack speeds, etc. Movement is immediate. However, there is some acceleration/momentum physics in Doom, so keeping movement buttons pressed for longer makes the player move faster. All of these timings can be modified with the scripting language, so I think it doesn't make sense to document them.

Also, do you have any tips for training an agent in this scenario? I tried stable_baselines3's algorithms along with curriculum learning (using the doom_skill variable in the cfg).

Not really, as this is still a fairly simple scenario. All the algorithms from stable_baselines3 should have no problem with it. I just updated our stable_baselines3 example to work with the current version and the gymnasium wrapper:
https://github.com/Farama-Foundation/ViZDoom/blob/master/examples/python/learning_stable_baselines3.py

You can check it for deadly corridor by running: python learning_stable_baselines3.py --env VizdoomCorridor-v0. For PPO, it needs around 200k steps to get good scores. As you can see, the only trick used is downscaling the screen buffer to just 80x60 pixels. The frame skip used is equal to 4.
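
The core of that example boils down to something like the following sketch (assuming gymnasium, stable_baselines3, and opencv-python are installed, and that the gymnasium wrapper exposes the frame under the "screen" key of its dict observation):

import cv2
import gymnasium as gym
import numpy as np
from stable_baselines3 import PPO
from vizdoom import gymnasium_wrapper  # noqa: F401 -- registers Vizdoom* envs

class Downscale(gym.ObservationWrapper):
    # Resize the screen buffer to 80x60 pixels, the only "trick" used.
    def __init__(self, env, shape=(60, 80)):
        super().__init__(env)
        self.shape = shape
        self.observation_space = gym.spaces.Box(
            low=0, high=255, shape=(*shape, 3), dtype=np.uint8
        )

    def observation(self, obs):
        return cv2.resize(obs["screen"], (self.shape[1], self.shape[0]))

env = Downscale(gym.make("VizdoomCorridor-v0", frame_skip=4))
model = PPO("CnnPolicy", env, verbose=1)
model.learn(total_timesteps=200_000)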

Acejoy commented on July 17, 2024

I tried stable_baselines3's PPO along with curriculum learning, where I trained the agent with doom_skill=1 to 5. It didn't train well; it was able to reach the end by sprinting but didn't shoot the adversaries for some reason.

Anyways I will try again.

Thanks for the guidance.

mwydmuch commented on July 17, 2024

I think curriculum learning is not needed here, as it's a simple environment. At doom_skill levels below 5, enemies don't deal enough damage to kill the player before it gets the vest, so there is no need to kill them. This may actually slow down the discovery of the optimal policy for doom_skill=5.
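
If you still want to control the difficulty yourself, setting it with the raw DoomGame API is a one-liner (a sketch, again assuming the bundled deadly_corridor.cfg):

import os
import vizdoom as vzd

game = vzd.DoomGame()
game.load_config(os.path.join(vzd.scenarios_path, "deadly_corridor.cfg"))
game.set_doom_skill(5)  # difficulty ranges from 1 (easiest) to 5 (hardest)
game.init()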
