Comments (9)

mwydmuch commented on July 17, 2024

Thanks a lot for such a quick response. So if another ATTACK is called while the previous ATTACK's effect has not yet been reflected, the new ATTACK is simply discarded and not put in some queue, right?

Yes, that's how the Doom game works. If you press ATTACK again before the animation of the previous one ends, it will be ignored.

I pass skip_frame for tics, which has a value of 5. This has the effect of applying the given action (let's say ATTACK) for the 1st tic, while for the remaining 4 tics no action is taken. Am I correct? Please correct me if I am wrong.

No, the action is applied for all 5 tics. It corresponds to the player pressing the same buttons for all 5 frames. So if you do ATTACK and set skip_frame to 30, you will notice that the ammo count is reduced by 2 in the next available state, meaning two shots were fired during that time, since one pistol shot takes ~14-15 frames.

So basically:

make_action([1, 0, 0], tics=30)

has the same effect as:

for _ in range(30):
    make_action([1, 0, 0], tics=1)

except that in the first case the screen buffer is rendered only once, while in the second case all 29 intermediate states are also rendered and made available to the agent, so the first option is more efficient.
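
To see this yourself, here is a minimal sketch (assuming the bundled basic.cfg scenario, with the buttons reordered so that [1, 0, 0] means ATTACK, and with AMMO2 holding the pistol ammo):

import os
import vizdoom as vzd

game = vzd.DoomGame()
game.load_config(os.path.join(vzd.scenarios_path, "basic.cfg"))
# Reorder the buttons so that [1, 0, 0] corresponds to ATTACK.
game.set_available_buttons(
    [vzd.Button.ATTACK, vzd.Button.MOVE_LEFT, vzd.Button.MOVE_RIGHT]
)
game.init()

game.new_episode()
ammo_before = game.get_game_variable(vzd.GameVariable.AMMO2)

# Hold ATTACK for 30 tics in a single call; only the state after all
# 30 tics have passed is rendered.
game.make_action([1, 0, 0], 30)

ammo_after = game.get_game_variable(vzd.GameVariable.AMMO2)
# One pistol shot takes ~14-15 tics, so two shots fit into 30 tics.
print("shots fired:", ammo_before - ammo_after)  # typically 2
game.close()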

There are some small exceptions to this rule: the following buttons are "pressed" only once, in the first tic:

  • SELECT_NEXT_WEAPON
  • SELECT_PREV_WEAPON
  • DROP_SELECTED_WEAPON
  • ACTIVATE_SELECTED_ITEM
  • SELECT_NEXT_ITEM
  • SELECT_PREV_ITEM
  • DROP_SELECTED_ITEM

This behaviour made more sense for some use cases when we were creating ViZDoom. However, these buttons are not used in any of the default scenarios.
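
For example, in a small sketch (a hypothetical button setup on top of basic.cfg), holding SELECT_NEXT_WEAPON for 10 tics should still advance the selected weapon by only one slot:

import os
import vizdoom as vzd

game = vzd.DoomGame()
game.load_config(os.path.join(vzd.scenarios_path, "basic.cfg"))
game.set_available_buttons([vzd.Button.SELECT_NEXT_WEAPON])
game.init()

game.new_episode()
before = game.get_game_variable(vzd.GameVariable.SELECTED_WEAPON)
# "Held" for 10 tics, but registered as a single press in the first tic.
game.make_action([1], 10)
after = game.get_game_variable(vzd.GameVariable.SELECTED_WEAPON)
print(before, "->", after)  # advances one slot, not ten
game.close()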

Also, thank you for your questions; I will try to improve that part of the documentation to make it more clear!

Acejoy commented on July 17, 2024

Yes, for doom_skill=1 to 4, I noticed this behaviour (simply sprinting towards the vest with no shooting), while the agent struggled at doom_skill=5.

mwydmuch commented on July 17, 2024

I'm closing this issue as I believe that, for the moment, I have answered all your questions and there is no issue with the library. Good luck with training your models!

mwydmuch commented on July 17, 2024

Hi @Acejoy! I haven't yet found time to check your code carefully. But I see that you are using skip_frame = 5, while the attack animation of the pistol (the weapon available in this scenario) is around 14-15 frames. So, if two attacks come close to each other, the second won't have any effect, as the animation of the previous attack is still playing. I think this is what I'm observing in the log you included.

Acejoy commented on July 17, 2024

Thanks a lot for such a quick response. So if another ATTACK is called while the previous ATTACK's effect has not yet been reflected, the new ATTACK is simply discarded and not put in some queue, right?

Also, I need a small clarification regarding the make_action() method (from the documentation):

make_action(self: vizdoom.DoomGame, action: object, tics: int = 1) → float

In the above snippet, I pass skip_frame for tics, which has a value of 5. This has the effect of applying the given action (let's say ATTACK) for the 1st tic, while for the remaining 4 tics no action is taken. Am I correct? Please correct me if I am wrong.

Thanks again.

Acejoy commented on July 17, 2024

So, for training an RL agent, would it make more sense to use tics=15 in general for the deadly_corridor scenario, because all actions other than ATTACK require fewer tics for their effect to be reflected? Otherwise, the reward for make_action wouldn't be representative of the action taken.

Also, can you point me to the page where such information is available (regarding how many tics each action requires)?

Also, do you have any tips for training an agent in this scenario? I tried stable_baselines3's algorithms along with curriculum learning (using the doom_skill variable in the cfg).

Thanks for the reply

mwydmuch commented on July 17, 2024

So, for training an RL agent, would it make more sense to use tics=15 in general for the deadly_corridor scenario, because all actions other than ATTACK require fewer tics for their effect to be reflected? Otherwise, the reward for make_action wouldn't be representative of the action taken.

No, when training the agent, we want to find a good tradeoff between being responsive to the environment and reducing the number of states to improve learning speed. Doom has 35 tics per 1 second of in-game time; with 15 tics, the agent only observes and changes its actions about 2 times per second, which is not very responsive. The effect of ATTACK taking more than one state is not a problem; a core idea of RL is learning the delayed consequences of actions. In ViZDoom, we recommend using a frame skip of 3 or 4; this already speeds up training a lot compared to learning with no frame skip and is responsive enough for most cases.
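
As an illustration, here is a minimal random-agent loop using the recommended setting (a sketch, assuming the bundled deadly_corridor.cfg):

import os
import random
import vizdoom as vzd

FRAME_SKIP = 4  # recommended tradeoff: responsive, yet ~4x fewer states

game = vzd.DoomGame()
game.load_config(os.path.join(vzd.scenarios_path, "deadly_corridor.cfg"))
game.init()

# One-hot actions over the scenario's available buttons.
n = game.get_available_buttons_size()
actions = [[int(i == j) for j in range(n)] for i in range(n)]

game.new_episode()
while not game.is_episode_finished():
    # The chosen buttons stay pressed for all FRAME_SKIP tics.
    reward = game.make_action(random.choice(actions), FRAME_SKIP)
game.close()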

Also, can you point me to the page where such information is available (regarding how many tics each action requires)?

We don't have such a page, sorry. Different weapons have different animation/attack speeds, etc. Movement is immediate. However, there is some acceleration/momentum physics in Doom, so keeping movement buttons pressed for longer makes the player move faster. All of these timings can be modified with the scripting language, so I think it doesn't make sense to document them.

Also, do you have any tips for training an agent in this scenario? I tried stable_baselines3's algorithms along with curriculum learning (using the doom_skill variable in the cfg).

Not really, as this is still a fairly simple scenario. All the algorithms from stable_baselines3 should have no problem with it. I just updated our stable_baselines3 example to work with the current version and the gymnasium wrapper:
https://github.com/Farama-Foundation/ViZDoom/blob/master/examples/python/learning_stable_baselines3.py

You can check it for deadly corridor by running: python learning_stable_baselines3.py --env VizdoomCorridor-v0. For PPO, it needs around 200k steps to get good scores. As you can see, the only trick used is downscaling the screen buffer to just 80x60 pixels. The frame skip used is equal to 4.
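
The core of that example boils down to something like the following sketch (assuming gymnasium, stable_baselines3, and opencv-python are installed, and that the gymnasium wrapper exposes the frame under the "screen" key of its dict observation):

import cv2
import gymnasium as gym
import numpy as np
from stable_baselines3 import PPO
from vizdoom import gymnasium_wrapper  # noqa: F401 -- registers Vizdoom* envs

class Downscale(gym.ObservationWrapper):
    # Resize the screen buffer to 80x60 pixels, the only "trick" used.
    def __init__(self, env, shape=(60, 80)):
        super().__init__(env)
        self.shape = shape
        self.observation_space = gym.spaces.Box(
            low=0, high=255, shape=(*shape, 3), dtype=np.uint8
        )

    def observation(self, obs):
        return cv2.resize(obs["screen"], (self.shape[1], self.shape[0]))

env = Downscale(gym.make("VizdoomCorridor-v0", frame_skip=4))
model = PPO("CnnPolicy", env, verbose=1)
model.learn(total_timesteps=200_000)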

Acejoy commented on July 17, 2024

I tried stable_baselines3's PPO along with curriculum learning, where I trained the agent with doom_skill=1 to 5. It didn't train well; it was able to reach the end by sprinting but didn't shoot the adversaries for some reason.

Anyways I will try again.

Thanks for the guidance.

mwydmuch commented on July 17, 2024

I think curriculum learning is not needed here, as it's a simple environment. At doom_skill levels below 5, enemies don't deal enough damage to kill the player before it gets the vest, so there is no need to kill them. This may actually slow down the discovery of the optimal policy for doom_skill=5.
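
If you still want to control the difficulty yourself, setting it with the raw DoomGame API is a one-liner (a sketch, again assuming the bundled deadly_corridor.cfg):

import os
import vizdoom as vzd

game = vzd.DoomGame()
game.load_config(os.path.join(vzd.scenarios_path, "deadly_corridor.cfg"))
game.set_doom_skill(5)  # difficulty ranges from 1 (easiest) to 5 (hardest)
game.init()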
