Comments (14)

shuoye1000 commented on May 26, 2024

Thanks to your tireless guidance, I finally solved the problem. Thank you very much for your patience! I also found some other minor issues: the following appear to be missing from the code: Humanoid experiment data (parameters, expert data, etc.), Breakout expert data, and Space Invaders expert data. If you could upload these, it would help me reproduce your paper. If the files are too large, zipping them seems like a good option. You are a very charismatic author, good luck with your work!
from iq-learn.

Div99 commented on May 26, 2024

Hi, the link to the Atari datasets is fixed. For Humanoid, the expert data got accidentally deleted, and I will try to retrain a SAC agent if I can find time on the weekend to collect new expert trajectories.

Div99 commented on May 26, 2024

Hi, please set agent.learn_temp=False and it should work. In my experience, SAC-style temperature learning is not very stable with IQ-Learn and can prevent the method from converging. I pushed a change that disables temperature learning by default.
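For readers unfamiliar with what the flag toggles, here is a minimal sketch of a SAC-style temperature update, assuming the common loss J(alpha) = mean(alpha * (-log_prob - target_entropy)) optimized over log(alpha). The function name, the learn_temp argument, the learning rate, and the sample values are illustrative assumptions and are not taken from the IQ-Learn code.

```python
import math

def temperature_step(log_alpha, log_probs, target_entropy, lr=1e-3, learn_temp=True):
    """One gradient-descent step on log(alpha) for the loss
    J(alpha) = mean(alpha * (-log_prob - target_entropy))."""
    if not learn_temp:
        # Fixed temperature: alpha stays at its configured value.
        return log_alpha
    alpha = math.exp(log_alpha)
    # Positive gap: the policy's entropy estimate is above the target.
    entropy_gap = sum(-lp - target_entropy for lp in log_probs) / len(log_probs)
    grad = alpha * entropy_gap  # dJ/d(log_alpha)
    return log_alpha - lr * grad

# Hypothetical values: initial temperature 1e-2, two sampled action log-probs.
init = math.log(1e-2)
fixed = temperature_step(init, [-1.0, -2.0], target_entropy=-6.0, learn_temp=False)
learned = temperature_step(init, [-1.0, -2.0], target_entropy=-6.0)
```

With learning disabled the temperature never moves, so the entropy bonus stays constant throughout training. With learning enabled, alpha shrinks whenever the policy's entropy estimate exceeds the target and grows otherwise.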

Div99 commented on May 26, 2024

Here are some training results I ran yesterday with 1 and 10 expert demos, with the hyperparams in run_mujoco.sh with the temp learning disabled:

[W&B chart, 4/14/2022, 11:04:56 AM]

Div99 commented on May 26, 2024

Important update: I found that temperature learning can also work well if the SAC target_entropy parameter is set much lower than −dim(A), since IL settings don't need much exploration. For Half-Cheetah (dim(A) = 6), setting target_entropy=-24 works well. So empirically, 4 * -dim(A) could be a good value to try with IQ-Learn.
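The heuristic above is simple arithmetic, sketched here under the assumption that the action dimension is known ahead of time. The helper name and the scale argument are hypothetical and do not come from the repository; the value 6 is the action dimension of the standard Gym HalfCheetah environment.

```python
def scaled_target_entropy(action_dim, scale=1.0):
    """Return -scale * dim(A). scale=1 is the usual SAC default;
    scale=4 is the empirical suggestion from this thread."""
    return -scale * action_dim

HALF_CHEETAH_ACTION_DIM = 6  # standard Gym HalfCheetah action dimension

default_target = scaled_target_entropy(HALF_CHEETAH_ACTION_DIM)
il_target = scaled_target_entropy(HALF_CHEETAH_ACTION_DIM, scale=4.0)
print(default_target, il_target)
```

For other environments, the same call with that environment's action dimension gives a starting point; whether a factor of 4 carries over is an empirical question.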

Reward Curves:
[W&B chart, 4/14/2022, 5:09:19 PM]

shuoye1000 commented on May 26, 2024

Regarding your suggestion, I re-downloaded your code and re-ran the script. This is the command I used: python train_iq.py env=cheetah agent=sac expert.demos=10 method.loss=value method.regularize=True agent.actor_lr=3e-05 seed=0 agent.learn_temp=False
This is the parameter output from my console:
image
image
Sorry if this is a basic question: do I need to set the agent's obs_dim and action_dim myself?
And here are the training results I've gotten so far:
image
I would be very grateful for your guidance!

Div99 commented on May 26, 2024

Hi, the obs_dim and action_dim are set automatically by the code and you can ignore them. The default SAC agent temperature in the project repo was too high, and I pushed a fix setting it to 1e-2. For the HalfCheetah environment, temperature values of both 1e-2 and 1e-3 should work very well.

BepfCp commented on May 26, 2024

Hi, thanks for your work and patient guidance. Just a small question: why does your SAC on HalfCheetah-v2 only get 5000 points? In my own experience, SAC can reach at least 12000 points within 1M steps. See similar results from OpenAI's SpinningUp or the picture below.
image

liushunyu commented on May 26, 2024

> Hi, the link to the Atari datasets is fixed. For Humanoid, the expert data got accidentally deleted, and I will try to retrain a SAC agent if I can find time on the weekend to collect new expert trajectories.

Hi, I cannot find the Space Invaders expert data at the link.

Div99 commented on May 26, 2024

For the Half-Cheetah expert, we may have trained with a single critic instead of double critics, so performance is not as high. (I don't know exactly why we did it, but the baselines we compared against, like ValueDICE, had similar expert performance, so we likely didn't optimize too much.) Nevertheless, a better expert should readily translate to better imitation with IQ-Learn, as it easily saturates the current expert performance at the ~5000 level.
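To illustrate the single-critic versus double-critic distinction, here is a minimal sketch of the bootstrapped target used in clipped double-Q learning, where the target takes the minimum over the critics' estimates to reduce overestimation. This is not the code used to train the expert; the function name and numbers are hypothetical, and SAC's entropy term is omitted for brevity.

```python
def td_target(reward, done, gamma, next_q_values):
    """Bootstrapped target r + gamma * (1 - done) * min_i Q_i(s', a').
    Pass one estimate for a single critic, two for double critics."""
    next_q = min(next_q_values)
    return reward + gamma * (1.0 - done) * next_q

# Hypothetical estimates: the first critic is more optimistic than the second.
single = td_target(reward=1.0, done=0.0, gamma=0.99, next_q_values=[10.0])
double = td_target(reward=1.0, done=0.0, gamma=0.99, next_q_values=[10.0, 8.0])
```

With a single critic, any overestimate in the one Q-function feeds directly into the target, which is one commonly cited reason single-critic agents tend to plateau lower.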

Div99 commented on May 26, 2024

> Hi, I cannot find the Space Invaders expert data at the link.

Sorry for missing SpaceInvaders in the dataset release, I will add it to the GDrive.

liushunyu commented on May 26, 2024

> > Hi, I cannot find the Space Invaders expert data at the link.
>
> Sorry for missing SpaceInvaders in the dataset release, I will add it to the GDrive.

Thanks!!! By the way, can I run SQIL based on your code? I found "sqil.yaml" in "iq_learn/conf/method". How can I run it? Also, do you provide a GAIL implementation that uses the same expert data as yours?

Div99 commented on May 26, 2024

We used this repo for running GAIL: https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail. You can add our data-loading setup to it to use our provided expert demos. The SQIL code was removed from our repo in newer commits, but it is very simple to implement, and you can check an old version of the codebase.
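As a rough idea of why SQIL is simple to implement, here is a minimal sketch of its core step as described in the SQIL paper: sample a balanced batch of expert and agent transitions, then overwrite rewards with 1 for expert data and 0 for agent data before the usual soft Q-learning update. This is not the code that was removed from the repository; the function name and the (state, action, reward, next_state, done) tuple layout are assumptions.

```python
import random

def sqil_batch(expert_transitions, agent_transitions, batch_size, rng=random):
    """Build a half-expert, half-agent batch with SQIL's constant rewards.
    Each transition is a (state, action, reward, next_state, done) tuple."""
    half = batch_size // 2
    expert = rng.sample(expert_transitions, half)
    agent = rng.sample(agent_transitions, batch_size - half)
    # Original rewards are discarded: expert gets 1, agent gets 0.
    batch = [(s, a, 1.0, s2, d) for (s, a, _, s2, d) in expert]
    batch += [(s, a, 0.0, s2, d) for (s, a, _, s2, d) in agent]
    return batch

# Toy data: integer states, placeholder actions and rewards.
expert_data = [(i, 0, None, i + 1, False) for i in range(4)]
agent_data = [(i, 1, 5.0, i + 1, False) for i in range(4)]
batch = sqil_batch(expert_data, agent_data, batch_size=4, rng=random.Random(0))
```

The resulting batch can then be passed to any off-policy soft Q-learning or SAC update in place of a batch drawn from the environment reward.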

Div99 commented on May 26, 2024

I have added the expert datasets along with IQ-Learn results on Humanoid-v2. I also released a script to generate your own expert trajectories for new environments.
