jannerm / trajectory-transformer
Code for the paper "Offline Reinforcement Learning as One Big Sequence Modeling Problem"
Home Page: https://trajectory-transformer.github.io
License: MIT License
Dear Author,
I am very interested in your work and am trying to reproduce your experimental results.
Previously I nearly matched the score reported in the older version of your paper (approximately 70 on average across the 3 * 3 datasets). However, I noticed that you updated the paper on arXiv in November, and the score for TT (quantile) went up to 78.9 on average across the 3 * 3 datasets.
I also noticed that you list your beam search hyperparameters in Appendix E, where k_act is 20. These differ from your config file (config/offline.py), where the default k_act is None and cdf_act is 0.6. Did you change the hyperparameters to obtain the higher score? If so, could you please update the config file so that I can reproduce your results?
Thanks!
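For readers comparing the two settings, the sketch below illustrates one common reading of them: a k-style setting keeps the k most probable tokens, while a cdf-style setting keeps the most probable tokens until their cumulative probability reaches a threshold. The function names are hypothetical and the filtering code in the repository may differ in detail.

```python
def _renormalise(probs, keep):
    """Zero every entry whose index is not in `keep`, then rescale to sum to 1."""
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]


def top_k_filter(probs, k):
    """Keep the k most probable tokens (hypothetical analogue of k_act)."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return _renormalise(probs, set(order[:k]))


def cdf_filter(probs, threshold):
    """Keep the most probable tokens until their cumulative mass reaches
    `threshold` (hypothetical analogue of cdf_act)."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = set(), 0.0
    for i in order:
        keep.add(i)
        mass += probs[i]
        if mass >= threshold:
            break
    return _renormalise(probs, keep)
```

With a fixed k the number of candidates is constant, whereas a cumulative threshold keeps few candidates when the distribution is peaked and many when it is flat, so the two settings can produce different plans from the same model.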
Hi! I noticed one more thing that is not straightforward in the goal-conditioned version of GPT.
Here:
trajectory-transformer/trajectory/models/transformers.py
Lines 288 to 295 in e0b5f12
After you append the goal embeddings to the main sequence, you apply self.blocks twice. Is that intended? Shouldn't one pass be enough, since all embeddings will already carry the needed information about the goal through the attention mechanism?
While trying to run the provided script, I ran into the following problem.
Traceback (most recent call last):
File "scripts/train.py", line 6, in <module>
import trajectory.utils as utils
File "/home/rs/18CS91P06/Bill_payment/trajectory-transformer-master/trajectory/utils/__init__.py", line 1, in <module>
from .setup import Parser, watch
File "/home/rs/18CS91P06/Bill_payment/trajectory-transformer-master/trajectory/utils/setup.py", line 6, in <module>
from tap import Tap
File "/home/rs/18CS91P06/anaconda3/envs/trajectory/lib/python3.8/site-packages/tap.py", line 6, in <module>
from mc_bin_client import mc_bin_client, memcacheConstants as Constants
File "/home/rs/18CS91P06/anaconda3/envs/trajectory/lib/python3.8/site-packages/mc_bin_client/mc_bin_client.py", line 14, in <module>
from memcacheConstants import REQ_MAGIC_BYTE, RES_MAGIC_BYTE
ModuleNotFoundError: No module named 'memcacheConstants'
Hello, I'm interested in the AntMaze tasks and found that the released code does not cover them. Could you please provide the relevant code (including how to combine the Q-value, the concrete training/planning configs...) as well as the pretrained models? I would appreciate your help very much!
Hi, how can I reproduce the trajectory predictions in Figure 2 of the paper?
Does this long-term prediction also need discrete states and actions? I would appreciate more details about how the long-term prediction experiment was implemented.
When I run "python scripts/train.py --dataset halfcheetah-medium-v2", the following exception occurs: "gym.error.UnregisteredEnv: No registered env with id: halfcheetah-medium-v2". My gym and mujoco versions are the same as in environment.yml.
Hi, I keep getting an error from this line: from tap import Tap
What is this module? I can't seem to find the right package, and the import does not work.
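A `Tap` class importable as `from tap import Tap` is normally provided by the `typed-argument-parser` package on PyPI. The traceback in the earlier issue shows a `tap.py` that imports `mc_bin_client`, which suggests an unrelated package with the same module name was installed there. The sketch below, with a hypothetical helper name, checks whether the installed module actually exposes the expected class.

```python
import importlib


def module_provides(module_name, attribute):
    """Return True if `module_name` imports cleanly and defines `attribute`."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # A broken or unrelated package can fail with more than ImportError.
        return False
    return hasattr(module, attribute)
```

If `module_provides("tap", "Tap")` returns False in the project's environment, uninstalling the conflicting package and running `pip install typed-argument-parser` is a reasonable thing to try.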
Hi, I recently read your paper and it inspired me a lot; I think it is without doubt a good paper. However, I am confused about the version of the D4RL datasets used for the baselines you compare against. I notice that in "Appendix C Baseline performance sources", the results for BC, MOPO (by the way, I could not find MOPO in your experiments section) and MBOP are taken from their original papers, all of which use the D4RL-gym-v0 datasets.
Because the performance of CQL on D4RL-gym-v0 [1] differs greatly from that on D4RL-gym-v2 [2] on several datasets, I wonder whether the scores of the above baselines would also change greatly on D4RL-gym-v2, or whether you have evidence that they would not, since you compare these scores directly.
Dear author,
After installing and downloading the pretrained models and plans, I still have trouble running the command.
python scripts/train.py --dataset halfcheetah-medium-v2
(trajectory) qz@qz:~/trajectory-transformer$ python scripts/train.py --dataset halfcheetah-medium-v2
[ utils/setup ] Reading config: config.offline:halfcheetah_medium_v2
[ utils/setup ] Not using overrides | config: config.offline | dataset: halfcheetah_medium_v2
[ utils/setup ] Made savepath: logs/halfcheetah-medium-v2/gpt/azure
[ utils/setup ] Saved args to logs/halfcheetah-medium-v2/gpt/azure/args.json
Traceback (most recent call last):
File "/home/qz/anaconda3/envs/trajectory/lib/python3.6/site-packages/gym/envs/registration.py", line 121, in spec
return self.env_specs[id]
KeyError: 'halfcheetah-medium-v2'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "scripts/train.py", line 25, in <module>
env = datasets.load_environment(args.dataset)
File "/home/qz/trajectory-transformer/trajectory/datasets/d4rl.py", line 81, in load_environment
wrapped_env = gym.make(name)
File "/home/qz/anaconda3/envs/trajectory/lib/python3.6/site-packages/gym/envs/registration.py", line 145, in make
return registry.make(id, **kwargs)
File "/home/qz/anaconda3/envs/trajectory/lib/python3.6/site-packages/gym/envs/registration.py", line 89, in make
spec = self.spec(path)
File "/home/qz/anaconda3/envs/trajectory/lib/python3.6/site-packages/gym/envs/registration.py", line 131, in spec
raise error.UnregisteredEnv('No registered env with id: {}'.format(id))
gym.error.UnregisteredEnv: No registered env with id: halfcheetah-medium-v2
Thank you very much for your attention.
Thank you for such an interesting work.
I'm really interested in your work and am trying to understand your code,
but I wonder why the head network outputs "#vocabulary + 1" values.
Could you explain this to me?
Hi,
May I ask whether it would be possible to provide a JAX-based version of the code?
Best
Hi,
I encountered an error when creating the Conda environment. It seems that the version of dm-control (required by D4RL) is unavailable.
Is there any recommended version for the D4RL package? The following is the full error message.
Pip subprocess error:
Running command git clone -q https://github.com/JannerM/d4rl.git /tmp/pip-req-build-enrmm6ao
Running command git rev-parse -q --verify 'sha^d5719e2c6ef6ab3b1c678a846c02621abb8074a4'
Running command git fetch -q https://github.com/JannerM/d4rl.git d5719e2c6ef6ab3b1c678a846c02621abb8074a4
Running command git checkout -q d5719e2c6ef6ab3b1c678a846c02621abb8074a4
WARNING: Missing build requirements in pyproject.toml for mujoco-py==2.0.2.13 from https://files.pythonhosted.org/packages/2f/48/b108057c1a23c8da9f4cdc7a7c46ab7cec49c3563c0706d50f2527de6ba0/mujoco-py-2.0.2.13.tar.gz#sha256=d6ae66276b565af9063597fda70683a89c7356290f5ac3961b794ee90ec50eea (from -r /708HDD/hungyh/trajectory-transformer/condaenv.l4fl7x8g.requirements.txt (line 4)).
WARNING: The project does not specify a build backend, and pip cannot fall back to setuptools without 'wheel'.
Running command git clone -q git://github.com/deepmind/dm_control /tmp/pip-install-ehe12w5d/dm-control_d0b0cee6667746188485b8f85955e996
fatal: unable to connect to github.com:
github.com[0: 20.27.177.113]: errno=Connection timed out
WARNING: Discarding git+git://github.com/deepmind/dm_control@90f00e4e80af56abb9f905070d0c152845db5602#egg=dm_control. Command errored out with exit status 128: git clone -q git://github.com/deepmind/dm_control /tmp/pip-install-ehe12w5d/dm-control_d0b0cee6667746188485b8f85955e996 Check the logs for full command output.
Running command git clone -q git://github.com/aravindr93/mjrl /tmp/pip-install-ehe12w5d/mjrl_82435f9d000845a69223ca68a1c237e4
fatal: unable to connect to github.com:
github.com[0: 20.27.177.113]: errno=Connection timed out
WARNING: Discarding git+git://github.com/aravindr93/mjrl@master#egg=mjrl. Command errored out with exit status 128: git clone -q git://github.com/aravindr93/mjrl /tmp/pip-install-ehe12w5d/mjrl_82435f9d000845a69223ca68a1c237e4 Check the logs for full command output.
ERROR: Could not find a version that satisfies the requirement dm-control (unavailable) (from d4rl) (from versions: 0.0.286587932, 0.0.286955599, 0.0.288398964, 0.0.288483845, 0.0.295778102, 0.0.300771433, 0.0.312466143, 0.0.318037100, 0.0.318066097, 0.0.319497192, 0.0.322773188, 0.0.355168290, 0.0.364896371)
ERROR: No matching distribution found for dm-control (unavailable)
failed
CondaEnvException: Pip failed
Thanks!
Hello, I'm interested in the AntMaze tasks and noticed that pretrained models for AntMaze are currently not provided. Will you provide them in the future? Thank you!
Hello, may I ask about an error I hit while running plan.py? When make_renderer() is called, the following error is reported at the "self.viewer = mjc.MjRenderContextOffscreen(self.env.sim)" step:
Found 0 GPUs for rendering. Using device 0.
Device id outside of range of available devices.
Traceback (most recent call last):
File "scripts/plan.py", line 41, in <module>
renderer = utils.make_renderer(args)
File "/home/user/projects/trajectory-transformer/trajectory/utils/rendering.py", line 22, in make_renderer
return render_class(args.dataset, observation_dim=observation.size)
File "/home/user/projects/trajectory-transformer/trajectory/utils/rendering.py", line 96, in __init__
self.viewer = mjc.MjRenderContextOffscreen(self.env.sim,device_id=-1)
File "mjrendercontext.pyx", line 46, in mujoco_py.cymj.MjRenderContext.__init__
File "mjrendercontext.pyx", line 114, in mujoco_py.cymj.MjRenderContext._setup_opengl_context
File "opengl_context.pyx", line 130, in mujoco_py.cymj.OffscreenOpenGLContext.__init__
RuntimeError: Failed to initialize OpenGL.
How can I solve this problem?
Hi,
In the paper, you draw trajectories in the four-rooms environment in Figure 6. Since the observations in this environment are image-based, could you share the code you used to draw the trajectories?
Hi,
I'm having trouble understanding how you calculate expectation from probabilities and thresholds here.
trajectory-transformer/trajectory/utils/discretization.py
Lines 108 to 123 in c77076d
I understand that thresholds are quantiles calculated from the empirical distribution, but it's hard for me to grasp why you can get the expectation from the average of those two matrix multiplications.
Could you give me an explanation or point me to a reference?
Thanks.
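One way to read the average of two matrix products, assuming the thresholds are consecutive bin edges: multiplying the probabilities by the left edges and by the right edges gives the expected left and right edge, and by linearity their average equals the probabilities multiplied by the bin midpoints. That is the mean of a distribution that is uniform within each bin. The sketch below uses hypothetical names and a one-dimensional example; it is not the repository's code.

```python
import numpy as np


def expectation_from_bins(probs, edges):
    """probs: [batch, n_bins] bin probabilities; edges: [n_bins + 1] sorted edges."""
    left = probs @ edges[:-1]    # expected left edge of the sampled bin
    right = probs @ edges[1:]    # expected right edge of the sampled bin
    return (left + right) / 2.0


def expectation_from_midpoints(probs, edges):
    """Same quantity written directly in terms of bin midpoints."""
    midpoints = (edges[:-1] + edges[1:]) / 2.0
    return probs @ midpoints


edges = np.array([0.0, 1.0, 3.0, 7.0])   # three bins: [0,1], [1,3], [3,7]
probs = np.array([[0.2, 0.5, 0.3]])
print(expectation_from_bins(probs, edges))       # 0.2*0.5 + 0.5*2.0 + 0.3*5.0
print(expectation_from_midpoints(probs, edges))
```

Both functions give the same value because (p·l + p·r) / 2 = p·((l + r) / 2) for any probability vector p and edge vectors l and r.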
Hi!
I noticed that I can't get good results on the HalfCheetah environment with imitation learning (with plain beam search decoding by logprob), even after long training and without overfitting (but I can on Hopper). I also noticed that the paper presents only Hopper and Walker2d results in the imitation learning section.
Have you encountered the same difficulties, or did you not consider testing in this environment? If so, were there any particular reasons for this?
Do you have any recommendations or resources you could point me to for preparing a novel dataset for use in Trajectory Transformer?
Hi, I am running this project without Docker, just in VS Code.
However, when I run
python scripts/train.py --dataset halfcheetah-medium-v2
it fails with the output shown below.
How can I fix it?
Thank you so much.
python scripts/train.py --dataset halfcheetah-medium-v2
Warning: Mujoco-based envs failed to import. Set the environment variable D4RL_SUPPRESS_IMPORT_ERROR=1 to suppress this message.
No module named 'mjrl'
Warning: Flow failed to import. Set the environment variable D4RL_SUPPRESS_IMPORT_ERROR=1 to suppress this message.
No module named 'flow'
Warning: CARLA failed to import. Set the environment variable D4RL_SUPPRESS_IMPORT_ERROR=1 to suppress this message.
No module named 'carla'
pybullet build time: May 20 2022 19:44:17
[ utils/setup ] Reading config: config.offline:halfcheetah_medium_v2
[ utils/setup ] Not using overrides | config: config.offline | dataset: halfcheetah_medium_v2
[ utils/setup ] Saved args to logs/halfcheetah-medium-v2/gpt/azure/args.json
Traceback (most recent call last):
File "/home/**/software/anaconda3/envs/trajectory/lib/python3.8/site-packages/gym/envs/registration.py", line 121, in spec
return self.env_specs[id]
KeyError: 'halfcheetah-medium-v2'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "scripts/train.py", line 26, in <module>
env = datasets.load_environment(args.dataset)
File "/home/**/Desktop/TT/trajectory-transformer/trajectory/datasets/d4rl.py", line 84, in load_environment
wrapped_env = gym.make(name)
File "/home/**/software/anaconda3/envs/trajectory/lib/python3.8/site-packages/gym/envs/registration.py", line 145, in make
return registry.make(id, **kwargs)
File "/home/**/software/anaconda3/envs/trajectory/lib/python3.8/site-packages/gym/envs/registration.py", line 89, in make
spec = self.spec(path)
File "/home/**/software/anaconda3/envs/trajectory/lib/python3.8/site-packages/gym/envs/registration.py", line 131, in spec
raise error.UnregisteredEnv('No registered env with id: {}'.format(id))
gym.error.UnregisteredEnv: No registered env with id: halfcheetah-medium-v2
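The warnings at the top of this log ("Mujoco-based envs failed to import", "No module named 'mjrl'") suggest that D4RL could not import the modules that set up its MuJoCo environments, which would explain why the environment id is missing from the gym registry later on. As a rough diagnostic, the sketch below reports which imports fail and why. The helper name is hypothetical.

```python
import importlib


def import_report(module_names):
    """Map each module name to None if it imports, else to the error text."""
    report = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            report[name] = None
        except Exception as exc:
            # Keep the exception type so missing packages and build errors
            # can be told apart.
            report[name] = f"{type(exc).__name__}: {exc}"
    return report
```

Calling `import_report(["mujoco_py", "mjrl", "d4rl"])` in the affected environment should show whether a missing dependency such as `mjrl` is the first thing to fix before retrying the training command.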
Hi! First of all, thank you for such an interesting work!
I'm trying to figure out how trajectories are represented in this work. As far as I understand, after the transformer blocks we get a tensor of shape [batch, block_size, embedding_dim]. In a normal transformer we would just pass this to the head, for example nn.Linear(embedding_dim, vocab_size), and get logits for prediction.
Why wouldn't that work? What's the intuition behind the padding and reshape (and ein linear) that you do? It doesn't seem to be mentioned in the paper.
Also, what is the stop token? There seem to be no special cases for ending a sequence in beam plan. Is it just for done?
Thanks!
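One possible intuition, offered as an assumption rather than the authors' explanation: each position within a transition (a state dimension, an action dimension, the reward, and so on) has its own discretization, so it can be given its own output head. Padding the sequence to a multiple of the transition length and reshaping groups tokens by their position within a transition, and a single einsum then applies a different linear map per position. The NumPy sketch below illustrates that idea with hypothetical names and shapes.

```python
import numpy as np


def per_dimension_logits(hidden, weights):
    """hidden: [batch, seq_len, embed]; weights: [transition_dim, embed, n_out]."""
    batch, seq_len, embed = hidden.shape
    transition_dim, _, n_out = weights.shape
    # Pad so the sequence length is a multiple of transition_dim.
    n_pad = (-seq_len) % transition_dim
    padded = np.concatenate([hidden, np.zeros((batch, n_pad, embed))], axis=1)
    # Group tokens by transition: [batch * n_transitions, transition_dim, embed].
    grouped = padded.reshape(-1, transition_dim, embed)
    # Position d within every transition uses its own weight matrix weights[d].
    logits = np.einsum("nde,deo->ndo", grouped, weights)
    # Restore the sequence layout and drop the padded positions.
    logits = logits.reshape(batch, seq_len + n_pad, n_out)
    return logits[:, :seq_len]
```

Under this reading, token t is scored by head `t % transition_dim`, which a single shared `nn.Linear` would not do. The extra output asked about in the earlier issue (vocabulary + 1) could then hold one additional non-data token such as a stop token, though that is also an assumption here.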