conglu1997 / v-d4rl
Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations
License: MIT License
Dear developer,
Thanks for your work.
I followed your instructions exactly, but running the DrQ+BC evaluation fails with the following error:
Traceback (most recent call last):
File "drqbc/train.py", line 20, in <module>
import dmc
File "/vd4rl/drqbc/dmc.py", line 14, in <module>
from envs.distracting_control.suite import distracting_wrapper
ModuleNotFoundError: No module named 'envs'
Have you run into a similar problem before?
Many thanks,
Levi
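One possible cause, offered as an assumption rather than a confirmed diagnosis: the traceback shows drqbc/dmc.py importing from a top-level envs package, so Python must be able to find the directory that contains envs/ on its import path. A minimal sketch, assuming envs/ sits at the repository root (the helper names ensure_on_path and has_package are hypothetical):

```python
import sys
from pathlib import Path


def ensure_on_path(repo_root):
    """Put repo_root at the front of sys.path so `import envs` can resolve.

    Assumption: the `envs` package lives directly under repo_root.
    """
    root = str(Path(repo_root).resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


def has_package(repo_root, name="envs"):
    """Return True if repo_root contains a directory called `name`."""
    return (Path(repo_root) / name).is_dir()
```

Calling ensure_on_path(".") from the repository root before `import dmc`, or setting the PYTHONPATH environment variable to the repository root when launching drqbc/train.py, would have the same effect.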
Hello,
The new paper is so inspiring!!
When I ran this command:
python drqbc/train.py task_name=offline_walker_walk_random offline_dir=vd4rl_data/main/walker_walk/random/84px nstep=3 seed=0
I got the following error:
Error executing job with overrides: ['task_name=offline_walker_walk_random', 'offline_dir=vd4rl_data/main/walker_walk/random/84px', 'nstep=3', 'seed=0']
Traceback (most recent call last):
File "drqbc/train.py", line 315, in main
workspace.train_offline(cfg.dataset_dir)
File "/home/mgz/project/v-d4rl-main/drqbc/train.py", line 283, in train_offline
metrics = self.agent.update(self.replay_buffer, self.global_step)
File "/home/mgz/project/v-d4rl-main/drqbc/drqv2.py", line 263, in update
batch = next(replay_buffer)
File "/home/mgz/project/v-d4rl-main/drqbc/numpy_replay_buffer.py", line 92, in __next__
indices = np.random.choice(self.valid.nonzero()[0], size=self.batch_size)
AttributeError: 'EfficientReplayBuffer' object has no attribute 'valid'
I created the environment using drqbc/conda_env.yml without any other changes. How can I resolve this issue? Thanks!
Several task names are easy to confuse, so it would help to provide a complete list of all V-D4RL tasks, similar to this one: https://github.com/Farama-Foundation/d4rl/wiki/Tasks
Hi, I saw that you recently uploaded the dataset to the torchrl repository. It is great that I can access it easily through torch tensordict. However, I am writing to report an issue I encountered while attempting to download the main->humanoid_walk->medium-replay dataset from the V-D4RL benchmarks through torchrl.
(I installed torchrl and tensordict directly from their GitHub repos, since the pip releases do not seem to have been updated yet.)
(…)03b080197e0b44c08694cb699fff5ce6-501.npz: 100%|██████████████████████████████████| 2.30M/2.30M [00:20<00:00, 112kB/s]
file=/tmp/tmpdnm5926d/datasets--conglu--vd4rl/snapshots/6001dd3a96d44c22e2a6c5c8f937ba0f840c4d50/vd4rl/main/humanoid_wal
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[1], line 11
9 for pixel in [64, 84]:
10 print(f'task, type, pixel:, {task}, {type}, {pixel}')
---> 11 d = VD4RLExperienceReplay(f"main/{task}/{type}/{pixel}px", batch_size=4, image_size=50, download='force')
12 for batch in d:
13 print(batch)
File ~/anaconda3/envs/vd4rl/lib/python3.9/site-packages/torchrl/data/datasets/vd4rl.py:200, in VD4RLExperienceReplay.__init__(self, dataset_id, batch_size, root, download, sampler, writer, collate_fn, pin_memory, prefetch, transform, split_trajs, totensor, image_size, **env_kwargs)
198 except FileNotFoundError:
199 pass
--> 200 storage = self._download_and_preproc(dataset_id, data_path=self.data_path)
201 elif self.split_trajs and not os.path.exists(self.data_path):
202 storage = self._make_split()
File ~/anaconda3/envs/vd4rl/lib/python3.9/site-packages/torchrl/data/datasets/vd4rl.py:308, in VD4RLExperienceReplay._download_and_preproc(cls, dataset_id, data_path)
306 td_save = tdc[0]
307 tds.append(td)
--> 308 total_steps += td.shape[0]
310 # From this point, the local paths are non needed anymore
311 td_save = td_save.expand(total_steps).memmap_like(data_path, num_threads=32)
IndexError: tuple index out of range
Only the humanoid medium-replay dataset fails to download for me. I have verified that my setup and versions are compatible with the documentation, yet the problem persists. I suspect that some files (perhaps .npz files) are missing from the Hugging Face Hub or elsewhere.
Could you please look into this matter? Any guidance on resolving this error or confirming whether this might be a known issue with a potential workaround would be highly appreciated.
Thank you very much for your time and assistance.
Hi~
Are image observations for the Franka Kitchen dataset supported?
Thank you!
Hi, thank you for this great work! I noticed that the visual observations are generated by a proprioceptive SAC agent, but could not find the states corresponding to the images in the downloaded dataset; is it possible to acquire the proprioceptive states somewhere? I saw the behavior agent training script in the README, but it seems hard to deterministically reproduce the training / data collection process, as various sources of randomness are present. Thank you for your time!
Hi again! Thanks for forwarding this problem to the author of torchrl.
However, the author raised an issue suggesting that it may be a problem with the vd4rl dataset itself (perhaps a missing file).
Judging by the error message, it seems one more or one fewer piece of data (maybe an .npz file) is needed:
TensorDict(
fields={
action: MemoryMappedTensor(shape=torch.Size([500, 21]), device=cpu, dtype=torch.float32, is_shared=False),
discount: MemoryMappedTensor(shape=torch.Size([500]), device=cpu, dtype=torch.float64, is_shared=False),
image: MemoryMappedTensor(shape=torch.Size([500, 64, 64, 3]), device=cpu, dtype=torch.uint8, is_shared=False),
is_first: MemoryMappedTensor(shape=torch.Size([501]), device=cpu, dtype=torch.bool, is_shared=False),
is_last: MemoryMappedTensor(shape=torch.Size([501]), device=cpu, dtype=torch.bool, is_shared=False),
is_terminal: MemoryMappedTensor(shape=torch.Size([501]), device=cpu, dtype=torch.bool, is_shared=False),
reward: MemoryMappedTensor(shape=torch.Size([500]), device=cpu, dtype=torch.float64, is_shared=False)},
batch_size=torch.Size([]),
device=cpu,
is_shared=False)
I'm wondering whether a single file failed to upload somewhere.
Could you please look into this matter once more?
I appreciate your time.
I think the dataset would become much more powerful and easy to access if it fully supported torchrl.
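The printed TensorDict hints at why td.shape[0] fails: the fields do not share a leading dimension (500 for action, discount, image and reward; 501 for is_first, is_last and is_terminal), and the batch size is reported as torch.Size([]). Indexing an empty shape with [0] raises exactly the IndexError shown earlier. The sketch below reproduces that reasoning with plain tuples. The helper name common_leading_dim, and the idea that the batch size is derived from a shared leading dimension, are assumptions for illustration, not a description of torchrl internals.

```python
def common_leading_dim(shapes):
    """Return the shared first dimension as a 1-tuple, or () if fields disagree."""
    leading = {shape[0] for shape in shapes.values()}
    return (leading.pop(),) if len(leading) == 1 else ()


# Field shapes copied from the TensorDict printed above.
episode = {
    "action": (500, 21),
    "discount": (500,),
    "image": (500, 64, 64, 3),
    "is_first": (501,),
    "is_last": (501,),
    "is_terminal": (501,),
    "reward": (500,),
}

batch_size = common_leading_dim(episode)  # fields disagree, so this is ()

try:
    batch_size[0]
    error = None
except IndexError as exc:
    error = str(exc)  # same message as in the reported traceback

# Names of the fields that are one entry longer than the rest.
longer = sorted(k for k, s in episode.items() if s[0] == 501)
```

Running the same check over each downloaded episode could help locate the file whose fields are inconsistent.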
Hi, this might not be the right place, but I am wondering what hyperparameters you used to train the SAC agent (the data collection policy) for Humanoid Walk. The default hyperparameters reach expert-level performance within 1M steps for Walker Walk and Cheetah Run. I use the codebase mentioned in the README.
Thank you for your outstanding work. May I ask why the timelimit for DMC vision is 0 and not 1000?
Another thing I'd like to confirm: in each episode, the first timestep's "action" and "reward" are 0, right, because they are offset by one step from "image"?
Also, if I want to port the dataset-related code to PyTorch, how should I make sure it uses the full offline data? The current code appears to select sub-trajectories at random.
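If the convention described in the question holds (each index t stores the observation at t together with the action and reward that led to it, so index 0 carries zeros), then every transition in an episode can be enumerated deterministically instead of sampling sub-trajectories. A minimal sketch under that assumption, using plain lists in place of the arrays loaded from an episode file; the function name and the toy data are hypothetical:

```python
def episode_to_transitions(images, actions, rewards):
    """Pair each observation with the action/reward stored one index later.

    Assumes actions[0] and rewards[0] are padding for the first observation.
    """
    assert len(images) == len(actions) == len(rewards)
    return [
        (images[t], actions[t + 1], rewards[t + 1], images[t + 1])
        for t in range(len(images) - 1)
    ]


# Toy episode with three observations, hence two transitions.
images = ["o0", "o1", "o2"]
actions = [0.0, 0.5, -0.5]   # actions[0] is padding
rewards = [0.0, 1.0, 2.0]    # rewards[0] is padding

transitions = episode_to_transitions(images, actions, rewards)
```

Iterating this over every episode file yields each transition exactly once, which is one way to guarantee full coverage of the offline data.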
As the paper notes, the experimental results are averaged over six random seeds. May I ask how many eval_episodes were used for each method (DV2, CQL, et al.) in the evaluation phase? I found the visual-input setting (V-D4RL) to be more unstable than proprioceptive states (D4RL).
Hi, I am currently using the default parameter config (dmc_vision, dmc_walker_walk) to train offline DV2 on the mixed walker_walk dataset, but the eval return differs substantially from the result in the paper. Here are my questions; I hope to hear your response.
Here is a screenshot of my world model training loss curve (mixed dataset).
Here is my agent training loss curve, using the last checkpoint of the world model (mixed dataset).
Hi,
Upon reading the code, I think the n-step reward computation is wrong, i.e.
def gather_nstep_indices(self, indices):
    n_samples = indices.shape[0]
    all_gather_ranges = np.stack([np.arange(indices[i] - self.frame_stack, indices[i] + self.nstep)
                                  for i in range(n_samples)], axis=0) % self.buffer_size
    gather_ranges = all_gather_ranges[:, self.frame_stack:]  # bs x nstep
    obs_gather_ranges = all_gather_ranges[:, :self.frame_stack]
    nobs_gather_ranges = all_gather_ranges[:, -self.frame_stack:]
    all_rewards = self.rew[gather_ranges]
    # Could implement below operation as a matmul in pytorch for marginal additional speed improvement
    rew = np.sum(all_rewards * self.discount_vec, axis=1, keepdims=True)
If we take self.frame_stack=1 and self.nstep=1 as an example, and let's say indices[0] = 1, and supposing the experiences are written as (s, a, r, s', a', r', s''), then the sampled experience will be (s, a, r', s') instead of (s, a, r, s'). The fix would be:
gather_ranges = all_gather_ranges[:, self.frame_stack-1:-1] # bs x nstep
What do you think? Did I miss something?
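To make the example concrete, the slicing can be traced with plain Python lists standing in for the NumPy arrays in the snippet. The sketch takes no position on which variant is correct, since that depends on how rewards are aligned with observations in the buffer, which is an assumption either way. The function name nstep_ranges and the proposed_fix flag are hypothetical.

```python
def nstep_ranges(index, frame_stack, nstep, buffer_size, proposed_fix=False):
    """Return (obs, reward, next_obs) buffer positions for one sampled index."""
    all_range = [j % buffer_size for j in range(index - frame_stack, index + nstep)]
    if proposed_fix:
        rew = all_range[frame_stack - 1:-1]   # slice suggested in the issue
    else:
        rew = all_range[frame_stack:]         # slice in the current code
    obs = all_range[:frame_stack]
    next_obs = all_range[-frame_stack:]
    return obs, rew, next_obs


current = nstep_ranges(1, frame_stack=1, nstep=1, buffer_size=100)
proposed = nstep_ranges(1, frame_stack=1, nstep=1, buffer_size=100, proposed_fix=True)
# current  -> ([0], [1], [1]): reward read at the index of the next observation
# proposed -> ([0], [0], [1]): reward read at the index of the current observation
```

If the reward stored at index t is the one received on arriving at observation t, the current slice pairs (s, a, r, s') as intended; if it is the one received after acting from observation t, the proposed slice does.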
Hi, I have two questions.
Anyway, thank you for the nice codebase!
It seems that there are only 100 npz files on Hugging Face (while there are 400 npz files on Google Drive).
Could you update the Hugging Face repository?
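A quick way to compare the two sources is to count the .npz files in each local copy. A minimal sketch using only the standard library; the directory paths in the usage comment are placeholders, not actual download locations:

```python
from pathlib import Path


def count_npz(directory):
    """Count .npz files directly under `directory` and in its subdirectories."""
    return sum(1 for _ in Path(directory).rglob("*.npz"))


# Example usage with placeholder paths:
# print(count_npz("path/to/huggingface_copy"), count_npz("path/to/gdrive_copy"))
```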
Thanks for the lovely open-source benchmark. However, I noticed inconsistencies between the 64px and 84px datasets. Specifically, the number of transitions in the 64px dataset is smaller than in the 84px one. For instance, the cheetah_run_medium_replay dataset in 84px has 200k transitions (which matches the description in your paper), but the 64px version has only 50k. I'm wondering whether some of the data for the 64px version was not uploaded.
Hello again!
When I tried the given command:
python drqbc/train.py task_name=offline_${ENVNAME}_${TYPE} offline_dir=vd4rl_data/main/${ENV_NAME}/${TYPE}/84px nstep=3 seed=0
I got this error:
Could not override 'offline_dir'.
To append to your config use +offline_dir=vd4rl_data/main/walker_walk/expert/84px
Key 'offline_dir' is not in struct
full_key: offline_dir
object_type=dict
So I changed offline_dir to dataset_dir, and then it works. Is that right?
python drqbc/train.py task_name=offline_${ENVNAME}_${TYPE} dataset_dir=vd4rl_data/main/${ENV_NAME}/${TYPE}/84px nstep=3 seed=0
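The command as quoted mixes ${ENVNAME} and ${ENV_NAME}, so it expands as intended only if both shell variables are set. A small sketch that builds the argument list from a single environment name, using the dataset_dir key the poster reports as working; the helper name is hypothetical, and whether dataset_dir is the intended key is not confirmed here:

```python
def build_train_command(env_name, data_type, px=84, nstep=3, seed=0):
    """Assemble the drqbc/train.py invocation as an argument list."""
    return [
        "python", "drqbc/train.py",
        f"task_name=offline_{env_name}_{data_type}",
        f"dataset_dir=vd4rl_data/main/{env_name}/{data_type}/{px}px",
        f"nstep={nstep}",
        f"seed={seed}",
    ]


cmd = build_train_command("walker_walk", "expert")
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the repository root.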