
GA-DDPG

[website, paper]


Installation

git clone https://github.com/liruiw/GA-DDPG.git --recursive
  1. Setup: Ubuntu 16.04 or above, CUDA 10.0 or above, Python 2.7 / 3.6

    • (Required for Training) - Install OMG submodule and reuse conda environment.
    • (Docker) See OMG Docker for details.
    • (Demo) - Install GA-DDPG inside a new conda environment
      conda create --name gaddpg python=3.6.9
      conda activate gaddpg
      pip install -r requirements.txt
      
  2. Install PointNet++

  3. Download environment data: bash experiments/scripts/download_data.sh

Pretrained Model Demo

  1. Download pretrained models: bash experiments/scripts/download_model.sh
  2. Demo model test: bash experiments/scripts/test_demo.sh
Example 1 Example 2

Save Data and Offline Training

  1. Download example offline data: bash experiments/scripts/download_offline_data.sh. The .npz dataset (a saved replay buffer) can be found in data/offline_data and can be loaded for training (it contains several deprecated attributes); a small inspection sketch follows this list. The image version of the offline buffer can be found here.
  2. To free up extra GPUs for online rollouts, use the offline training script: bash ./experiments/scripts/train_offline.sh bc_aux_dagger.yaml BC
  3. To save a dataset: bash ./experiments/scripts/train_online_save_buffer.sh bc_save_data.yaml BC.
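A minimal sketch for inspecting the downloaded buffer, assuming only that it is a standard .npz archive; the filename follows the config's RL_SAVE_DATA_NAME ('data_50k.npz') and may differ in your copy.

import numpy as np

# Path and filename are assumptions based on the download script's layout.
data = np.load("data/offline_data/data_50k.npz", allow_pickle=True)
for key in data.files:
    print(key, data[key].shape, data[key].dtype)  # list each saved attribute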

Online Training and Testing

  1. We use ray for parallel rollout and training; a minimal sketch of the worker pattern follows this list. The training scripts might require adjustment for the local machine. See config.py for some notes.
  2. Train online: bash ./experiments/scripts/train_online_visdom.sh td3_critic_aux_policy_aux.yaml DDPG. Use visdom and tensorboard to monitor.
  3. Test on YCB objects: bash ./experiments/scripts/test_ycb.sh demo_model. Replace demo_model with trained models. Logs and videos will be saved to output_misc.
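As a rough illustration of the parallel rollout pattern in step 1, here is a minimal ray sketch; RolloutWorker and collect() are illustrative stand-ins, not this repo's trainer classes.

import ray

ray.init(num_cpus=8)

@ray.remote
class RolloutWorker:
    # Illustrative stand-in: each worker would construct its own env here.
    def __init__(self, seed):
        self.seed = seed

    def collect(self, steps):
        # Placeholder for env.step() rollouts; returns dummy transitions.
        return [(self.seed, t) for t in range(steps)]

workers = [RolloutWorker.remote(seed) for seed in range(4)]
batches = ray.get([w.collect.remote(10) for w in workers])
print(len(batches), "rollout batches collected")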

Note

  1. Check out core/test_realworld_ros_final.py for an example of real-world usage.
  2. Related works (OMG, ACRONYM, 6DGraspNet, 6DGraspNet-Pytorch, ContactGraspNet, Unseen-Clustering)
  3. To use the full ACRONYM dataset with ShapeNet meshes, please follow ACRONYM to download the meshes and grasps, and follow OMG-Planner to process and save them in /data. filter_shapenet.json can then be used for training.
  4. Please use the GitHub issue tracker to report bugs. For other questions please contact Lirui Wang.

File Structure

├── ...
├── GADDPG
│   ├── data 		# training data
│   │   ├── grasps 		# grasps from the ACRONYM dataset
│   │   ├── objects 		# object meshes, sdf, urdf, etc.
│   │   ├── robots 		# robot meshes, urdf, etc.
│   │   └── gaddpg_scenes 	# test scenes
│   ├── env 		# environment-related code
│   │   ├── panda_scene 		# environment and task
│   │   └── panda_gripper_hand_camera 		# Franka Panda with gripper and camera
│   ├── OMG 		# expert planner submodule
│   ├── experiments 		# experiment scripts
│   │   ├── config 		# hyperparameters for training, testing, and environment
│   │   ├── scripts 		# main running scripts
│   │   ├── model_spec 		# network architecture spec
│   │   ├── cfgs 		# experiment config and hyperparameters
│   │   └── object_index 		# object indexes
│   ├── core 		# agents and learning
│   │   ├── train_online 		# online training
│   │   ├── train_test_offline 	# testing and offline training
│   │   ├── network 		# network architecture
│   │   ├── test_realworld_ros_final 		# real-world script example
│   │   ├── agent 		# main agent code
│   │   ├── replay_memory 		# replay buffer
│   │   ├── trainer 	# ray-related training setup
│   │   └── ...
│   ├── output 		# trained models
│   ├── output_misc 	# logs and videos
│   └── ...
└── ...

Citation

If you find GA-DDPG useful in your research, please consider citing:

@inproceedings{wang2021goal,
author    = {Lirui Wang and Yu Xiang and Wei Yang and Arsalan Mousavian and Dieter Fox},
title     = {Goal-Auxiliary Actor-Critic for 6D Robotic Grasping with Point Clouds},
booktitle = {The Conference on Robot Learning (CoRL)},
year      = {2021}
}

License

GA-DDPG is licensed under the MIT License.


ga-ddpg's Issues

The question about running GA-DDPG in the real world

Dear Liruiw:
I have a question about running GA-DDPG in the real world.
You used the depth heuristic in test_realworld_ros_final.py, but I can't find the function self.graspnet.compute_grasps_score() that termination_heuristics() calls to compute the grasp quality. Can I use the depth_termination_heuristics(depth_img, mask_img) function in utils instead to decide whether the grasp is feasible?
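For context, a depth-based termination heuristic of this kind typically checks whether enough target pixels fall within the gripper's closing range on the wrist camera's depth image. The sketch below is a guess at that logic under made-up thresholds, not the repository's implementation.

import numpy as np

def depth_termination_sketch(depth_img, mask_img, near=0.01, far=0.06, min_pixels=50):
    # Count target-object pixels whose depth falls inside the gripper's
    # closing range; signal termination (close the gripper) once enough do.
    target_depth = depth_img[mask_img > 0]
    in_range = (target_depth > near) & (target_depth < far)
    return in_range.sum() >= min_pixels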

About online training

Dear Liruiw:

I have a question about online training.

According to README.md, ./experiments/scripts/train_online_visdom.sh should be used for online training, but I can't find this script, so I instead followed ./experiments/scripts/train_online.sh and used the command

PYTHONUNBUFFERED=True CUDA_VISIBLE_DEVICES=0,1,2 python -m core.train_online --save_model --config_file td3_critic_aux_policy_aux.yaml --policy DDPG --log --fix_output_time ddpg_model_233_300000_18 --seed 233 --max_epoch 300000

where the argument max_epoch is added in train_online.py to control the number of iterations, which is by default 80000 in ./experiments/cfgs/td3_critic_aux_policy_aux.yaml.

I trained several models with different max_epoch and seed values, but I cannot reach the same success rate as ./output/demo_model. The success rates of all the models are shown below.

demo_model
Avg. Performance: (Return: 0.941 +- 0.02778) (Success: 0.941 +- 0.02778)
+---------------------+---------+-----------+
| object name         |   count |   success |
|---------------------+---------+-----------|
| 003_cracker_box     |      30 |        22 |
| 004_sugar_box       |      30 |        27 |
| 005_tomato_soup_can |      30 |        27 |
| 006_mustard_bottle  |      30 |        30 |
| 010_potted_meat_can |      30 |        30 |
| 021_bleach_cleanser |      30 |        30 |
| 024_bowl            |      30 |        30 |
| 025_mug             |      30 |        29 |
| 061_foam_brick      |      30 |        29 |
+---------------------+---------+-----------+

seed=233, max_epoch=80000
Avg. Performance: (Return: 0.641 +- 0.01667) (Success: 0.641 +- 0.01667)
+---------------------+---------+-----------+
| object name         |   count |   success |
|---------------------+---------+-----------|
| 003_cracker_box     |      30 |        13 |
| 004_sugar_box       |      30 |        22 |
| 005_tomato_soup_can |      30 |        23 |
| 006_mustard_bottle  |      30 |        19 |
| 010_potted_meat_can |      30 |         8 |
| 021_bleach_cleanser |      30 |        19 |
| 024_bowl            |      30 |        30 |
| 025_mug             |      30 |        16 |
| 061_foam_brick      |      30 |        23 |
+---------------------+---------+-----------+

seed=987, max_epoch=300000
Avg. Performance: (Return: 0.822 +- 0.02222) (Success: 0.822 +- 0.02222)
+---------------------+---------+-----------+
| object name         |   count |   success |
|---------------------+---------+-----------|
| 003_cracker_box     |      30 |        26 |
| 004_sugar_box       |      30 |        25 |
| 005_tomato_soup_can |      30 |        24 |
| 006_mustard_bottle  |      30 |        29 |
| 010_potted_meat_can |      30 |        18 |
| 021_bleach_cleanser |      30 |        27 |
| 024_bowl            |      30 |        30 |
| 025_mug             |      30 |        21 |
| 061_foam_brick      |      30 |        22 |
+---------------------+---------+-----------+

Also, I tested the above models in handover-sim and observed distinctly different success rates in the hold setting:

+-----------------------------+----------------+
| model                       |   success rate |
|-----------------------------+----------------|
| demo_model                  | 77/144 = 0.535 |
| seed=233, max_epoch=80000   | 27/144 = 0.188 |
| seed=987, max_epoch=300000  | 15/144 = 0.104 |
+-----------------------------+----------------+

The training curve of actor_critic_loss is shown below:
[figure: actor_critic_loss training curve]

Could you please help me out? Thank you.

How to align ACRONYM scales

Dear liruiw:
Thanks for your awesome work! You mentioned that you used the ACRONYM dataset.
I notice that extra_shape.json contains many items for ShapeNet objects, like 'Bottle_dc0926ce09d6ce78eb8e919b102c6c08_0.0224658542.json'.
But I find that there is only 'Bottle_dc0926ce09d6ce78eb8e919b102c6c08_0.008781243490891297.h5' in the ACRONYM dataset; the scales are not aligned. How can I handle this problem? Should I scale the translation from 0.008781243490891297 to 0.0224658542 and leave the rotation unchanged for the ACRONYM dataset?
By the way, the npy files in data/grasps/simulated/ contain 100 grasps, while the h5 files in the ACRONYM dataset contain 2000 grasps. I wonder how the npy files are generated from the h5 files (randomly sampled?).
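As an aside, the trailing number in these filenames appears to encode the object scale, so the two sources can at least be matched by category and mesh hash. A small parsing sketch under that assumed naming convention:

def parse_grasp_filename(name):
    # Split '<category>_<mesh hash>_<scale>.<ext>' into its parts; assumes
    # the category name itself contains no underscore.
    stem = name.rsplit(".", 1)[0]
    category, mesh_hash, scale = stem.split("_")
    return category, mesh_hash, float(scale)

print(parse_grasp_filename(
    "Bottle_dc0926ce09d6ce78eb8e919b102c6c08_0.0224658542.json"))
# -> ('Bottle', 'dc0926ce09d6ce78eb8e919b102c6c08', 0.0224658542)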

Segmentation fault error

While running bash experiments/scripts/test_demo.sh

I'm getting a Segmentation fault error. Could you please suggest what might cause this error?

Below is the run output for reference:

(gaddpg) akashsingh@MSI:~/GA-DDPG$ bash experiments/scripts/test_demo.sh
+ set -e
+++ dirname -- experiments/scripts/test_demo.sh
++ cd experiments/scripts
++ pwd
+ DIR=/home/akashsingh/GA-DDPG/experiments/scripts
+ export PYTHONUNBUFFERED=True
+ PYTHONUNBUFFERED=True
+ LOG_NAME=agent
+ LOG=output//log.txt
+ MODEL_NAME=dummy
+ RUN_NUM=3
+ EPI_NUM=165
+ EPOCH=latest
++ date +%Y-%m-%d_%H-%M-%S
+ LOG=outputs/dummy/test_log.txt.2021-10-07_16-19-54
+ exec
++ tee -a outputs/dummy/test_log.txt.2021-10-07_16-19-54
tee: outputs/dummy/test_log.txt.2021-10-07_16-19-54: No such file or directory
+ echo Logging output to outputs/dummy/test_log.txt.2021-10-07_16-19-54
Logging output to outputs/dummy/test_log.txt.2021-10-07_16-19-54
+ python -m core.train_test_offline --expert --pretrained output/demo_model --test --render
    pybullet build time: May 8 2021 05:48:32
    Output will be saved to output/demo_model
    Using config:
    {'DATA_ROOT_DIR': 'data/scenes',
    'EPOCHS': 200,
    'EXPERIMENT_OBJ_INDEX_DIR': 'experiments/object_index',
    'IMG_SIZE': [112, 112],
    'LOG': True,
    'MODEL_SPEC_DIR': 'experiments/model_spec',
    'OBJECT_DATA_DIR': 'data/objects',
    'OFFLINE_BATCH_SIZE': 100,
    'OFFLINE_RL_MEMORY_SIZE': 100000,
    'ONPOLICY_MEMORY_SIZE': -1,
    'OUTPUT_DIR': 'output',
    'OUTPUT_MISC_DIR': 'output_misc',
    'RL_DATA_ROOT_DIR': 'data/scenes',
    'RL_IMG_SIZE': [112, 112],
    'RL_MAX_STEP': 20,
    'RL_MEMORY_SIZE': 2000000,
    'RL_MODEL_SPEC': 'output/demo_model/rl_pointnet_model_spec.yaml',
    'RL_SAVE_DATA_NAME': 'data_50k.npz',
    'RL_SAVE_DATA_ROOT_DIR': 'data',
    'RL_TEST_SCENE': 'data/gaddpg_scenes',
    'RL_TRAIN': {'DAGGER_MAX_STEP': 18,
    'DAGGER_MIN_STEP': 5,
    'DAGGER_RATIO': 0.5,
    'DART_MAX_STEP': 18,
    'DART_MIN_STEP': 5,
    'DART_RATIO': 0.5,
    'ENV_FAR': 0.5,
    'ENV_NEAR': 0.2,
    'ENV_RESET_TRIALS': 7,
    'EXPERT_INIT_MAX_STEP': 15,
    'EXPERT_INIT_MIN_STEP': 3,
    'RL': True,
    'SAVE_EPISODE_INTERVAL': 50,
    'accumulate_points': True,
    'action_noise': 0.01,
    'batch_size': 125,
    'bc_reward_flag': False,
    'buffer_full_size': -1,
    'buffer_start_idx': 0,
    'change_dynamics': False,
    'channel_num': 5,
    'clip_grad': 0.5,
    'concat_option': 'point_wise',
    'critic_aux': True,
    'critic_extra_latent': -1,
    'critic_goal': False,
    'dagger': True,
    'dart': True,
    'ddpg_coefficients': [0.5, 0.001, 1.0003, 1.0, 0.2],
    'domain_randomization': False,
    'env_name': 'PandaYCBEnv',
    'env_num_objs': 1,
    'expert_initial_state': False,
    'explore_cap': 1.0,
    'explore_ratio': 1.0,
    'explore_ratio_list': [0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7],
    'feature_input_dim': 512,
    'fill_data_step': 10,
    'fix_timestep_test': True,
    'gamma': 0.95,
    'goal_reward_flag': False,
    'head_lr': 0.0003,
    'hidden_size': 256,
    'index_file': 'experiments/object_index/extra_shape.json',
    'index_split': 'train',
    'init_distance_high': 0.45,
    'init_distance_low': 0.15,
    'load_buffer': False,
    'load_obj_num': 40,
    'load_scene_joint': False,
    'load_test_scene_new': False,
    'log': True,
    'lr': 0.0003,
    'lr_gamma': 0.5,
    'max_epoch': 150000,
    'max_num_pts': 20000,
    'mix_milestones': [4000,
    8000,
    16000,
    25000,
    35000,
    45000,
    65000,
    85000,
    100000,
    120000],
    'mix_policy_ratio_list': [0.1, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2],
    'mix_value_ratio_list': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    'new_scene': True,
    'noise_ratio_list': [3.0, 2.5, 2.0, 1.5, 1.2, 1.2, 1, 0.8, 0.5],
    'noise_type': 'uniform',
    'num_remotes': 8,
    'off_policy': True,
    'online_buffer_ratio': 0.7,
    'onpolicy': True,
    'overwrite_feat_milestone': [],
    'policy_aux': True,
    'policy_extra_latent': -1,
    'policy_goal': False,
    'policy_milestones': [20000, 40000, 60000, 80000],
    'policy_update_gap': 2,
    'pt_accumulate_ratio': 0.95,
    'refill_buffer': True,
    'reinit_factor': 3,
    'reinit_lr': 0.0001,
    'reinit_optim': False,
    'sa_channel_concat': True,
    'save_epoch': [3000,
    10000,
    20000,
    40000,
    80000,
    100000,
    140000,
    180000,
    200000],
    'self_supervision': False,
    'shared_feature': False,
    'shared_objects_across_worker': False,
    'target_update_interval': 3000,
    'tau': 0.0001,
    'train_feature': True,
    'train_goal_feature': False,
    'train_value_feature': True,
    'uniform_num_pts': 1024,
    'updates_per_step': 20,
    'use_action_limit': True,
    'use_expert_plan': False,
    'use_image': False,
    'use_point_state': True,
    'use_time': True,
    'value_lr': 0.0003,
    'value_lr_gamma': 0.5,
    'value_milestones': [20000, 40000, 60000, 80000],
    'value_model': True,
    'visdom': True},
    'RNG_SEED': 3,
    'ROBOT_DATA_DIR': 'data/robots',
    'ROOT_DIR': '/home/akashsingh/GA-DDPG/experiments/../',
    'SCRIPT_FOLDER': 'experiments/cfgs',
    'env_config': {'accumulate_points': True,
    'action_space': 'task6d',
    'change_dynamics': False,
    'data_type': 'RGBDM',
    'domain_randomization': False,
    'expert_step': 20,
    'height': 112,
    'img_resize': [112, 112],
    'initial_far': 0.5,
    'initial_near': 0.2,
    'numObjects': 1,
    'omg_config': {'allow_collision_point': 0,
    'base_obstacle_weight': 1.0,
    'clearance': 0.03,
    'dynamic_timestep': False,
    'extra_smooth_steps': 5,
    'goal_idx': -1,
    'ik_clearance': 0.07,
    'ik_parallel': False,
    'ik_seed_num': 13,
    'increment_iks': True,
    'ol_alg': 'Proj',
    'optim_steps': 1,
    'pre_terminate': True,
    'root_dir': '/home/akashsingh/GA-DDPG/experiments/../',
    'scene_file': '',
    'silent': True,
    'smoothness_base_weight': 3,
    'standoff_dist': 0.08,
    'target_epsilon': 0.06,
    'target_hand_filter_angle': 90,
    'target_obj_collision': 1,
    'terminate_smooth_loss': 3,
    'timesteps': 20,
    'traj_delta': 0.05,
    'traj_init': 'grasp',
    'traj_interpolate': 'linear',
    'traj_max_step': 26,
    'traj_min_step': 15,
    'use_expert_plan': False,
    'vis': False},
    'pt_accumulate_ratio': 0.95,
    'random_target': True,
    'regularize_pc_point_count': True,
    'uniform_num_pts': 1024,
    'use_hand_finger_point': True,
    'width': 112},
    'omg_config': {'allow_collision_point': 0,
    'base_obstacle_weight': 1.0,
    'clearance': 0.03,
    'dynamic_timestep': False,
    'extra_smooth_steps': 5,
    'goal_idx': -1,
    'ik_clearance': 0.07,
    'ik_parallel': False,
    'ik_seed_num': 13,
    'increment_iks': True,
    'ol_alg': 'Proj',
    'optim_steps': 1,
    'pre_terminate': True,
    'root_dir': '/home/akashsingh/GA-DDPG/experiments/../',
    'scene_file': '',
    'silent': True,
    'smoothness_base_weight': 3,
    'standoff_dist': 0.08,
    'target_epsilon': 0.06,
    'target_hand_filter_angle': 90,
    'target_obj_collision': 1,
    'terminate_smooth_loss': 3,
    'timesteps': 20,
    'traj_delta': 0.05,
    'traj_init': 'grasp',
    'traj_interpolate': 'linear',
    'traj_max_step': 26,
    'traj_min_step': 15,
    'use_expert_plan': False,
    'vis': False},
    'pretrained_time': '',
    'script_name': 'td3_critic_aux_policy_aux.yaml'}
    schedule: [8000, 16000, 30000, 50000, 70000, 90000]
    schedule: [8000, 16000, 30000, 50000, 70000, 90000]
    Output will be saved to output/demo_model
    video output: YCB_td3_critic_aux_policy_aux.yaml stat output: rollout_success.script_td3_critic_aux_policy_aux.yaml.txt
    load pretrained policy!!!!
    load pretrained critic!!!!
    load feat optim
    load pretrained feature!!!! from: output/demo_model/DDPG_state_feat_PandaYCBEnv_latest step :249921
    output_time: demo_model logdir: output/demo_model/PandaYCBEnv_DDPG
    argv[0]=--opengl2
    startThreads creating 1 threads.
    starting thread 0
    started thread 0
    argc=3
    argv[0] = --unused
    argv[1] = --opengl2
    argv[2] = --start_demo_name=Physics Server
    ExampleBrowserThreadFunc started
    X11 functions dynamically loaded using dlopen/dlsym OK!
    X11 functions dynamically loaded using dlopen/dlsym OK!
4
visual 0x21 selected

GL_VENDOR=Intel
GL_RENDERER=Intel(R) UHD Graphics 630
GL_VERSION=1.4 (4.6.0 - Build 30.0.100.9922)
GL_SHADING_LANGUAGE_VERSION=(null)
pthread_getconcurrency()=0
b3Printf: Selected demo: Physics Server
startThreads creating 1 threads.
starting thread 0
started thread 0
MotionThreadFunc thread started
experiments/scripts/test_demo.sh: line 19: 3313 Segmentation fault python -m core.train_test_offline --expert --pretrained output/demo_model --test --render

discrepancies between the paper and the code

Thanks for your awesome work! I have read your paper and your code, but I find that there are some discrepancies between the paper and the code:

  1. In the experiments/config.py file, the flag RL_TRAIN.dagger is set to False, indicating that this code does not employ the DAgger algorithm when explore = True. I am curious whether it is necessary to change this flag to True. Additionally, I would like to know if there are any other configurations that might differ from those outlined in the paper or the config.yaml file of the provided demo model (batch_size? buffer_size?).

  2. I find that the goal_reward_mask is needed for calculating the grasp_aux_loss for both the critic and the actor:
    self.policy_grasp_aux_loss = goal_pred_loss(self.aux_pred[self.goal_reward_mask, :7], self.target_grasp_batch[self.goal_reward_mask, :7] ) #agent.py L151
    However, I've observed that when the reward is greater than 0, the goal_reward_mask is set to 1:
    self.goal_reward_mask = torch.ones_like(self.time_batch).bool() * self.reward_mask #agent.py L246
    This implies that the code computes the grasp_aux_loss only when the reward is greater than 0 (the final step of a successful grasp); a toy sketch of this masking follows. Is that true?
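A toy reproduction of that masking behavior (tensor names mirror the quoted agent.py lines, but this is illustrative code; mse_loss stands in for goal_pred_loss):

import torch
import torch.nn.functional as F

# Toy batch: only transitions with reward > 0 (successful final steps)
# end up contributing to the grasp auxiliary loss.
reward_batch = torch.tensor([0.0, 1.0, 0.0, 1.0])
time_batch = torch.zeros(4)
aux_pred = torch.randn(4, 7)
target_grasp_batch = torch.randn(4, 7)

reward_mask = reward_batch > 0
goal_reward_mask = torch.ones_like(time_batch).bool() * reward_mask
loss = F.mse_loss(aux_pred[goal_reward_mask, :7],
                  target_grasp_batch[goal_reward_mask, :7])
print(goal_reward_mask)  # tensor([False,  True, False,  True])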

Thanks for your great work and patience!

The question about model training

Dear Liruiw:
I have a question about model training.
I would like to confirm that the idea of model training is to first train offline, using behavior cloning to learn the strategy of the OMG planner, and then train online, using the trained model to interact with the environment and fine-tune the policy with hindsight goals.

BC Image Data

Hi Lirui,

Thanks for publishing this wonderful work!

I'm interested in behavior cloning in this environment from images (not point clouds). It seems like the data from here does not include image observations. Is there a way to regenerate the expert demonstrations and record image observations?

Thanks,
Allan

How to obtain the RGBD image state during test process

How can I obtain the RGBD image state during the test process? When I run train_test_offline.py, I find that the image state is empty when loaded by the following Python code:

replay_memory.py:

data = np.load(
    os.path.join(data_dir, self.save_data_name),
    allow_pickle=True,
    mmap_mode="r",
)

data['image_state'] has shape (?, 1), not the expected (?, 4, 112, 112).

I further checked the data in 'GA-DDPG/data/offline_data/data_5w.npz':

[screenshot: contents of data_5w.npz]

I would like to know how to obtain the image_state during the test process.

filter_shapenet files

Hi Lirui,

Thanks again for this awesome work. Is there any chance that the full shapenet training data can be made available? I assume these are the objects listed in experiments/object_index/filter_shapenet.json.

I tried following the README instructions for processing the shapes data myself but have run into various issues when trying to use OMG-Planner's python -m real_world.process_shape. For example, I'm having a little difficulty getting meshlabserver installed and running properly on my machine.

missing `meshes/collision/link0.obj` file

Hello,
I think I got it all up and running; however, it seems that some files are missing.
Could you please look at the following errors from executing the 'test_demo' script?
Best regards,
Ronen.

(gaddpg) nir1tv@TV2ZOSD15:~/projects/GA-DDPG$ bash experiments/scripts/test_demo.sh
+ set -e
+++ dirname -- experiments/scripts/test_demo.sh
++ cd experiments/scripts
++ pwd
+ DIR=/home/nir1tv/projects/GA-DDPG/experiments/scripts
+ export PYTHONUNBUFFERED=True
+ PYTHONUNBUFFERED=True
+ LOG_NAME=agent
+ LOG=output//log.txt
+ MODEL_NAME=dummy
+ RUN_NUM=3
+ EPI_NUM=165
+ EPOCH=latest
++ date +%Y-%m-%d_%H-%M-%S
+ LOG=outputs/dummy/test_log.txt.2021-07-18_12-34-27
+ exec
++ tee -a outputs/dummy/test_log.txt.2021-07-18_12-34-27
tee: outputs/dummy/test_log.txt.2021-07-18_12-34-27: No such file or directory
+ echo Logging output to outputs/dummy/test_log.txt.2021-07-18_12-34-27
Logging output to outputs/dummy/test_log.txt.2021-07-18_12-34-27
+ python -m core.train_test_offline --expert --pretrained output/demo_model --test --render
pybullet build time: Jul 14 2021 10:11:29
Output will be saved to `output/demo_model`
Using config:
{'DATA_ROOT_DIR': 'data/scenes',
 'EPOCHS': 200,
 'EXPERIMENT_OBJ_INDEX_DIR': 'experiments/object_index',
 'IMG_SIZE': [112, 112],
 'LOG': True,
 'MODEL_SPEC_DIR': 'experiments/model_spec',
 'OBJECT_DATA_DIR': 'data/objects',
 'OFFLINE_BATCH_SIZE': 100,
 'OFFLINE_RL_MEMORY_SIZE': 100000,
 'ONPOLICY_MEMORY_SIZE': -1,
 'OUTPUT_DIR': 'output',
 'OUTPUT_MISC_DIR': 'output_misc',
 'RL_DATA_ROOT_DIR': 'data/scenes',
 'RL_IMG_SIZE': [112, 112],
 'RL_MAX_STEP': 20,
 'RL_MEMORY_SIZE': 2000000,
 'RL_MODEL_SPEC': 'output/demo_model/rl_pointnet_model_spec.yaml',
 'RL_SAVE_DATA_NAME': 'data_50k.npz',
 'RL_SAVE_DATA_ROOT_DIR': 'data',
 'RL_TEST_SCENE': 'data/gaddpg_scenes',
 'RL_TRAIN': {'DAGGER_MAX_STEP': 18,
              'DAGGER_MIN_STEP': 5,
              'DAGGER_RATIO': 0.5,
              'DART_MAX_STEP': 18,
              'DART_MIN_STEP': 5,
              'DART_RATIO': 0.5,
              'ENV_FAR': 0.5,
              'ENV_NEAR': 0.2,
              'ENV_RESET_TRIALS': 7,
              'EXPERT_INIT_MAX_STEP': 15,
              'EXPERT_INIT_MIN_STEP': 3,
              'RL': True,
              'SAVE_EPISODE_INTERVAL': 50,
              'accumulate_points': True,
              'action_noise': 0.01,
              'batch_size': 125,
              'bc_reward_flag': False,
              'buffer_full_size': -1,
              'buffer_start_idx': 0,
              'change_dynamics': False,
              'channel_num': 5,
              'clip_grad': 0.5,
              'concat_option': 'point_wise',
              'critic_aux': True,
              'critic_extra_latent': -1,
              'critic_goal': False,
              'dagger': True,
              'dart': True,
              'ddpg_coefficients': [0.5, 0.001, 1.0003, 1.0, 0.2],
              'domain_randomization': False,
              'env_name': 'PandaYCBEnv',
              'env_num_objs': 1,
              'expert_initial_state': False,
              'explore_cap': 1.0,
              'explore_ratio': 1.0,
              'explore_ratio_list': [0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7],
              'feature_input_dim': 512,
              'fill_data_step': 10,
              'fix_timestep_test': True,
              'gamma': 0.95,
              'goal_reward_flag': False,
              'head_lr': 0.0003,
              'hidden_size': 256,
              'index_file': 'experiments/object_index/extra_shape.json',
              'index_split': 'train',
              'init_distance_high': 0.45,
              'init_distance_low': 0.15,
              'load_buffer': False,
              'load_obj_num': 40,
              'load_scene_joint': False,
              'load_test_scene_new': False,
              'log': True,
              'lr': 0.0003,
              'lr_gamma': 0.5,
              'max_epoch': 150000,
              'max_num_pts': 20000,
              'mix_milestones': [4000,
                                 8000,
                                 16000,
                                 25000,
                                 35000,
                                 45000,
                                 65000,
                                 85000,
                                 100000,
                                 120000],
              'mix_policy_ratio_list': [0.1, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2],
              'mix_value_ratio_list': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
              'new_scene': True,
              'noise_ratio_list': [3.0, 2.5, 2.0, 1.5, 1.2, 1.2, 1, 0.8, 0.5],
              'noise_type': 'uniform',
              'num_remotes': 8,
              'off_policy': True,
              'online_buffer_ratio': 0.7,
              'onpolicy': True,
              'overwrite_feat_milestone': [],
              'policy_aux': True,
              'policy_extra_latent': -1,
              'policy_goal': False,
              'policy_milestones': [20000, 40000, 60000, 80000],
              'policy_update_gap': 2,
              'pt_accumulate_ratio': 0.95,
              'refill_buffer': True,
              'reinit_factor': 3,
              'reinit_lr': 0.0001,
              'reinit_optim': False,
              'sa_channel_concat': True,
              'save_epoch': [3000,
                             10000,
                             20000,
                             40000,
                             80000,
                             100000,
                             140000,
                             180000,
                             200000],
              'self_supervision': False,
              'shared_feature': False,
              'shared_objects_across_worker': False,
              'target_update_interval': 3000,
              'tau': 0.0001,
              'train_feature': True,
              'train_goal_feature': False,
              'train_value_feature': True,
              'uniform_num_pts': 1024,
              'updates_per_step': 20,
              'use_action_limit': True,
              'use_expert_plan': False,
              'use_image': False,
              'use_point_state': True,
              'use_time': True,
              'value_lr': 0.0003,
              'value_lr_gamma': 0.5,
              'value_milestones': [20000, 40000, 60000, 80000],
              'value_model': True,
              'visdom': True},
 'RNG_SEED': 3,
 'ROBOT_DATA_DIR': 'data/robots',
 'ROOT_DIR': '/home/nir1tv/projects/GA-DDPG/experiments/../',
 'SCRIPT_FOLDER': 'experiments/cfgs',
 'env_config': {'accumulate_points': True,
                'action_space': 'task6d',
                'change_dynamics': False,
                'data_type': 'RGBDM',
                'domain_randomization': False,
                'expert_step': 20,
                'height': 112,
                'img_resize': [112, 112],
                'initial_far': 0.5,
                'initial_near': 0.2,
                'numObjects': 1,
                'omg_config': {'allow_collision_point': 0,
                               'base_obstacle_weight': 1.0,
                               'clearance': 0.03,
                               'dynamic_timestep': False,
                               'extra_smooth_steps': 5,
                               'goal_idx': -1,
                               'ik_clearance': 0.07,
                               'ik_parallel': False,
                               'ik_seed_num': 13,
                               'increment_iks': True,
                               'ol_alg': 'Proj',
                               'optim_steps': 1,
                               'pre_terminate': True,
                               'root_dir': '/home/nir1tv/projects/GA-DDPG/experiments/../',
                               'scene_file': '',
                               'silent': True,
                               'smoothness_base_weight': 3,
                               'standoff_dist': 0.08,
                               'target_epsilon': 0.06,
                               'target_hand_filter_angle': 90,
                               'target_obj_collision': 1,
                               'terminate_smooth_loss': 3,
                               'timesteps': 20,
                               'traj_delta': 0.05,
                               'traj_init': 'grasp',
                               'traj_interpolate': 'linear',
                               'traj_max_step': 26,
                               'traj_min_step': 15,
                               'use_expert_plan': False,
                               'vis': False},
                'pt_accumulate_ratio': 0.95,
                'random_target': True,
                'regularize_pc_point_count': True,
                'uniform_num_pts': 1024,
                'use_hand_finger_point': True,
                'width': 112},
 'omg_config': {'allow_collision_point': 0,
                'base_obstacle_weight': 1.0,
                'clearance': 0.03,
                'dynamic_timestep': False,
                'extra_smooth_steps': 5,
                'goal_idx': -1,
                'ik_clearance': 0.07,
                'ik_parallel': False,
                'ik_seed_num': 13,
                'increment_iks': True,
                'ol_alg': 'Proj',
                'optim_steps': 1,
                'pre_terminate': True,
                'root_dir': '/home/nir1tv/projects/GA-DDPG/experiments/../',
                'scene_file': '',
                'silent': True,
                'smoothness_base_weight': 3,
                'standoff_dist': 0.08,
                'target_epsilon': 0.06,
                'target_hand_filter_angle': 90,
                'target_obj_collision': 1,
                'terminate_smooth_loss': 3,
                'timesteps': 20,
                'traj_delta': 0.05,
                'traj_init': 'grasp',
                'traj_interpolate': 'linear',
                'traj_max_step': 26,
                'traj_min_step': 15,
                'use_expert_plan': False,
                'vis': False},
 'pretrained_time': '',
 'script_name': 'td3_critic_aux_policy_aux.yaml'}
Let's use 2 GPUs!
schedule: [8000, 16000, 30000, 50000, 70000, 90000]
schedule: [8000, 16000, 30000, 50000, 70000, 90000]
Output will be saved to `output/demo_model`
video output: YCB_td3_critic_aux_policy_aux.yaml stat output: rollout_success.script_td3_critic_aux_policy_aux.yaml.txt
load pretrained policy!!!!
load pretrained critic!!!!
load feat optim
load pretrained feature!!!! from: output/demo_model/DDPG_state_feat_PandaYCBEnv_latest step :249921
output_time: demo_model logdir: output/demo_model/PandaYCBEnv_DDPG
startThreads creating 1 threads.
starting thread 0
started thread 0 
argc=2
argv[0] = --unused
argv[1] = --start_demo_name=Physics Server
ExampleBrowserThreadFunc started
X11 functions dynamically loaded using dlopen/dlsym OK!
X11 functions dynamically loaded using dlopen/dlsym OK!
Creating context
Created GL 3.3 context
Direct GLX rendering context obtained
Making context current
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=NVIDIA GeForce RTX 2080 Ti/PCIe/SSE2
GL_VERSION=3.3.0 NVIDIA 470.42.01
GL_SHADING_LANGUAGE_VERSION=3.30 NVIDIA via Cg compiler
pthread_getconcurrency()=0
Version = 3.3.0 NVIDIA 470.42.01
Vendor = NVIDIA Corporation
Renderer = NVIDIA GeForce RTX 2080 Ti/PCIe/SSE2
b3Printf: Selected demo: Physics Server
startThreads creating 1 threads.
starting thread 0
started thread 0 
MotionThreadFunc thread started
numActiveThreads = 0
stopping threads
destroy semaphore
semaphore destroyed
Thread with taskId 0 exiting
Thread TERMINATED
destroy main semaphore
main semaphore destroyed
finished
numActiveThreads = 0
btShutDownExampleBrowser stopping threads
destroy semaphore
semaphore destroyed
Thread with taskId 0 exiting
Thread TERMINATED
destroy main semaphore
main semaphore destroyed
startThreads creating 1 threads.
starting thread 0
started thread 0 
argc=2
argv[0] = --unused
argv[1] = --start_demo_name=Physics Server
ExampleBrowserThreadFunc started
X11 functions dynamically loaded using dlopen/dlsym OK!
X11 functions dynamically loaded using dlopen/dlsym OK!
Creating context
Created GL 3.3 context
Direct GLX rendering context obtained
Making context current
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=NVIDIA GeForce RTX 2080 Ti/PCIe/SSE2
GL_VERSION=3.3.0 NVIDIA 470.42.01
GL_SHADING_LANGUAGE_VERSION=3.30 NVIDIA via Cg compiler
pthread_getconcurrency()=0
Version = 3.3.0 NVIDIA 470.42.01
Vendor = NVIDIA Corporation
Renderer = NVIDIA GeForce RTX 2080 Ti/PCIe/SSE2
b3Printf: Selected demo: Physics Server
startThreads creating 1 threads.
starting thread 0
started thread 0 
MotionThreadFunc thread started
ven = NVIDIA Corporation
ven = NVIDIA Corporation
/home/nir1tv/projects/GA-DDPG/env/models
b3Printf: b3Warning[examples/Importers/ImportURDFDemo/UrdfFindMeshFile.h,102]:

b3Printf: /home/nir1tv/projects/GA-DDPG/env/models/panda/panda_gripper_hand_camera.urdf:8: cannot find 'meshes/collision/link0.obj' in any directory in urdf path

Warning: b3Error[examples/Importers/ImportURDFDemo/BulletUrdfImporter.cpp,121]:

Warning: Could not parse visual element for Link:
Warning: b3Error[examples/Importers/ImportURDFDemo/BulletUrdfImporter.cpp,121]:

Warning: panda_link0
Warning: b3Error[examples/Importers/ImportURDFDemo/BulletUrdfImporter.cpp,121]:

Warning: failed to parse link
Traceback (most recent call last):
  File "/home/nir1tv/miniconda3/envs/gaddpg/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/nir1tv/miniconda3/envs/gaddpg/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/nir1tv/projects/GA-DDPG/core/train_test_offline.py", line 385, in <module>
    state = env.reset(  save=False, data_root_dir=cfg.DATA_ROOT_DIR,  enforce_face_target=True)
  File "/home/nir1tv/projects/GA-DDPG/env/panda_scene.py", line 258, in reset
    self._panda = Panda(stepsize=self._timeStep, base_shift=self._shift)
  File "/home/nir1tv/projects/GA-DDPG/env/panda_gripper_hand_camera.py", line 31, in __init__
    flags=p.URDF_USE_SELF_COLLISION)
pybullet.error: Cannot load URDF file.

About Hindsight Goals For Fine-tuning on Unknown Objects

Hi Liruiw:
I noticed that in your paper, you used hindsight goals for fine-tuning on unknown objects, e.g., YCB objects, and improved the success rate from 88.2% to 93.5% on the YCB testing dataset. I'm quite interested in this method, but I didn't find its implementation in this repository. Could you please show me where it is? Thanks!

core.env_planner is missing

The file GA-DDPG/core/test_realworld_ros_final.py imports core.env_planner at line 43, but no such file exists in the core folder.

"ycb_large.json" file is missing

Hello,
While trying to run the example:
bash experiments/scripts/test_demo.sh
I get the following error:

+ set -e
+++ dirname -- experiments/scripts/test_demo.sh
++ cd experiments/scripts
++ pwd
+ DIR=/home/nir1tv/projects/GA-DDPG/experiments/scripts
+ export PYTHONUNBUFFERED=True
+ PYTHONUNBUFFERED=True
+ LOG_NAME=agent
+ LOG=output//log.txt
+ MODEL_NAME=dummy
+ RUN_NUM=3
+ EPI_NUM=165
+ EPOCH=latest
++ date +%Y-%m-%d_%H-%M-%S
+ LOG=outputs/dummy/test_log.txt.2021-07-14_10-39-15
+ exec
++ tee -a outputs/dummy/test_log.txt.2021-07-14_10-39-15
tee: outputs/dummy/test_log.txt.2021-07-14_10-39-15: No such file or directory
+ echo Logging output to outputs/dummy/test_log.txt.2021-07-14_10-39-15
Logging output to outputs/dummy/test_log.txt.2021-07-14_10-39-15
+ python -m core.train_test_offline --expert --pretrained output/demo_model --test --render
pybullet build time: Jul 14 2021 10:11:29
Output will be saved to `output/demo_model`
Using config:
{'DATA_ROOT_DIR': 'data/scenes',
 'EPOCHS': 200,
.....
.....
Let's use 2 GPUs!
schedule: [8000, 16000, 30000, 50000, 70000, 90000]
schedule: [8000, 16000, 30000, 50000, 70000, 90000]
Output will be saved to `output/demo_model`
video output: YCB_td3_critic_aux_policy_aux.yaml stat output: rollout_success.script_td3_critic_aux_policy_aux.yaml.txt
load pretrained policy!!!!
load pretrained critic!!!!
load feat optim
load pretrained feature!!!! from: output/demo_model/DDPG_state_feat_PandaYCBEnv_latest step :249921
output_time: demo_model logdir: output/demo_model/PandaYCBEnv_DDPG
Traceback (most recent call last):
  File "/home/nir1tv/miniconda3/envs/gaux/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/nir1tv/miniconda3/envs/gaux/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/nir1tv/projects/GA-DDPG/core/train_test_offline.py", line 371, in <module>
    with open(file) as f: file_dir = json.load(f)
FileNotFoundError: [Errno 2] No such file or directory: 'experiments/object_index/ycb_large.json'

Note that I have executed all of the download_* .sh scripts from the ./experiments/scripts folder.

Cheers,
Ronen.

CUDA Problem

Hi Lirui,
Thanks for publishing this excellent work.
I'm trying to run it in the real world with a RealSense camera, similar to your videos, but it hasn't been possible: when I tried to install PointNet++ I got the error "The detected CUDA version (11.4) mismatches the version that was used to compile PyTorch (10.2)", and my GPU only supports CUDA 11 and up. Have you thought about migrating it to a newer version? Or do you think the current version can be run with CUDA 11?

Could you please provide guidance on bridging the gap between the native ROS installation (home) directory and the virtualenv GA-DDPG directory?

I am struggling with configuring things to bridge the gap between ROS Melodic and the Anaconda virtual environment. There are compatibility issues with Python 2.7, while Python 3.6 raises compatibility issues with the ROS installed on my native machine. How can I handle this issue, and if possible could you make the setup instructions clearer, as they are not intuitive for a newbie like me?

TypeError: forward() missing 1 required positional argument: 'pc' Error in the feature extractor

Hello,
It seems that there's a problem with the state_feature_extractor in the agent.py file.
Could you please look at the following errors from executing the 'test_demo' script?
Best regards,
Ronen.

+ echo Logging output to outputs/dummy/test_log.txt.2021-07-20_10-26-24
Logging output to outputs/dummy/test_log.txt.2021-07-20_10-26-24
+ python -m core.train_test_offline --expert --pretrained output/demo_model --test --render
pybullet build time: Jul 14 2021 10:11:29
Output will be saved to `output/demo_model`
Using config:
{'DATA_ROOT_DIR': 'data/scenes',
 'EPOCHS': 200,
 'EXPERIMENT_OBJ_INDEX_DIR': 'experiments/object_index',
 'IMG_SIZE': [112, 112],
 'LOG': True,
 'MODEL_SPEC_DIR': 'experiments/model_spec',
 'OBJECT_DATA_DIR': 'data/objects',
 'OFFLINE_BATCH_SIZE': 100,
 'OFFLINE_RL_MEMORY_SIZE': 100000,
 'ONPOLICY_MEMORY_SIZE': -1,
 'OUTPUT_DIR': 'output',
 'OUTPUT_MISC_DIR': 'output_misc',
 'RL_DATA_ROOT_DIR': 'data/scenes',
 'RL_IMG_SIZE': [112, 112],
 'RL_MAX_STEP': 20,
 'RL_MEMORY_SIZE': 2000000,
 'RL_MODEL_SPEC': 'output/demo_model/rl_pointnet_model_spec.yaml',
 'RL_SAVE_DATA_NAME': 'data_50k.npz',
 'RL_SAVE_DATA_ROOT_DIR': 'data',
 'RL_TEST_SCENE': 'data/gaddpg_scenes',
 'RL_TRAIN': {'DAGGER_MAX_STEP': 18,
              'DAGGER_MIN_STEP': 5,
              'DAGGER_RATIO': 0.5,
              'DART_MAX_STEP': 18,
              'DART_MIN_STEP': 5,
              'DART_RATIO': 0.5,
              'ENV_FAR': 0.5,
              'ENV_NEAR': 0.2,
              'ENV_RESET_TRIALS': 7,
              'EXPERT_INIT_MAX_STEP': 15,
              'EXPERT_INIT_MIN_STEP': 3,
              'RL': True,
              'SAVE_EPISODE_INTERVAL': 50,
              'accumulate_points': True,
              'action_noise': 0.01,
              'batch_size': 125,
              'bc_reward_flag': False,
              'buffer_full_size': -1,
              'buffer_start_idx': 0,
              'change_dynamics': False,
              'channel_num': 5,
              'clip_grad': 0.5,
              'concat_option': 'point_wise',
              'critic_aux': True,
              'critic_extra_latent': -1,
              'critic_goal': False,
              'dagger': True,
              'dart': True,
              'ddpg_coefficients': [0.5, 0.001, 1.0003, 1.0, 0.2],
              'domain_randomization': False,
              'env_name': 'PandaYCBEnv',
              'env_num_objs': 1,
              'expert_initial_state': False,
              'explore_cap': 1.0,
              'explore_ratio': 1.0,
              'explore_ratio_list': [0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7],
              'feature_input_dim': 512,
              'fill_data_step': 10,
              'fix_timestep_test': True,
              'gamma': 0.95,
              'goal_reward_flag': False,
              'head_lr': 0.0003,
              'hidden_size': 256,
              'index_file': 'experiments/object_index/extra_shape.json',
              'index_split': 'train',
              'init_distance_high': 0.45,
              'init_distance_low': 0.15,
              'load_buffer': False,
              'load_obj_num': 40,
              'load_scene_joint': False,
              'load_test_scene_new': False,
              'log': True,
              'lr': 0.0003,
              'lr_gamma': 0.5,
              'max_epoch': 150000,
              'max_num_pts': 20000,
              'mix_milestones': [4000,
                                 8000,
                                 16000,
                                 25000,
                                 35000,
                                 45000,
                                 65000,
                                 85000,
                                 100000,
                                 120000],
              'mix_policy_ratio_list': [0.1, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2],
              'mix_value_ratio_list': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
              'new_scene': True,
              'noise_ratio_list': [3.0, 2.5, 2.0, 1.5, 1.2, 1.2, 1, 0.8, 0.5],
              'noise_type': 'uniform',
              'num_remotes': 8,
              'off_policy': True,
              'online_buffer_ratio': 0.7,
              'onpolicy': True,
              'overwrite_feat_milestone': [],
              'policy_aux': True,
              'policy_extra_latent': -1,
              'policy_goal': False,
              'policy_milestones': [20000, 40000, 60000, 80000],
              'policy_update_gap': 2,
              'pt_accumulate_ratio': 0.95,
              'refill_buffer': True,
              'reinit_factor': 3,
              'reinit_lr': 0.0001,
              'reinit_optim': False,
              'sa_channel_concat': True,
              'save_epoch': [3000,
                             10000,
                             20000,
                             40000,
                             80000,
                             100000,
                             140000,
                             180000,
                             200000],
              'self_supervision': False,
              'shared_feature': False,
              'shared_objects_across_worker': False,
              'target_update_interval': 3000,
              'tau': 0.0001,
              'train_feature': True,
              'train_goal_feature': False,
              'train_value_feature': True,
              'uniform_num_pts': 1024,
              'updates_per_step': 20,
              'use_action_limit': True,
              'use_expert_plan': False,
              'use_image': False,
              'use_point_state': True,
              'use_time': True,
              'value_lr': 0.0003,
              'value_lr_gamma': 0.5,
              'value_milestones': [20000, 40000, 60000, 80000],
              'value_model': True,
              'visdom': True},
 'RNG_SEED': 3,
 'ROBOT_DATA_DIR': 'data/robots',
 'ROOT_DIR': '/home/nir1tv/projects/GA-DDPG/experiments/../',
 'SCRIPT_FOLDER': 'experiments/cfgs',
 'env_config': {'accumulate_points': True,
                'action_space': 'task6d',
                'change_dynamics': False,
                'data_type': 'RGBDM',
                'domain_randomization': False,
                'expert_step': 20,
                'height': 112,
                'img_resize': [112, 112],
                'initial_far': 0.5,
                'initial_near': 0.2,
                'numObjects': 1,
                'omg_config': {'allow_collision_point': 0,
                               'base_obstacle_weight': 1.0,
                               'clearance': 0.03,
                               'dynamic_timestep': False,
                               'extra_smooth_steps': 5,
                               'goal_idx': -1,
                               'ik_clearance': 0.07,
                               'ik_parallel': False,
                               'ik_seed_num': 13,
                               'increment_iks': True,
                               'ol_alg': 'Proj',
                               'optim_steps': 1,
                               'pre_terminate': True,
                               'root_dir': '/home/nir1tv/projects/GA-DDPG/experiments/../',
                               'scene_file': '',
                               'silent': True,
                               'smoothness_base_weight': 3,
                               'standoff_dist': 0.08,
                               'target_epsilon': 0.06,
                               'target_hand_filter_angle': 90,
                               'target_obj_collision': 1,
                               'terminate_smooth_loss': 3,
                               'timesteps': 20,
                               'traj_delta': 0.05,
                               'traj_init': 'grasp',
                               'traj_interpolate': 'linear',
                               'traj_max_step': 26,
                               'traj_min_step': 15,
                               'use_expert_plan': False,
                               'vis': False},
                'pt_accumulate_ratio': 0.95,
                'random_target': True,
                'regularize_pc_point_count': True,
                'uniform_num_pts': 1024,
                'use_hand_finger_point': True,
                'width': 112},
 'omg_config': {'allow_collision_point': 0,
                'base_obstacle_weight': 1.0,
                'clearance': 0.03,
                'dynamic_timestep': False,
                'extra_smooth_steps': 5,
                'goal_idx': -1,
                'ik_clearance': 0.07,
                'ik_parallel': False,
                'ik_seed_num': 13,
                'increment_iks': True,
                'ol_alg': 'Proj',
                'optim_steps': 1,
                'pre_terminate': True,
                'root_dir': '/home/nir1tv/projects/GA-DDPG/experiments/../',
                'scene_file': '',
                'silent': True,
                'smoothness_base_weight': 3,
                'standoff_dist': 0.08,
                'target_epsilon': 0.06,
                'target_hand_filter_angle': 90,
                'target_obj_collision': 1,
                'terminate_smooth_loss': 3,
                'timesteps': 20,
                'traj_delta': 0.05,
                'traj_init': 'grasp',
                'traj_interpolate': 'linear',
                'traj_max_step': 26,
                'traj_min_step': 15,
                'use_expert_plan': False,
                'vis': False},
 'pretrained_time': '',
 'script_name': 'td3_critic_aux_policy_aux.yaml'}
Let's use 2 GPUs!
schedule: [8000, 16000, 30000, 50000, 70000, 90000]
schedule: [8000, 16000, 30000, 50000, 70000, 90000]
Output will be saved to `output/demo_model`
video output: YCB_td3_critic_aux_policy_aux.yaml stat output: rollout_success.script_td3_critic_aux_policy_aux.yaml.txt
load pretrained policy!!!!
load pretrained critic!!!!
load feat optim
load pretrained feature!!!! from: output/demo_model/DDPG_state_feat_PandaYCBEnv_latest step :249921
output_time: demo_model logdir: output/demo_model/PandaYCBEnv_DDPG
startThreads creating 1 threads.
starting thread 0
started thread 0 
argc=2
argv[0] = --unused
argv[1] = --start_demo_name=Physics Server
ExampleBrowserThreadFunc started
X11 functions dynamically loaded using dlopen/dlsym OK!
X11 functions dynamically loaded using dlopen/dlsym OK!
Creating context
Created GL 3.3 context
Direct GLX rendering context obtained
Making context current
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=NVIDIA GeForce RTX 2080 Ti/PCIe/SSE2
GL_VERSION=3.3.0 NVIDIA 470.42.01
GL_SHADING_LANGUAGE_VERSION=3.30 NVIDIA via Cg compiler
pthread_getconcurrency()=0
Version = 3.3.0 NVIDIA 470.42.01
Vendor = NVIDIA Corporation
Renderer = NVIDIA GeForce RTX 2080 Ti/PCIe/SSE2
b3Printf: Selected demo: Physics Server
startThreads creating 1 threads.
starting thread 0
started thread 0 
MotionThreadFunc thread started
numActiveThreads = 0
stopping threads
destroy semaphore
semaphore destroyed
Thread with taskId 0 exiting
Thread TERMINATED
destroy main semaphore
main semaphore destroyed
finished
numActiveThreads = 0
btShutDownExampleBrowser stopping threads
Thread with taskId 0 exiting
Thread TERMINATED
destroy semaphore
semaphore destroyed
destroy main semaphore
main semaphore destroyed
startThreads creating 1 threads.
starting thread 0
started thread 0 
argc=2
argv[0] = --unused
argv[1] = --start_demo_name=Physics Server
ExampleBrowserThreadFunc started
X11 functions dynamically loaded using dlopen/dlsym OK!
X11 functions dynamically loaded using dlopen/dlsym OK!
Creating context
Created GL 3.3 context
Direct GLX rendering context obtained
Making context current
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=NVIDIA GeForce RTX 2080 Ti/PCIe/SSE2
GL_VERSION=3.3.0 NVIDIA 470.42.01
GL_SHADING_LANGUAGE_VERSION=3.30 NVIDIA via Cg compiler
pthread_getconcurrency()=0
Version = 3.3.0 NVIDIA 470.42.01
Vendor = NVIDIA Corporation
Renderer = NVIDIA GeForce RTX 2080 Ti/PCIe/SSE2
b3Printf: Selected demo: Physics Server
startThreads creating 1 threads.
starting thread 0
started thread 0 
MotionThreadFunc thread started
ven = NVIDIA Corporation
ven = NVIDIA Corporation
/home/nir1tv/projects/GA-DDPG/env/models
>>>> target name: 061_foam_brick
==== loaded scene: scene_0 target: 006_mustard_bottle idx: 5 init joint
Traceback (most recent call last):
  File "/home/nir1tv/miniconda3/envs/gaddpg/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/nir1tv/miniconda3/envs/gaddpg/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/nir1tv/projects/GA-DDPG/core/train_test_offline.py", line 387, in <module>
    test(run_iter=run_iter)
  File "/home/nir1tv/projects/GA-DDPG/core/train_test_offline.py", line 230, in test
    action, _, _, aux_pred = agent.select_action(state, vis=False, remain_timestep=remain_timestep )
  File "/home/nir1tv/miniconda3/envs/gaddpg/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
    return func(*args, **kwargs)
  File "/home/nir1tv/projects/GA-DDPG/core/agent.py", line 113, in select_action
    train=False,
  File "/home/nir1tv/projects/GA-DDPG/core/ddpg.py", line 54, in extract_feature
    train=train)
  File "/home/nir1tv/projects/GA-DDPG/core/agent.py", line 77, in unpack_batch
    train=train,
  File "/home/nir1tv/miniconda3/envs/gaddpg/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nir1tv/miniconda3/envs/gaddpg/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/nir1tv/miniconda3/envs/gaddpg/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/nir1tv/miniconda3/envs/gaddpg/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/home/nir1tv/miniconda3/envs/gaddpg/lib/python3.6/site-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
TypeError: Caught TypeError in replica 1 on device 1.
Original Traceback (most recent call last):
  File "/home/nir1tv/miniconda3/envs/gaddpg/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/home/nir1tv/miniconda3/envs/gaddpg/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
TypeError: forward() missing 1 required positional argument: 'pc'

numActiveThreads = 0
stopping threads
Thread with taskId 0 exiting
Thread TERMINATED
destroy semaphore
semaphore destroyed
destroy main semaphore
main semaphore destroyed
finished
numActiveThreads = 0
btShutDownExampleBrowser stopping threads
destroy semaphore
semaphore destroyed
Thread with taskId 0 exiting
Thread TERMINATED
destroy main semaphore
main semaphore destroyed

URDF file cannot load

Hello! When I tried to run the pretrained model demo, I met some problems. I have downloaded the shared_data and demo_model folders. However, when I run the test procedure, an error is raised: URDF file '/root/GA-DDPG/env/../data/objects/Desktop_94145560501fa53228212dd5e8de73b/model_normalized.urdf' not found. I have checked all the files contained in data and there is no such file.

With due respect, I await your reply eagerly. I have been struggling for a long time now with the GA-DDPG repo!

Sir, the instructions for cloning the repo are not that intuitive, which makes me confused. I am stuck at downloading the data: "Download environment data bash experiments/scripts/download_data.sh".
Please make these instructions intuitive, as they keep mentioning conda environment use and reuse, and also mention downloading GA-DDPG again.
Installation

git clone https://github.com/liruiw/GA-DDPG.git --recursive

Setup: Ubuntu 16.04 or above, CUDA 10.0 or above, python 2.7 / 3.6

(Required for Training) - Install [OMG](https://github.com/liruiw/OMG-Planner) submodule and **reuse conda environment.**
(Docker) See [OMG Docker](https://github.com/liruiw/OMG-Planner#docker-setup) for details.
(Demo) - **Install GA-DDPG inside a new conda environment**

conda create --name gaddpg python=3.6.9
conda activate gaddpg
pip install -r requirements.txt

Install PointNet++

For me, first I cloned the recursive repo (git clone https://github.com/liruiw/GA-DDPG.git --recursive) within Anaconda; once downloaded, I did this step (conda create --name gaddpg python=3.6.9, conda activate gaddpg, pip install -r requirements.txt) and then installed PointNet++ successfully.
But the error I am getting is at the step "Download environment data bash experiments/scripts/download_data.sh".
The error is ((gaddpg) siat@siat-Precision-3640-Tower:~/anaconda3/GA-DDPG$ bash experiments/scripts/download_data.sh
WARNING: combining -O with -r or -p will mean that all downloaded content
will be placed in the single file you specified.

--2024-01-16 16:20:17-- https://drive.google.com/uc?export=download&id=136rLjyjFFRMyVxUZT6txB5XR2Ct_LNWC
Connecting to 127.0.0.1:8889... connected.
Proxy request sent, awaiting response... 303 See Other
Location: https://drive.usercontent.google.com/download?id=136rLjyjFFRMyVxUZT6txB5XR2Ct_LNWC&export=download [following]
--2024-01-16 16:20:19-- https://drive.usercontent.google.com/download?id=136rLjyjFFRMyVxUZT6txB5XR2Ct_LNWC&export=download
Connecting to 127.0.0.1:8889... connected.
Proxy request sent, awaiting response... 200 OK
Length: 2427 (2.4K) [text/html]
Saving to: ‘data.zip’

data.zip 100%[========================================================================================>] 2.37K --.-KB/s in 0s

2024-01-16 16:20:21 (44.9 MB/s) - ‘data.zip’ saved [2427/2427]

FINISHED --2024-01-16 16:20:21--
Total wall clock time: 3.3s
Downloaded: 1 files, 2.4K in 0s (44.9 MB/s)
Data downloaded. Starting to unzip
Archive: data.zip
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of data.zip or
data.zip.zip, and cannot find data.zip.ZIP, period.
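The 2.4K text/html file saved above is Google Drive's download-confirmation page rather than the archive itself, which is why unzip rejects data.zip. A minimal workaround sketch, assuming gdown is installed (pip install gdown) and using the file id that appears in the wget log above:

    # Fetch the archive with gdown, which handles Google Drive's
    # confirmation step for files behind the download warning page.
    import gdown

    url = "https://drive.google.com/uc?id=136rLjyjFFRMyVxUZT6txB5XR2Ct_LNWC"
    gdown.download(url, "data.zip", quiet=False)

If the download succeeds, unzipping data.zip into the repository root should then work as the script intends.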
Thanks for your amazing work. Please help me with this; I would be glad to contribute and level it up. Thanks in advance.

How to save the output video

Dear Liruiw:
I'm confused about how to save the video after testing.

image

Actually, in the output_misc folder, there are some directories that are created after running the test program.
image

But when I terminate the program, nothing has been saved. Do I need to type some command at the IPython command line?

How to run GA-DDPG in real world?

Dear Lirui:
Thanks for sharing such nice work. Would you mind providing more details on how to run GA-DDPG in the real world?
Specifically, what are the steps to transfer GA-DDPG from PyBullet to a real Franka robot?

The error of running Save Data and Offline Training

Dear Lirui:

I met some problems when trying to reproduce your code. Could you help by providing some advice?

  1. When I run the command bash ./experiments/scripts/train_online_save_buffer.sh bc_save_data.yaml BC for Save Data and Offline Training, I get the following error (see the sketch after this list for one way to adjust these settings):
    ValueError: After taking into account object store and redis memory usage, the amount of memory on this node available for tasks and actors (-11.76 GB) is less than -34% of total. You can adjust these settings with ray.init(memory=<bytes>, object_store_memory=<bytes>).

  2. Furthermore, I get the same error when running the Online Training and Testing commands.

  3. Finally, I want to confirm: should bash ./experiments/scripts/train_online_visdom.sh td3_critic_aux_policy_aux.yaml DDPG actually be train_online.sh or train_online_continue.sh? I cannot find train_online_visdom.sh in the published code.
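As referenced in item 1, the error message itself points at ray.init's memory settings. A minimal sketch of the suggested adjustment; the keyword names are taken from the error message, and the byte values below are placeholders to be tuned to the local machine:

    import ray

    # Cap the memory ray reserves for tasks/actors and for the object store
    # so the scheduler does not over-commit on a machine with limited RAM.
    ray.init(
        memory=8 * 1024 ** 3,               # 8 GB for tasks and actors
        object_store_memory=4 * 1024 ** 3,  # 4 GB for the object store
    )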

About demo model

Dear liruiw:

I tried to continue training from output/demo_model using the command

PYTHONUNBUFFERED=True CUDA_VISIBLE_DEVICES=0,1,2,3 python -m core.train_online --save_model --config_file td3_critic_aux_policy_aux.yaml --policy DDPG --log --fix_output_time ddpg_from_demo_model --seed 233 --max_epoch 400000 --pretrained output/demo_model

But I got this error message

  File ".../GA-DDPG/core/train_online.py", line 259, in rollout
    rest_expert_plan, _ = self.env.expert_plan(step=int(MAX_STEP-step-1))
  File ".../GA-DDPG/env/panda_scene.py", line 766, in expert_plan
    info = self.planner_scene.step()
  File ".../GA-DDPG/OMG/omg/core.py", line 701, in step
    plan = self.planner.plan(self.traj)
  File ".../GA-DDPG/OMG/omg/planner.py", line 621, in plan
    self.info.append(self.optim.optimize(traj, force_update=True))
  File ".../GA-DDPG/OMG/omg/optimizer.py", line 134, in optimize
    update = self.goal_set_projection(traj, grad)
  File ".../GA-DDPG/OMG/omg/optimizer.py", line 107, in goal_set_projection
    target_goal_diff = cur_end_point - chosen_goal
ValueError: operands could not be broadcast together with shapes (4,9) (5,9)

It seems that the length of traj.data is smaller than constraint_num and the length of chosen_goal.
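A minimal numpy sketch of the mismatch, using the shapes from the traceback; the slicing shown for the alignment workaround mentioned below is hypothetical and keeps the first rows of chosen_goal:

    import numpy as np

    cur_end_point = np.zeros((4, 9))  # tail of traj.data
    chosen_goal = np.zeros((5, 9))    # goal set of length constraint_num

    # cur_end_point - chosen_goal    # ValueError: operands could not be
                                     # broadcast together with shapes (4,9) (5,9)

    # Truncating chosen_goal from the back to match cur_end_point:
    diff = cur_end_point - chosen_goal[: cur_end_point.shape[0]]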

But if I continue training from the models trained using train_online.sh, there won't be any errors.

Besides, I managed to continue training from output/demo_model anyway, by setting constraint_num smaller and truncating chosen_goal from the back to align with the length of cur_end_point. But the success rate decreased from 94.1 to 88.1, as shown below:

demo model
Avg. Performance: (Return: 0.941 +- 0.02778) (Success: 0.941 +- 0.02778)
+---------------------+---------+-----------+
| object name         |   count |   success |
|---------------------+---------+-----------|
| 003_cracker_box     |      30 |        22 |
| 004_sugar_box       |      30 |        27 |
| 005_tomato_soup_can |      30 |        27 |
| 006_mustard_bottle  |      30 |        30 |
| 010_potted_meat_can |      30 |        30 |
| 021_bleach_cleanser |      30 |        30 |
| 024_bowl            |      30 |        30 |
| 025_mug             |      30 |        29 |
| 061_foam_brick      |      30 |        29 |
+---------------------+---------+-----------+

continue training from demo model
Avg. Performance: (Return: 0.881 +- 0.00556) (Success: 0.881 +- 0.00556)
+---------------------+---------+-----------+
| object name         |   count |   success |
|---------------------+---------+-----------|
| 003_cracker_box     |      30 |        22 |
| 004_sugar_box       |      30 |        26 |
| 005_tomato_soup_can |      30 |        28 |
| 006_mustard_bottle  |      30 |        30 |
| 010_potted_meat_can |      30 |        25 |
| 021_bleach_cleanser |      30 |        30 |
| 024_bowl            |      30 |        29 |
| 025_mug             |      30 |        20 |
| 061_foam_brick      |      30 |        28 |
+---------------------+---------+-----------+

It seems quite strange to me that it was the OMG planner rather than GA-DDPG that reported the error. Are there some subtle differences between output/demo_model and the models trained using train_online.sh?
