
DiffRL's Issues

What is MM_caching_frequency?

Hello, thank you for open-sourcing all this very interesting work. I don't understand what "MM_caching_frequency" is and why it changes throughout the configuration files.

Example Request - Cartpole/Ant using Warp instead of dFlex

Hi, first of all, thanks a lot for your great piece of software! We are really excited to apply it in our research on surgical simulation. However, we ran into problems while trying to switch from dFlex to its successor, Warp.

Would it be possible to provide us with a minimal reference Cartpole and/or Ant SHAC example using Warp? That would be very helpful not only for our group but also for other users.
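For reference, a minimal hedged sketch of the Warp pattern we would expect such an example to follow (wp.sim.ModelBuilder, a sim integrator, and wp.Tape for reverse-mode gradients through the rollout). This is a guess at the shape of the code, not an official example:

import warp as wp
import warp.sim

wp.init()

# Build a model; the cartpole bodies/joints would be added here,
# e.g. via wp.sim.parse_urdf or the ModelBuilder joint API.
builder = wp.sim.ModelBuilder()
model = builder.finalize()

integrator = wp.sim.SemiImplicitIntegrator()
state_in = model.state()
state_out = model.state()

# Record the rollout on a tape so gradients can flow back through the steps.
tape = wp.Tape()
with tape:
    for _ in range(60):
        state_in.clear_forces()
        integrator.simulate(model, state_in, state_out, dt=1.0 / 60.0)
        state_in, state_out = state_out, state_in

# tape.backward(loss=...) would then backpropagate a scalar loss (a wp.array)
# through the whole rollout, which is what SHAC's short-horizon updates need.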

Thank you

Error when running python test_env.py --env AntEnv

Excuse me, I ran into the following problem when trying the command python test_env.py --env AntEnv in the examples folder, as described in the guide.
My PyTorch version is 1.11.0 and my CUDA version is 12.1.
Is there anything wrong with my system? I'd really appreciate any help with this problem.

Rebuilding kernels
Detected CUDA files, patching ldflags
Emitting ninja build file /home/frank/DiffRL/dflex/dflex/kernels/build.ninja...
Building extension module kernels...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/3] /usr/local/cuda-12.1/bin/nvcc -DTORCH_EXTENSION_NAME=kernels -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/frank/DiffRL/dflex/dflex -isystem /home/frank/anaconda3/envs/shac/lib/python3.8/site-packages/torch/include -isystem /home/frank/anaconda3/envs/shac/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/frank/anaconda3/envs/shac/lib/python3.8/site-packages/torch/include/TH -isystem /home/frank/anaconda3/envs/shac/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-12.1/include -isystem /home/frank/anaconda3/envs/shac/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -gencode=arch=compute_35,code=compute_35 -std=c++14 -c /home/frank/DiffRL/dflex/dflex/kernels/cuda.cu -o cuda.cuda.o
FAILED: cuda.cuda.o
/usr/local/cuda-12.1/bin/nvcc -DTORCH_EXTENSION_NAME=kernels -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/frank/DiffRL/dflex/dflex -isystem /home/frank/anaconda3/envs/shac/lib/python3.8/site-packages/torch/include -isystem /home/frank/anaconda3/envs/shac/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/frank/anaconda3/envs/shac/lib/python3.8/site-packages/torch/include/TH -isystem /home/frank/anaconda3/envs/shac/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-12.1/include -isystem /home/frank/anaconda3/envs/shac/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -gencode=arch=compute_35,code=compute_35 -std=c++14 -c /home/frank/DiffRL/dflex/dflex/kernels/cuda.cu -o cuda.cuda.o
nvcc fatal   : Unsupported gpu architecture 'compute_35'
[2/3] c++ -MMD -MF main.o.d -DTORCH_EXTENSION_NAME=kernels -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/frank/DiffRL/dflex/dflex -isystem /home/frank/anaconda3/envs/shac/lib/python3.8/site-packages/torch/include -isystem /home/frank/anaconda3/envs/shac/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/frank/anaconda3/envs/shac/lib/python3.8/site-packages/torch/include/TH -isystem /home/frank/anaconda3/envs/shac/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-12.1/include -isystem /home/frank/anaconda3/envs/shac/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -Z -O2 -DNDEBUG -c /home/frank/DiffRL/dflex/dflex/kernels/main.cpp -o main.o
/home/frank/DiffRL/dflex/dflex/kernels/main.cpp: In function ‘df::float3 box_sdf_grad_cpu_func(df::float3, df::float3)’:
/home/frank/DiffRL/dflex/dflex/kernels/main.cpp:1051:47: warning: control reaches end of non-void function [-Wreturn-type]
 1051 |     var_58 = df::select(var_56, var_53, var_57);
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/frank/anaconda3/envs/shac/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1740, in _run_ninja_build
    subprocess.run(
  File "/home/frank/anaconda3/envs/shac/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test_env.py", line 17, in <module>                                                                                                                                                       
    import envs
  File "/home/frank/DiffRL/envs/__init__.py", line 8, in <module>                                                                                                                                        
    from envs.dflex_env import DFlexEnv                                                                                                                                                        
  File "/home/frank/DiffRL/envs/dflex_env.py", line 15, in <module>                                                                                                                              
    import dflex as df                                                                                                                                                                         
  File "/home/frank/DiffRL/dflex/dflex/__init__.py", line 15, in <module>                                                                                                                            
    kernel_init()                                                                                                                                                                              
  File "/home/frank/DiffRL/dflex/dflex/sim.py", line 67, in kernel_init                                                                                                                          
    kernels = df.compile()                                                                                                                                                                     
  File "/home/frank/DiffRL/dflex/dflex/adjoint.py", line 1865, in compile                                                                                                                        
    module = torch.utils.cpp_extension.load_inline('kernels',                                                                                                                                  
  File "/home/frank/anaconda3/envs/shac/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1293, in load_inline                                                                     
    return _jit_compile(                                                                                                                                                                       
  File "/home/frank/anaconda3/envs/shac/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1357, in _jit_compile                                                                    
    _write_ninja_file_and_build_library(                                                                                                                                                       
  File "/home/frank/anaconda3/envs/shac/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1469, in _write_ninja_file_and_build_library
    _run_ninja_build(                                                                                                                                                                          
  File "/home/frank/anaconda3/envs/shac/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1756, in _run_ninja_build                                                                
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'kernels'
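As an aside, the fatal line above is the key one: CUDA 12.x removed the Kepler (compute_35 / sm_35) target, so any build that still requests -gencode=arch=compute_35 fails. A hedged sketch of the kind of fix, assuming the flag comes from the extra CUDA flags that the load_inline call in dflex/dflex/adjoint.py (line 1865 in the traceback) hands to torch.utils.cpp_extension:

# Before (fails under CUDA 12.x, which dropped Kepler/sm_35 support):
cuda_flags = ["-gencode=arch=compute_35,code=compute_35"]

# After: target only the GPU actually present (8.6 for an RTX 30xx):
cuda_flags = ["-gencode=arch=compute_86,code=sm_86"]

The variable name cuda_flags is illustrative; the point is to replace the hardcoded compute_35 gencode wherever it is passed, since the auto-generated compute_86 flags in the log already match the local GPU.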

What should I do to set the z-axis up?

Now that the released code is y-axis up, I want to set the z-axis up in simulation. I set the contact normal vector to (0, 0, 1) rather than (0, 1, 0), and gravity to (0, 0, -9.8). I also replaced the expression vy*ny with vz*nz, vy with vz, ny with nz, and so on, mainly in sim.py and model.py. However, the resulting dynamics are not the same as in the original setting. What more should I do?
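For illustration, a lower-risk alternative sketch (not something the repo provides): keep the native y-up dynamics untouched and convert vectors only at the input/output boundary. A +90° rotation about the x axis maps y-up coordinates to z-up ones:

import torch

def y_up_to_z_up(v: torch.Tensor) -> torch.Tensor:
    # rotate +90 deg about x: (x, y, z) in y-up -> (x, -z, y) in z-up,
    # so the old up direction (y) becomes the new up direction (z)
    x, y, z = v.unbind(-1)
    return torch.stack((x, -z, y), dim=-1)

def z_up_to_y_up(v: torch.Tensor) -> torch.Tensor:
    # inverse rotation: (x, y, z) in z-up -> (x, z, -y) in y-up
    x, y, z = v.unbind(-1)
    return torch.stack((x, z, -y), dim=-1)

# sanity check: z-up gravity maps back to dflex's y-up convention
g_zup = torch.tensor([0.0, 0.0, -9.8])
assert torch.allclose(z_up_to_y_up(g_zup), torch.tensor([0.0, -9.8, 0.0]))

Orientations (quaternions) would need the same fixed rotation applied, which is the part that hand-editing individual vy/ny terms tends to miss.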

SHAC and policies for partial observability

I was wondering if you have made any attempt at combining SHAC with an LSTM or transformer policy, or some policy that can effectively reason about a history of states rather than just the current one, as is desirable, for instance, when dealing with partial observability of the state.

While conceptually it does not sound too complicated, I know that getting the implementation details right can be tricky for something like PPO, and I was curious whether you have attempted any such thing, and if so, whether there were any issues you ran into.

Why is only the lower half of snu_humanoid used?

Are there some issues with convergence? Has anyone tried to train the full snu_humanoid?
At first glance it seems that the unnatural running style could be caused by the different mass distribution and other consequences of the absence of the upper torso, head, and hands.
It's not an issue actually, I'm just very curious :-)

Anyway, thanks for the really great job done, it's actually amazing!

Error when rendering the humanoid with --render

Basic info: my system is Ubuntu 20.8, GPU 3080, NVCC 11.6, gcc/g++ 7.5.0. Other settings are the same as in the provided env.

After I train the humanoid with train_shac.py, I want to render it via USD.

My command is:
python train_shac.py --cfg ./cfg/shac/humanoid.yaml --checkpoint ./logs/SNUHumanoid/shac/40/best_policy.pt --play --render
However, it fails for an unexpected reason:

Using cached kernels
Setting seed: 0
~/anaconda3/envs/shac/lib/python3.8/site-packages/gym/spaces/box.py:127: UserWarning: WARN: Box bound precision lowered by casting to float32
logger.warn(f"Box bound precision lowered by casting to {self.dtype}")
28 27
Start joint_q: [0.0, 1.35, 0.0, -0.7071067811865475, -0.0, -0.0, 0.7071067811865476, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
~DiffRL/dflex/dflex/model.py:1687: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at /opt/conda/conda-bld/pytorch_1646755903507/work/torch/csrc/utils/tensor_new.cpp:210.)
m.shape_transform = torch.tensor(transform_flatten_list(self.shape_transform), dtype=torch.float32, device=adapter)
num_act = 21
num_envs = 1
num_actions = 21
num_obs = 76
Sequential(
(0): Linear(in_features=76, out_features=256, bias=True)
(1): ELU(alpha=1.0)
(2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
(3): Linear(in_features=256, out_features=128, bias=True)
(4): ELU(alpha=1.0)
(5): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
(6): Linear(in_features=128, out_features=21, bias=True)
(7): Identity()
)
Parameter containing:
tensor([-1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1., -1.,
-1., -1., -1., -1., -1., -1., -1.], device='cuda:0',
requires_grad=True)
Sequential(
(0): Linear(in_features=76, out_features=128, bias=True)
(1): ELU(alpha=1.0)
(2): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
(3): Linear(in_features=128, out_features=128, bias=True)
(4): ELU(alpha=1.0)
(5): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
(6): Linear(in_features=128, out_features=1, bias=True)
)
Traceback (most recent call last):
  File "train_shac.py", line 114, in <module>
    traj_optimizer.play(cfg_train)
  File "~/DiffRL/algorithms/shac.py", line 561, in play
    self.run(cfg['params']['config']['player']['games_num'])
  File "~/anaconda3/envs/shac/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "~/DiffRL/algorithms/shac.py", line 377, in run
    mean_policy_loss, mean_policy_discounted_loss, mean_episode_length = self.evaluate_policy(num_games = num_games, deterministic = not self.stochastic_evaluation)
  File "~/anaconda3/envs/shac/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "~/DiffRL/algorithms/shac.py", line 317, in evaluate_policy
    obs = self.obs_rms.normalize(obs)
  File "~/DiffRL/utils/running_mean_std.py", line 56, in normalize
    result = (arr - self.mean) / torch.sqrt(self.var + 1e-5)
RuntimeError: The size of tensor a (76) must match the size of tensor b (53) at non-singleton dimension 1

Do you have any idea how to fix this?

Obtain gradient information explicitly

May I ask how to explicitly obtain gradient information from the environment, such as the gradient of s_{t+1} with respect to s_t? Or is the information obtained using torch.autograd accurate? Thank you!
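For illustration, a minimal self-contained sketch of the torch.autograd pattern for this, with a toy step function standing in for env.step (the same calls apply to a dflex env as long as nothing in the pipeline detaches the graph):

import torch

# toy differentiable step s_{t+1} = f(s_t, a_t), standing in for env.step
def step(s, a):
    return s + 0.01 * torch.tanh(a) - 0.001 * s

s_t = torch.randn(4, requires_grad=True)
a_t = torch.randn(4)
s_next = step(s_t, a_t)

# full Jacobian d s_{t+1} / d s_t
J = torch.autograd.functional.jacobian(lambda s: step(s, a_t), s_t)
print(J.shape)  # torch.Size([4, 4])

# or a single vector-Jacobian product, which is what backward() computes
v = torch.ones_like(s_next)
vjp = torch.autograd.grad(s_next, s_t, grad_outputs=v)[0]

As for accuracy: torch.autograd returns the analytic gradient of the ops that actually ran, so it is exact for the recorded computation graph; whether that graph faithfully represents the physics is a separate question (see the gradcheck issue below).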

Backward is not reentrant Error

I tried to verify the gradients of env.step using torch.autograd.gradcheck. My current environment is Ant. I get the following error:

raise GradcheckError('Backward is not reentrant, i.e., running backward with '
torch.autograd.gradcheck.GradcheckError: Backward is not reentrant, i.e., running backward with same input and grad_output multiple times gives different values, although analytical gradient matches numerical gradient.The tolerance for nondeterminism was 0.001.

Here is my code for the gradient test:

def test_grad(actions):
    env.step(actions)
    state = env.state
    joint_q = state.joint_q
    joint_qd = state.joint_qd
    loss = torch.norm(joint_q) + torch.norm(joint_qd)
    return loss

inputs = (actions,)  # note the trailing comma: gradcheck expects a tuple of inputs
test = torch.autograd.gradcheck(test_grad, inputs, nondet_tol=1e-3)

Is this behavior as expected? Thanks so much!

Add box obstacles

I want to try and add box obstacles to the environment, e.g. to create stairs that Ant or Cheetah has to scale. Is there a way to do this, perhaps by editing the ground plane?

I see in sim.py that there are some functions that handle contact forces. I'm not sure which ones would be relevant to modify to achieve the above, if any.
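For concreteness, here is a hedged sketch of the kind of thing meant above. It assumes dflex's ModelBuilder exposes an add_shape_box similar to Warp's successor API; the parameter names below, and body=-1 meaning a static world shape, are guesses, and it is unclear whether dflex generates contacts against such shapes automatically:

import dflex as df

builder = df.sim.ModelBuilder()

# five static steps rising away from the origin (y-up, matching the repo)
for i in range(5):
    builder.add_shape_box(
        body=-1,                                  # assumed: fixed to the world frame
        pos=(1.0 + 0.4 * i, 0.1 + 0.2 * i, 0.0),  # position of each step
        hx=0.2, hy=0.1, hz=1.0,                   # half-extents of each box
    )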

Brax implementation

Thought I'd mention: there is a Brax implementation of SHAC here.

I suppose it's hard to compare directly since the envs are not identical, but if one of the original authors of SHAC could review it for apparent agreement with your algorithm as intended, that'd be super useful.

self.name error in SNU_humanoid.py

After successfully training snu_humanoid, testing with the rendering option gives an error that self.name does not exist in snu_humanoid.py, line 78.

My solution is just removing self.name, and everything works fine:

#self.stage = Usd.Stage.CreateNew("outputs/" + self.name + "HumanoidSNU_Low_" + str(self.num_envs) + ".usd")
self.stage = Usd.Stage.CreateNew("outputs/" + "dd" + "HumanoidSNU_Low_" + str(self.num_envs) + ".usd")

[BUG] Initialization Velocities Scale with Distance from the Origin

@ViktorM

While writing a MuJoCo-viewer renderer for DiffRL, I noticed that the initialization velocities appear to scale up the farther an actor is from the origin.

This behavior appears in the original .usd files as well, so I don't believe it's an artifact of the visualization.

ExplodingVelocities.mp4

The early termination threshold also seems more sensitive far away from the origin.

I tried setting the stochastic initialization velocity to a constant, and the velocities were still amplified far away from the origin, so I believe this is something deeper in the foundations of dflex...
