nvidia / warp
A Python framework for high performance GPU simulation and graphics
Home Page: https://nvidia.github.io/warp/
License: Other
Hi
Thank you for this awesome work!
Just to confirm: the default renderer using USD is not differentiable, right?
If I hope to use Warp like gradSim, do I need to replace the renderer with a differentiable renderer?
Thank you!
Hello!
Would it be possible to provide a comparison between this project and taichi?
For starters, here are the example scripts for both projects.
Warp
import warp as wp
import numpy as np

wp.init()

num_points = 1024
device = "cuda"

@wp.kernel
def length(points: wp.array(dtype=wp.vec3),
           lengths: wp.array(dtype=float)):
    # thread index
    tid = wp.tid()
    # compute distance of each point from origin
    lengths[tid] = wp.length(points[tid])

# allocate an array of 3d points
points = wp.array(np.random.rand(num_points, 3), dtype=wp.vec3, device=device)
lengths = wp.zeros(num_points, dtype=float, device=device)

# launch kernel
wp.launch(kernel=length,
          dim=len(points),
          inputs=[points, lengths],
          device=device)

print(lengths)
Taichi
# python/taichi/examples/simulation/fractal.py
import taichi as ti

ti.init(arch=ti.gpu)

n = 320
pixels = ti.field(dtype=float, shape=(n * 2, n))

@ti.func
def complex_sqr(z):
    return ti.Vector([z[0]**2 - z[1]**2, z[1] * z[0] * 2])

@ti.kernel
def paint(t: float):
    for i, j in pixels:  # Parallelized over all pixels
        c = ti.Vector([-0.8, ti.cos(t) * 0.2])
        z = ti.Vector([i / n - 1, j / n - 0.5]) * 2
        iterations = 0
        while z.norm() < 20 and iterations < 50:
            z = complex_sqr(z) + c
            iterations += 1
        pixels[i, j] = 1 - iterations * 0.02

gui = ti.GUI("Julia Set", res=(n * 2, n))

for i in range(1000000):
    paint(i * 0.03)
    gui.set_image(pixels)
    gui.show()
I am trying to use svd3. Here is a small example:
import warp as wp
wp.init()
x = wp.diag(wp.vec3(1., 2., 3.))
u = wp.diag(wp.vec3(0., 0., 0.))
v = wp.diag(wp.vec3(0., 0., 0.))
s = wp.vec3(0., 0., 0.)
wp.svd3(x, u, s, v)
I get this error message when running the code:
File "/home/kasra/Postdoc/projects/auto_design/code/alaki3.py", line 12, in <module>
  wp.svd3(x, u, s, v)
File "/home/kasra/anaconda3/envs/auto_design/lib/python3.8/site-packages/warp/context.py", line 189, in __call__
  raise error
File "/home/kasra/anaconda3/envs/auto_design/lib/python3.8/site-packages/warp/context.py", line 163, in __call__
  value_type = type_ctype(f.value_func(None))
File "/home/kasra/anaconda3/envs/auto_design/lib/python3.8/site-packages/warp/context.py", line 155, in type_ctype
  elif issubclass(dtype, ctypes.Array):
TypeError: issubclass() arg 1 must be a class
Just a heads up: there seems to be some trouble getting Warp running on Colab. Installation itself can be patched through this:
!export CUDA_HOME=/usr/local/cuda-11.1
!git clone https://github.com/NVIDIA/warp
%cd /content/warp
!python build_lib.py --cuda_path /usr/local/cuda-11.1
!pip install -e .
But there still seems to be an issue with running kernels that call an external function (wp.func), which I'm trying to track down.
Hello, I have a relatively simple Warp kernel that I am trying to use as a loss function in computer vision.
It works most of the time, but in some iterations I get
Failed to lookup kernel function
On the same data and with the same settings, the same function call sometimes works and sometimes does not. It is usually not the first call that fails, so the library is properly loaded and warp.init has been called.
Additionally, for debugging purposes the kernel definition is in the same file as the function that uses it.
Just installed this for fun on a computer without a GPU and found a bug, maybe easy to fix?
>>> import warp as wp
>>> wp.is_cuda_available()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.8/dist-packages/warp/context.py", line 871, in is_cuda_available
return runtime.cuda_device != None
AttributeError: 'NoneType' object has no attribute 'cuda_device'
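The traceback suggests the module-level runtime object is still None when is_cuda_available() runs, presumably because no init step has happened yet. A minimal sketch of a defensive check follows; the `_Runtime` class and the module-level `runtime` variable are stand-ins that only mirror the names in the traceback, so this is an illustration of the guard and not Warp's actual code or fix.

```python
# Stand-in for the module-level state seen in the traceback; this is not Warp's code.
class _Runtime:
    def __init__(self, cuda_device=None):
        self.cuda_device = cuda_device


runtime = None  # stays None until an init step has run


def is_cuda_available():
    """Return False instead of raising when the runtime was never initialized."""
    if runtime is None:
        return False
    return runtime.cuda_device is not None


before_init = is_cuda_available()      # runtime is still None here
runtime = _Runtime(cuda_device=None)   # simulate init on a machine without a GPU
after_init = is_cuda_available()
print(before_init, after_init)
```

An alternative design would be to initialize lazily inside the function; which one fits the library better is not something this report settles.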
Hi,
I am very interested in what Warp seems to do and I wanted to test it on my local desktop.
So I followed the installation procedure on the main page and tried to run the tests.
I get the following output, which indicates there is an error and CUDA is not recognized.
python -m warp.tests
Warp CUDA error: Could not open libcuda.so.
Warp 0.7.2 initialized:
CUDA not available
Devices:
"cpu" | x86_64
Kernel cache: /home/rcremese/.cache/warp/0.7.2
Skipping Torch tests due to exception: No module named 'torch'
Skipping Torch DLPack tests due to exception: No module named 'torch'
Skipping Jax DLPack tests due to exception: No module named 'jax'
test_volume_allocation_f (warp.tests.test_volume.register.<locals>.TestVolumes) ... ok
test_volume_allocation_i (warp.tests.test_volume.register.<locals>.TestVolumes) ... ok
test_volume_allocation_v (warp.tests.test_volume.register.<locals>.TestVolumes) ... ok
test_volume_introspection (warp.tests.test_volume.register.<locals>.TestVolumes) ... ok
test_volume_sample_linear_f_gradient (warp.tests.test_volume.register.<locals>.TestVolumes) ... ok
test_volume_sample_linear_v_gradient (warp.tests.test_volume.register.<locals>.TestVolumes) ... ok
test_volume_store (warp.tests.test_volume.register.<locals>.TestVolumes) ... ok
test_volume_transform_gradient (warp.tests.test_volume.register.<locals>.TestVolumes) ... ok
test_func_export_cpu (warp.tests.test_func.register.<locals>.TestFunc) ... ok
test_addition_float16 (warp.tests.test_quat.register.<locals>.TestQuat) ... ERROR
======================================================================
ERROR: test_addition_float16 (warp.tests.test_quat.register.<locals>.TestQuat)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/rcremese/mambaforge/envs/physic-env/lib/python3.9/site-packages/warp/tests/test_base.py", line 144, in test_func
func(self, device, **kwargs)
File "/home/rcremese/mambaforge/envs/physic-env/lib/python3.9/site-packages/warp/tests/test_quat.py", line 395, in test_addition
wp.launch(kernel, dim=1, inputs=[q,v,], outputs=[r0,r1,r2,r3], device=device)
File "/home/rcremese/mambaforge/envs/physic-env/lib/python3.9/site-packages/warp/context.py", line 2091, in launch
success = kernel.module.load(device)
File "/home/rcremese/mambaforge/envs/physic-env/lib/python3.9/site-packages/warp/context.py", line 863, in load
raise RuntimeError("Failed to build CPU module because no CPU buildchain was found")
RuntimeError: Failed to build CPU module because no CPU buildchain was found
----------------------------------------------------------------------
Ran 10 tests in 0.006s
FAILED (errors=1)
I guess this is because I am testing on Windows WSL2 with an Ubuntu 20.04 distro. Nevertheless, I installed the CUDA toolkit on my Windows machine as explained here and checked that CUDA is recognized in WSL (cf. below).
nvidia-smi
Tue Mar 7 14:29:45 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02 Driver Version: 531.14 CUDA Version: 12.1 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 2070 w... On | 00000000:01:00.0 Off | N/A |
| N/A 41C P5 12W / N/A| 0MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
Therefore, I wonder where the problem might come from and if it's possible to use warp on Windows WSL.
Thank you.
The use of import * is helpful in some cases, but it makes the codebase extremely difficult to follow and debug. For example, the recent issue for urdf #5 (comment) seems to be a problem with the Mesh class. When following the trace in a local copy of the package, it turns out it calls the Mesh class. But it seems like it should call the Mesh class... obviously this is confusing...
I only found the problem after wondering why the next error (after fixing the previous one) is 'Mesh' object has no attribute 'finalize', even though it's in the same file and clearly has that function.
I suspect import * is causing a silent problem with import resolution (not sure why the tests don't see this).
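To illustrate how a star import can silently swap one class for another of the same name, here is a small self-contained sketch. The two modules (`geometry_demo`, `render_demo`) are hypothetical and built in memory; they are not Warp's modules, and whether this is what actually happens in the package is only the suspicion stated above.

```python
import sys
import types

# Two hypothetical modules that both export a class named Mesh.
geometry = types.ModuleType("geometry_demo")
exec("class Mesh:\n    def finalize(self):\n        return 'finalized'", geometry.__dict__)

render = types.ModuleType("render_demo")
exec("class Mesh:\n    pass", render.__dict__)

sys.modules["geometry_demo"] = geometry
sys.modules["render_demo"] = render

# The second star import rebinds Mesh without any warning.
namespace = {}
exec("from geometry_demo import *\nfrom render_demo import *", namespace)

Mesh = namespace["Mesh"]
print(Mesh.__module__)
print(hasattr(Mesh(), "finalize"))
```

Printing `__module__` on the class you actually received, as done here, is a quick way to check which definition won.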
In parse_urdf() there is a call to urdfpy.urdf.load, I think this is supposed to be urdfpy.URDF.load.
There is also builder.add_link but this gives the error AttributeError: 'ModelBuilder' object has no attribute 'add_link', which I guess is true since I don't see it.
Hi, I'm trying to execute this sample
https://github.com/matthias-research/pages/blob/master/tenMinutePhysics/16-GPUCloth.py
However, I wasn't able to run it properly. In particular, I get the following message.
File "c:\Users\Alex\Desktop\16-GPUCloth.py", line 875, in <module>
z = wp.sub(x, y)
AttributeError: module 'warp' has no attribute 'sub'
What is the best way to compute the Jacobian?
Is it possible to build Warp on MacOS M1?
Thanks!
Material parameters of a mesh, such as mu and lambda, are all of float type. Can I replace them with wp.array and set requires_grad on them?
I think there are some nuances to this issue and this is my best guess.
As the title stands, I could see an argument for saying it's not a bug. However, there should be a better error. Or instead, when creating the mesh, find a way to check whether it is invalid upon creation.
Even though I turned on debug mode and breakpoints in VS Code, the only thing I can see is an access violation on the original launch. It was only by trial and error, commenting things out, that I got to where it seems to happen.
I am trying to run an existing piece of code on Warp, but got different results on Warp versions 0.2.0 and 0.7.2. The code I'm trying to run is a two-ball collision scenario.
Here are the results from Warp 0.2.0. As you can see, Warp outputs reasonable gradients for dl/dx0, dl/dv0, and dl/du0 (the full description of those items can be found at the original repo).
Warp initialized:
Version: 0.2.0
Using CUDA device: NVIDIA GeForce RTX 3090
Using CPU compiler: /usr/bin/g++
Module utils.customized_integrator_xpbd load took 432.12 ms
Module _two_balls_1_warp load took 159.95 ms
------------Task 3: Position-based Dynamics (Warp)-----------
loss: [2.0605943]
gradient of loss w.r.t. initial position dl/dx0: [[-0.3609619 -0.3609619]
[-0.3609619 -0.3609619]]
gradient of loss w.r.t. initial velocity dl/dv0: [[-0.47226432 -0.47226432]
[-0.24966814 -0.24966814]]
gradient of loss w.r.t. initial ctrl dl/du0: [[ 0.00026612 0.00026612]
[-0.00052014 -0.00052014]]
---------start training------------
^C
On Warp 0.7.2, although the forward results are the same (the loss is still 2.06), the computed gradients are totally different:
Warp CUDA error: Could not open libcuda.so.
Warp 0.7.2 initialized:
CUDA not available
Devices:
"cpu" | x86_64
Kernel cache: /home/user/.cache/warp/0.7.2
Module utils.customized_integrator_xpbd load on device 'cpu' took 823.64 ms
Module _two_balls_1_warp load on device 'cpu' took 263.55 ms
------------Task 3: Position-based Dynamics (Warp)-----------
loss: [2.0605943]
gradient of loss w.r.t. initial position dl/dx0: [[-2. -2. ]
[-1.25 -1.25]]
gradient of loss w.r.t. initial velocity dl/dv0: [[-23400.346 -23400.346]
[ -4669.934 -4669.934]]
gradient of loss w.r.t. initial ctrl dl/du0: [[-48.746975 -48.746975]
[ -9.72903 -9.72903 ]]
---------start training------------
^C
OS: Ubuntu 18.04 LTS
CPU: AMD Ryzen Threadripper 3970X 32-Core Processor
GPU: NVIDIA RTX 3090
CUDA: 11.5
First, clone the repository https://github.com/DesmondZhong/diff_sim_grads
Next, create a conda environment with python 3.9.
For installing warp 0.2.0, run:
pip install -r requirements_freeze.txt
For installing warp 0.7.2, run:
pip install warp-lang usd-core omegaconf matplotlib
After installing necessary packages, run the following commands to see the difference in warp 0.2.0 and warp 0.7.2 environments.
export PYTHONPATH=[abs_path_to_diff_sim_grads]
cd diff_sim_grads/task3_two_balls
python two_balls_1_pbd_warp.py
where [abs_path_to_diff_sim_grads] is the absolute path to the cloned repo.
In a simple kernel that multiplies an array of matrices with an array of vectors I get a CUDA compilation error when using an array of mat44 and arrays of vec4
error: no suitable constructor exists to convert from "void" to "wp::mat44"
This does not happen if I define and launch a similar kernel with mat33 and vec3
I have attached both the working 3x3 and the non-working 4x4 Python samples along with the full error text
To replicate simply run
python test_works.py
and
python test.py
The last one gives the error.
extra info:
Warp Version: 0.1.25 Using CUDA device: NVIDIA GeForce GTX 1080
NVIDIA-SMI 495.29.05 Driver Version: 495.29.05 CUDA Version: 11.5
Hi there!
Thanks for making this library available - I am curious if there is a way to do reductions, for example I want to find the nearest point to a line (distance + index), for that some kind of reduction seems necessary.
Is this possible at the moment? I am presuming a for-loop is not the correct attempt, but from what I can tell the only reductions are additions or subtractions currently?
Thanks,
Oliver
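For what it's worth, a min-with-index reduction can be phrased as a tree reduction over (distance, index) pairs, where each pass halves the number of candidates. The plain-Python sketch below only illustrates that pattern; on a GPU each pass would be one parallel step. The helper names are made up for this example, and whether Warp exposes such a primitive is not stated in this thread.

```python
import math


def point_line_distance(p, a, d):
    """Distance from point p to the infinite line through a with unit direction d."""
    ap = [p[i] - a[i] for i in range(3)]
    t = sum(ap[i] * d[i] for i in range(3))
    closest = [a[i] + t * d[i] for i in range(3)]
    return math.dist(p, closest)


def argmin_reduce(pairs):
    """Tree reduction: repeatedly combine neighbours, keeping the smaller (distance, index)."""
    while len(pairs) > 1:
        nxt = [min(pairs[i], pairs[i + 1]) for i in range(0, len(pairs) - 1, 2)]
        if len(pairs) % 2:
            nxt.append(pairs[-1])  # odd element carries over to the next pass
        pairs = nxt
    return pairs[0]


points = [(3.0, 4.0, 0.0), (0.5, 0.0, 2.0), (1.0, 1.0, 1.0), (0.0, 0.1, -5.0)]
origin, direction = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
pairs = [(point_line_distance(p, origin, direction), i) for i, p in enumerate(points)]
best_distance, best_index = argmin_reduce(pairs)
print(best_distance, best_index)
```

Carrying the index alongside the distance is what makes this work with a plain `min`: tuples compare by distance first, then by index.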
Is it possible to inspect the code that was generated for user kernels (for both forward and backward pass)?
Is there a supported approach to converting a warp array to a cupy array, ideally without copying?
Is it possible to pass string type arguments as inputs in wp.launch()? Warp throws an error when I try to do it.
"RuntimeError: Error launching kernel, unable to pack kernel parameter type <class 'str'> for param bc, expected <class 'str'>"
Hi!
I've installed warp with pip: pip install warp-lang. After I try to execute
import warp as wp
wp.init()
I get a FileNotFoundError:
Traceback (most recent call last):
File ".\examples\example_sim_ant.py", line 26, in <module>
wp.init()
File "C:\Users\mariako\anaconda3\envs\warp\lib\site-packages\warp\context.py", line 2173, in init
runtime = Runtime()
File "C:\Users\mariako\anaconda3\envs\warp\lib\site-packages\warp\context.py", line 1050, in __init__
self.core = warp.build.load_dll(os.path.join(bin_path, warp_lib))
File "C:\Users\mariako\anaconda3\envs\warp\lib\site-packages\warp\build.py", line 323, in load_dll
dll = ctypes.CDLL(dll_path, winmode=0)
File "C:\Users\mariako\anaconda3\envs\warp\lib\ctypes\__init__.py", line 373, in __init__
self._handle = _dlopen(self._name, mode)
FileNotFoundError: Could not find module 'C:\Users\mariako\anaconda3\envs\warp\Lib\site-packages\warp\bin\warp.dll' (or one of its dependencies). Try using the full path with constructor syntax.
However, the dll file is in its place.
I checked the warp.dll dependencies with the https://github.com/lucasg/Dependencies tool, and some entries look suspicious.
However, I'm not sure what to do from here. Could you take a look?
I'm using Windows 11, Python 3.8 in a conda environment.
I have used Warp to create a custom PyTorch operator and it works great, but deallocating and reallocating memory not only takes a lot of time, the timing of the GC also affects the amount of memory allocated. Is there a way to allow Warp to reuse allocated memory, especially when the buffers are the same size, besides writing my own allocator?
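Short of a built-in option, one lightweight pattern is a small cache at the Python level that hands back released buffers of the same size. The sketch below is only that pattern in isolation: `BufferPool` is a hypothetical helper, and `bytearray` stands in for whatever device array type is actually being allocated.

```python
from collections import defaultdict


class BufferPool:
    """Reuses released buffers of the same size instead of allocating new ones."""

    def __init__(self, allocate):
        self._allocate = allocate        # callable: size -> new buffer
        self._free = defaultdict(list)   # size -> list of idle buffers
        self.allocations = 0

    def acquire(self, size):
        if self._free[size]:
            return self._free[size].pop()
        self.allocations += 1
        return self._allocate(size)

    def release(self, buf):
        self._free[len(buf)].append(buf)


pool = BufferPool(allocate=bytearray)    # bytearray stands in for a device array
for _ in range(100):
    scratch = pool.acquire(1024)
    pool.release(scratch)
print(pool.allocations)
```

Because the pool keeps references to idle buffers, the garbage collector no longer decides when memory is returned, which is the timing problem described above.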
Hello! I've got a question: Have you considered making wp.array a Generic type, rather than passing the arguments to the constructor in type annotations?
For example, from this:
@wp.kernel
def apply_forces(grid : wp.uint64,
                 particle_x: wp.array(dtype=wp.vec3),
                 particle_v: wp.array(dtype=wp.vec3),
                 particle_f: wp.array(dtype=wp.vec3),
                 radius: float,
                 k_contact: float,
                 k_damp: float,
                 k_friction: float,
                 k_mu: float):
    ...
to this:
@wp.kernel
def apply_forces(grid : wp.uint64,
                 particle_x: wp.array[wp.vec3],
                 particle_v: wp.array[wp.vec3],
                 particle_f: wp.array[wp.vec3],
                 radius: float,
                 k_contact: float,
                 k_damp: float,
                 k_friction: float,
                 k_mu: float):
    ...
This would have the following benefits:
I assume you're using something like typing.get_type_hints or the __annotations__ dict directly in wp.kernel to extract the type annotations from the function, correct?
With a generic wp.array type, the dtype can still easily be recovered using typing.get_args on the annotation.
Let me know what you think!
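As a quick illustration of the recovery step, the sketch below uses placeholder `array` and `vec3` classes (stand-ins, not Warp's types) and shows that the element type survives a round trip through typing.get_type_hints and typing.get_args.

```python
import typing
from typing import Generic, TypeVar

T = TypeVar("T")


class vec3:
    """Placeholder element type standing in for wp.vec3."""


class array(Generic[T]):
    """Placeholder container standing in for wp.array."""


def apply_forces(particle_x: array[vec3], radius: float):
    ...


hints = typing.get_type_hints(apply_forces)
annotation = hints["particle_x"]
container = typing.get_origin(annotation)   # the generic class itself
(dtype,) = typing.get_args(annotation)      # the single type argument
print(container.__name__, dtype.__name__)
```

typing.get_origin and typing.get_args need Python 3.8 or newer, which may matter for a library that supports older interpreters.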
Do you plan to support second order derivatives? Or is there already a way to do these types of computations?
import numpy as np
import math
import torch
import warp as wp
import warp.torch

device = "cuda"
wp.init()

@wp.kernel
def test_kernel(
    x : wp.array(dtype=float),
    y : wp.array(dtype=float)):
    tid = wp.tid()
    y[tid] = 0.5 - x[tid]*x[tid]*2.0

# define PyTorch autograd op to wrap simulate func
class TestFunc(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # allocate output array
        y = torch.empty_like(x)
        ctx.x = x
        ctx.y = y
        wp.launch(
            kernel=test_kernel,
            dim=len(x),
            inputs=[wp.torch.from_torch(x)],
            outputs=[wp.torch.from_torch(y)],
            device=device)
        return y

    @staticmethod
    def backward(ctx, adj_y):
        # adjoints should be allocated as zero initialized
        adj_x = torch.zeros_like(ctx.x).contiguous()
        adj_y = adj_y.contiguous()
        wp.launch(
            kernel=test_kernel,
            dim=len(ctx.x),
            # fwd inputs
            inputs=[wp.torch.from_torch(ctx.x)],
            outputs=[None],
            # adj inputs
            adj_inputs=[wp.torch.from_torch(adj_x)],
            adj_outputs=[wp.torch.from_torch(adj_y)],
            device=device,
            adjoint=True)
        return adj_x

# reference op implemented in pure PyTorch
class TestFuncTorch(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.x = x
        y = 0.5 - 2. * x**2
        return y

    @staticmethod
    def backward(ctx, adj_y):
        # use the tensor saved on ctx rather than the global x
        return adj_y * (- 4. * ctx.x)

# input data
x = 2.0 * torch.ones(16, dtype=torch.float32, device=device, requires_grad=True).contiguous()

# Pure torch
y = TestFuncTorch.apply(x)
dydx = torch.autograd.grad(y.sum(), x, retain_graph=True, create_graph=True)[0]
print(dydx )
d2ydx2 = torch.autograd.grad(dydx.sum(), x, retain_graph=True, create_graph=True)[0]
print(d2ydx2)

# execute op
y = TestFunc.apply(x)
dydx = torch.autograd.grad(y.sum(), x, retain_graph=True, create_graph=True)[0]
print(dydx )
try:
    d2ydx2 = torch.autograd.grad(dydx.sum(), x, retain_graph=True, create_graph=True)[0]
    print(d2ydx2)
except:
    raise ValueError("No gradient!")
Hi all,
On Ubuntu 20.04, Anaconda python 3.9.7, cudatoolkit 11.3. I followed the installation in the guide and I am getting the following error. Why is this happening?
$ python examples/example_raycast.py
Traceback (most recent call last):
File "/home/user/scratch_space/warp/warp/examples/example_raycast.py", line 23, in <module>
wp.init()
File "/home/user/scratch_space/warp/warp/warp/context.py", line 1097, in init
runtime = Runtime()
File "/home/user/scratch_space/warp/warp/warp/context.py", line 470, in __init__
self.core = warp.build.load_dll(warp_lib)
File "/home/user/scratch_space/warp/warp/warp/build.py", line 285, in load_dll
dll = ctypes.CDLL(dll_path)
File "/home/user/anaconda3/envs/warp/lib/python3.9/ctypes/__init__.py", line 382, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /home/user/scratch_space/warp/warp/warp/bin/warp.so: undefined symbol: _ZN2wp24hash_grid_rebuild_deviceERKNS_8HashGridEPKNS_4vec3Ei
This looks like an interesting library, but after reviewing the license I realized it says it's for non-commercial use only. I think that information should be made more clear so people don't invest time unless they are comfortable with that limitation. (Of course an even better solution would be to change the license to something like Apache 2, in case you are open to that option.)
How is the user supposed to implement functionality with two levels of parallelism, for example matrix-matrix multiplication?
My first thought was to implement two kernels, where one performs matrix-vector multiplication and the second one calls the first kernel for each column of the second matrix. I know that this could be done via multi-dimensional grid bounds, but just for the sake of it.
E.g. for matrices and vectors of size 3:
@wp.func
def row_mult(row: wp.array(dtype=float), v: wp.array(dtype=float)):
    return row[0]*v[0] + row[1]*v[1] + row[2]*v[2]

@wp.kernel
def mat_vec_mul(a: wp.array(dtype=float, ndim=2), b: wp.array(dtype=float), c: wp.array(dtype=float)):
    id = wp.tid()
    c[id] = row_mult(a[id], b)

@wp.kernel
def mat_mat_mul(a: wp.array(dtype=float, ndim=2), b: wp.array(dtype=float, ndim=2), c: wp.array(dtype=float, ndim=2)):
    id = wp.tid()
    wp.launch(kernel=mat_vec_mul, dim=3, inputs=[a, b[id]], outputs=[c[id]], device="cuda")

# calling
dim = 15
a = wp.array(np.arange(9).reshape((3,3)), dtype=float, device="cuda")
c = wp.array(0*np.eye(dim, dtype=float), dtype=float, device="cuda")
b = wp.array(np.eye(dim, dtype=float), dtype=float, device="cuda")
wp.launch(kernel=mat_mat_mul, dim=dim, inputs=[a, b], outputs=[c], device="cuda")
print(c)
But apparently this causes RuntimeError: Could not find function wp.launch as a built-in or user-defined function. Note that user functions must be annotated with a @wp.func decorator to be called from a kernel.
Any workaround for this?
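The multi-dimensional grid route mentioned above avoids the nested launch altogether: each thread owns one output element and loops over the inner dimension itself. The sketch below emulates that pattern serially in plain Python so it runs without a GPU; `launch_2d` is a hypothetical stand-in for a 2D launch, not wp.launch, and the kernel body is ordinary Python rather than Warp code.

```python
def launch_2d(kernel, dim, inputs):
    """Serial stand-in for a 2D launch: call the kernel body once per (i, j) index."""
    rows, cols = dim
    for i in range(rows):
        for j in range(cols):
            kernel(i, j, *inputs)


def mat_mat_mul(i, j, a, b, c):
    # Each "thread" owns exactly one output element, so no nested launch is needed.
    acc = 0.0
    for k in range(len(b)):
        acc += a[i][k] * b[k][j]
    c[i][j] = acc


a = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
b = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
c = [[0.0] * 3 for _ in range(3)]
launch_2d(mat_mat_mul, dim=(3, 3), inputs=[a, b, c])
print(c)
```

The second level of parallelism is traded for a short serial loop per thread, which is usually acceptable when the inner dimension is small, as in the size-3 example here.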
My use case is to compute Jacobians of an unrolled simulation trajectory, i.e.
So far it seems structs are supported as arguments to kernels but cannot be used in functions. It would be nice to add support for structs in functions, for both arguments and return values. This should be considered low priority. I put it here in case others find the same issue I did.
After building from source, I encounter the following error when running tests.
FileNotFoundError: Could not find module 'warp.dll' (or one of its dependencies). Try using the full path with constructor syntax.
After spending some time on the problem, I figured out that this is due to a bug of Python 3.8 on Windows.
This stackoverflow post pointed out the issue.
https://stackoverflow.com/questions/59330863/cant-import-dll-module-in-python/64472088#64472088
The author of the above post has reported this bug and it has been fixed for future python versions.
python/cpython#86280
In short, the solution is to explicitly pass winmode=0 to ctypes.CDLL. In Python 3.8, the default value of winmode is set to None, which is unintended as it is not consistent with the documentation. This argument only affects the loading logic under Windows.
Besides this change, I also need to pass the full path of the file warp.dll instead of just the filename in order to successfully load the file. This is suggested in the error message above.
I'll be creating a pull request to fix this issue soon. I think python 3.8 users on Windows will encounter this problem and will benefit from the fix.
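The two changes described above (explicit winmode and a full path) could look roughly like the sketch below. The helper names `cdll_load_args` and `load_library` and the `bin` directory are assumptions for illustration; this is not the code from the eventual pull request.

```python
import ctypes
import os
import sys


def cdll_load_args(lib_dir, lib_name, platform=sys.platform):
    """Build the (path, kwargs) pair to hand to ctypes.CDLL."""
    # Full path rather than a bare filename, as the error message suggests.
    path = os.path.abspath(os.path.join(lib_dir, lib_name))
    kwargs = {}
    if platform == "win32":
        kwargs["winmode"] = 0  # explicit value, as described in the post above
    return path, kwargs


def load_library(lib_dir, lib_name):
    """Load the shared library with the arguments chosen above."""
    path, kwargs = cdll_load_args(lib_dir, lib_name)
    return ctypes.CDLL(path, **kwargs)


path, kwargs = cdll_load_args("bin", "warp.dll", platform="win32")
print(os.path.isabs(path), kwargs)
```

Keeping the argument selection in a separate pure function makes the Windows-specific branch easy to check on any platform without an actual DLL.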
My environment
Seems to be a bug with wp.pow for negative values:
@wp.kernel
def test(tt: wp.array(dtype=wp.float32)):
    tt[0] = wp.pow(-1., 2.)

tt = wp.zeros((1,), dtype=wp.float32, device='cuda')
wp.launch(test, dim=1, inputs=[tt], device='cuda')
print(tt)
The above prints [nan] while it should print [1.], since (-1)^2 = 1. Could be an issue with my graphics card, who knows, but if not then it's probably a bug somewhere. (I'm getting the correct value with device='cpu'.)
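One possible explanation, which this report does not confirm, is that the GPU path evaluates the power through a fast approximation of the form 2^(y * log2(x)). That identity only holds for positive x, so a negative base produces NaN even though mathematically (-1)^2 = 1. The sketch below reproduces the effect in plain Python; `naive_pow` and `safe_pow` are illustrative names, not Warp or CUDA functions.

```python
import math


def naive_pow(x, y):
    """x**y computed as 2**(y * log2(x)); log2 of a negative number is undefined."""
    if x < 0.0:
        return math.nan  # what a NaN-propagating log2 would lead to
    if x == 0.0:
        if y == 0.0:
            return 1.0
        return 0.0 if y > 0.0 else math.inf
    return 2.0 ** (y * math.log2(x))


def safe_pow(x, y):
    """Handle negative bases with integer-valued exponents before falling back."""
    if x < 0.0 and float(y).is_integer():
        magnitude = naive_pow(-x, y)
        return magnitude if int(y) % 2 == 0 else -magnitude
    return naive_pow(x, y)


print(naive_pow(-1.0, 2.0), safe_pow(-1.0, 2.0), safe_pow(-2.0, 3.0))
```

If this is indeed the cause, it would also fit the observation that the CPU device returns the correct value.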
I have a @wp.func that is part of a larger call history. However, the part that it is failing at is:
@wp.func
def calc__force(rr_i: wp.vec3,
                ri: float,
                vv_i: wp.vec3,
                pn_rr: wp.array(dtype=wp.vec3),
                pn_vv: wp.array(dtype=wp.vec3),
                pn_r: wp.array(dtype=float),
                ):
    force = wp.zeros_like(rr_i)
Resulting in:
(The top part of the error, with warp being properly initialized)
Warp initialized:
Version: 0.3.1
CUDA device: NVIDIA RTX A4500
Kernel cache: C:\Users\local user\AppData\Local\NVIDIA Corporation\warp\Cache\0.3.1
Error: Could not find function wp.zeros_like as a built-in or user-defined function. Note that user functions must be annotated with a @wp.func decorator to be called from a kernel. while transforming node <class '_ast.Call'> in func: calc_agent_force at line: 11 col: 12:
force = wp.zeros_like(rr_i)
and then the same function is being mentioned at the bottom of the error
....\lib\site-packages\warp\codegen.py", line 1054, in eval
raise RuntimeError(f"Could not find function {'.'.join(path)} as a built-in or user-defined function. Note that user functions must be annotated with a @wp.func decorator to be called from a kernel.")
RuntimeError: Could not find function wp.zeros_like as a built-in or user-defined function. Note that user functions must be annotated with a @wp.func decorator to be called from a kernel.
Based on the docs, this should be working. However, I assume there is something else wrong, somewhere, and this error is being incorrectly displayed.
currently the types available are:
scalar_types = [int8, uint8, int16, uint16, int32, uint32, int64, uint64, float32, float64]
vector_types = [vec2, vec3, vec4, mat22, mat33, mat44, quat, transform, spatial_vector, spatial_matrix]
there doesn't seem to be a bool type?
My specific use-case is to implement something like the activeIndices from Nvidia Flex that activates / deactivates particles on the GPU while still keeping them in memory.
I'm a fresh learner of CUDA and have been getting familiar with the C++ CUDA API. This repo looks interesting! Will it eventually support the full C++ CUDA API, so that any kind of CUDA kernel can be implemented using only Python?
From example_dem.py:
def update(self):
    with wp.ScopedTimer("simulate", active=True):
        if (self.use_graph):
            with wp.ScopedTimer("grid build", active=False):
                self.grid.build(self.x, self.grid_cell_size)
            with wp.ScopedTimer("solve", active=False):
                wp.capture_launch(self.graph)
                wp.synchronize()
        else:
            for s in range(self.sim_substeps):
                with wp.ScopedTimer("grid build", active=False):
                    self.grid.build(self.x, self.point_radius)
                with wp.ScopedTimer("forces", active=False):
                    wp.launch(kernel=apply_forces, dim=len(self.x), inputs=[self.grid.id, self.x, self.v, self.f, self.point_radius, self.k_contact, self.k_damp, self.k_friction, self.k_mu], device=self.device)
                    wp.launch(kernel=integrate, dim=len(self.x), inputs=[self.x, self.v, self.f, (0.0, -9.8, 0.0), self.sim_dt, self.inv_mass], device=self.device)
            wp.synchronize()
The captured kernels are defined as:
if (self.use_graph):
    wp.capture_begin()
    for s in range(self.sim_substeps):
        with wp.ScopedTimer("forces", active=False):
            wp.launch(kernel=apply_forces, dim=len(self.x), inputs=[self.grid.id, self.x, self.v, self.f, self.point_radius, self.k_contact, self.k_damp, self.k_friction, self.k_mu], device=self.device)
            wp.launch(kernel=integrate, dim=len(self.x), inputs=[self.x, self.v, self.f, (0.0, -9.8, 0.0), self.sim_dt, self.inv_mass], device=self.device)
    self.graph = wp.capture_end()
If use_graph==True, grid.build() is called once, then the kernels are replayed 64 times. If use_graph==False, grid.build() is called every time the kernels are called. I think this is a bug.
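To make the mismatch concrete, the sketch below mirrors only the call pattern of the two branches and counts how often each step runs. The `Counter` class and the standalone `update` function are stand-ins for this illustration; the value 64 for the number of substeps is taken from the description above.

```python
class Counter:
    def __init__(self):
        self.grid_builds = 0
        self.kernel_steps = 0


def update(counter, use_graph, sim_substeps):
    """Mirror only the call pattern of the two branches, not the physics."""
    if use_graph:
        counter.grid_builds += 1              # grid.build() sits outside the captured graph
        counter.kernel_steps += sim_substeps  # replaying the graph runs every captured substep
    else:
        for _ in range(sim_substeps):
            counter.grid_builds += 1          # grid.build() sits inside the substep loop
            counter.kernel_steps += 1


graph, eager = Counter(), Counter()
update(graph, use_graph=True, sim_substeps=64)
update(eager, use_graph=False, sim_substeps=64)
print(graph.grid_builds, eager.grid_builds)
```

Both branches run the same number of kernel substeps, but the graph branch queries a grid that was built once per frame instead of once per substep.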
I tried pip install warp as described in this blog post but even after it says it was installed I get this:
ubuntu@ip-172-31-31-7:/mnt/extra/pieper/SlicerDMRI$ pip3 install warp
Collecting warp
Using cached warp-0.0.1-py3-none-any.whl (1.1 kB)
Installing collected packages: warp
Successfully installed warp-0.0.1
ubuntu@ip-172-31-31-7:/mnt/extra/pieper/SlicerDMRI$ python3
Python 3.8.10 (default, Nov 26 2021, 20:14:08)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import warp
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'warp'
This is ubuntu 20.04 with the standard python and pip packages.
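The log shows that pip fetched a 1.1 kB wheel named warp 0.0.1, while other posts in this thread install the library with pip install warp-lang, so the two are probably different PyPI distributions. A small sketch for checking which distribution is actually present, using only the standard library; `installed_version` is a helper made up for this example.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of a PyPI distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


for name in ("warp", "warp-lang"):
    print(name, installed_version(name))

missing = installed_version("surely-not-a-real-distribution-name")
```

The distribution name on PyPI and the importable module name do not have to match, which is why a successful install can still end in ModuleNotFoundError.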
It would be nice to have support for the Apple M1, at least for the CPU device, to make warp code portable to more environments, in my case to use warp for teaching university courses. When I tried to run a simple test on the M1, the execution breaks with the error
OSError: dlopen(/opt/homebrew/lib/python3.9/site-packages/warp/bin/warp.dylib, 0x0006): tried: '/opt/homebrew/lib/python3.9/site-packages/warp/bin/warp.dylib' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e'))
I am not sure if this is due to my installation or simply that warp does not get packaged with armv8 support.
When running examples, the errors show that I can't open USD files. I've installed usd-core.
(warp) xx@xx:~/code/warp$ python examples/example_mesh.py
Warp initialized:
Version: 0.1.25
Using CUDA device: GeForce RTX 2060
Using CPU compiler: /usr/lib/ccache/g++
Traceback (most recent call last):
File "example_mesh.py", line 103, in <module>
usd_stage = Usd.Stage.Open(os.path.join(os.path.dirname(__file__), "assets/bunny.usd"))
pxr.Tf.ErrorException:
Error in 'pxrInternal_v0_22__pxrReserved__::UsdStage::Open' at line 878 in file /opt/USD/pxr/usd/usd/stage.cpp : 'Failed to open layer @assets/bunny.usd@'
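The layer path in the error is the bare relative path assets/bunny.usd, which is what os.path.dirname(__file__) produces when __file__ has no directory component. That would make the lookup depend on the current working directory. This is a guess at the cause, not a confirmed diagnosis; the sketch below shows the difference, with `asset_path` as a hypothetical helper.

```python
import os


def asset_path(script_file, relative):
    """Resolve an asset relative to the script's own directory, independent of the cwd."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(script_dir, relative)


# With a bare filename, dirname() is empty and the result stays relative to the cwd.
bare = os.path.join(os.path.dirname("example_mesh.py"), "assets/bunny.usd")
resolved = asset_path("example_mesh.py", "assets/bunny.usd")
print(bare)
print(os.path.isabs(resolved))
```

If the guess is right, running the example from inside the examples directory should also work as a quick check.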
Hello there!
Is the method implemented in Warp for testing whether a point is inside a closed mesh correct? I see that it checks the dot product of the difference vector between the query point and the closest point on the mesh with the triangle normal to see if it is inside. However, the paper "Generating Signed Distance Fields From Triangle Meshes" says that always using the triangle normal is incorrect if the closest point is on a vertex or an edge of the mesh. Instead, it uses vertex and edge (pseudo) normals to get the correct sign of the distance.
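A small hand-built case shows the failure mode. Take a sharp wedge whose ridge is the z-axis, and a query point outside the wedge whose closest surface point lies on the ridge. The sign from one of the two face normals comes out negative, while the edge pseudo-normal (taken here as the sum of the two incident face normals, as I understand the paper) gives the right sign. The geometry and numbers are made up for this illustration and are not taken from Warp's implementation.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def add(u, v):
    return tuple(a + b for a, b in zip(u, v))


# A sharp wedge whose ridge is the z-axis; the solid lies where both dot products are <= 0.
normal_a = (0.6, 0.8, 0.0)    # outward unit normal of face A
normal_b = (0.6, -0.8, 0.0)   # outward unit normal of face B

query = (1.0, -1.0, 0.0)      # outside the wedge; its closest surface point is on the ridge
closest = (0.0, 0.0, 0.0)
offset = tuple(q - c for q, c in zip(query, closest))

sign_face_a = dot(offset, normal_a)                  # negative: wrongly reports "inside"
sign_face_b = dot(offset, normal_b)                  # positive: reports "outside"
sign_pseudo = dot(offset, add(normal_a, normal_b))   # edge pseudo-normal: "outside"
print(sign_face_a, sign_face_b, sign_pseudo)
```

Which face a closest-point query reports for a point on a shared edge is essentially arbitrary, so the face-normal test can flip sign depending on traversal order.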
I got this error message when running "python -m warp.tests":
nvrtc: error: failed to open nvrtc-builtins64_113.dll
environment: Win 10 x64; CUDA toolkit 11.5; Python 3.8.7; Nvidia graphics driver 496.13; Visual Studio 2019
steps to reproduce: 1. download and unzip v0.1.25-alpha; 2. "pip install ." ; 3. "pip install usd-core" 4. "python -m warp.tests". 5. error.
The .dll file in question is found in both the warp project directory, and also in the python install directory:
"C:\Users<my username>\AppData\Local\Programs\Python\Python38\Lib\site-packages\warp\bin\nvrtc-builtins64_113.dll"
So maybe it's just a pathing issue - any pointers?
There seems to be some missing arguments in the docstring
Line 652 in fc7d325
Thank you for this cool library!
I noticed a small issue in the function ModelBuilder.add_soft_mesh in warp/sim/model.py, line 1787:
p = wp.quat_rotate(rot, v * scale) + pos
where pos is List[float]. It seems that addition between wp.vec3 and List[float] is not supported, and this line should be
p = np.array(wp.quat_rotate(rot, v * scale)) + pos
according to other lines in the same file.
Thank you!
Hey there, just installed the repo, and did the following:
$ conda create -n warp python=3.9
...
$ conda activate warp
$ pip install -e .
...
$ python examples/example_dem.py
Traceback (most recent call last):
File "/home/fabrice/Source/warp/examples/example_dem.py", line 24, in <module>
wp.init()
File "/home/fabrice/Source/warp/warp/context.py", line 1097, in init
runtime = Runtime()
File "/home/fabrice/Source/warp/warp/context.py", line 470, in __init__
self.core = warp.build.load_dll(warp_lib)
File "/home/fabrice/Source/warp/warp/build.py", line 285, in load_dll
dll = ctypes.CDLL(dll_path)
File "/home/fabrice/miniconda3/envs/warp/lib/python3.9/ctypes/__init__.py", line 382, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /home/fabrice/Source/warp/warp/bin/warp.so: undefined symbol: _ZN2wp24hash_grid_rebuild_deviceERKNS_8HashGridEPKNS_4vec3Ei
So then I tried:
$ python build_lib.py
Namespace(msvc_path=None, sdk_path=None, cuda_path=None, mode='release', verbose=True)
Warning: building without CUDA support
Building /home/fabrice/Source/warp/warp/bin/warp.so
g++ -O3 -DNDEBUG -DWP_CPU -fPIC --std=c++11 -c "/home/fabrice/Source/warp/warp/native/warp.cpp" -o "/home/fabrice/Source/warp/warp/native/warp.cpp.o"
build took 1178.73 ms
g++ -shared -Wl,-rpath,'$ORIGIN' -o '/home/fabrice/Source/warp/warp/bin/warp.so' "/home/fabrice/Source/warp/warp/native/warp.cpp.o"
link took 72.52 ms
$ python examples/example_dem.py
Traceback (most recent call last):
File "/home/fabrice/Source/warp/examples/example_dem.py", line 24, in <module>
wp.init()
File "/home/fabrice/Source/warp/warp/context.py", line 1097, in init
runtime = Runtime()
File "/home/fabrice/Source/warp/warp/context.py", line 470, in __init__
self.core = warp.build.load_dll(warp_lib)
File "/home/fabrice/Source/warp/warp/build.py", line 285, in load_dll
dll = ctypes.CDLL(dll_path)
File "/home/fabrice/miniconda3/envs/warp/lib/python3.9/ctypes/__init__.py", line 382, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /home/fabrice/Source/warp/warp/bin/warp.so: undefined symbol: _ZN2wp24hash_grid_rebuild_deviceERKNS_8HashGridEPKNS_4vec3Ei
(No change).
OS: Ubuntu 20.04 LTS
$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
I'm probably doing something wrong, and I didn't read all the README/documentation thoroughly. But I thought I'd post this, just in case this information might be useful.
Cheers!