owl-project / owl
Home Page: http://owl-project.github.io
License: Apache License 2.0
We already have a sample that uses transforms (sierpinski), but that one only uses translations, which don't affect normals and don't require distinguishing world-space from object-space positions. It therefore doesn't use or test any of the point/vector/normal transforms required for 'real' transform matrices.
It would be nice to have an additional sample that uses more complex transforms and also uses, tests, and demonstrates the OptiX world/object-space transforms in the device code. This will likely also involve adding some convenience functions to ll/deviceAPI.h.
Not critical, but would be nice to have.
In particular on Windows, there's a default location for the SDK; should check for that if OptiX_INSTALL_DIR isn't set.
owlSet is currently only implemented for 3f, 2i, etc, as well as Buffer ... but not for raw pointers.
Workaround is to set the pointer as a 1ul variable (which works just fine), but that's ugly.
The dimension of the CUDA launch for computing the bounds of a usergeom is currently computed as follows:
uint32_t boundsFuncBlockSize = 128;
uint32_t numPrims = (uint32_t)ug->numPrims;
vec3i blockDims(owl::common::divRoundUp(numPrims,boundsFuncBlockSize),1,1);
vec3i gridDims(boundsFuncBlockSize,1,1);
This seems to work just fine so far, BUT for large numbers of prims it has two (potential) issues:
a) the 'divRoundUp' on uint types may get overflows
b) the grid Dimensions in x may be too big for what cuda allows (64K on some cards).
May have to go to setting both x and y; then also need to fix the bounds call itself to adjust for that.
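One way the x/y split could look on the host side is sketched below. This is only an illustration under stated assumptions: `LaunchDims`, `divRoundUp64`, `computeBoundsLaunchDims`, and `linearPrimIndex` are hypothetical names, not OWL functions, and the per-dimension grid limit of 65535 is an assumed conservative value that would normally be queried from the device.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper type, not part of OWL.
struct LaunchDims {
  uint32_t gridX, gridY; // number of blocks in x and y
  uint32_t blockSize;    // threads per block (1D blocks)
};

// Round-up division done in 64 bits, so (a + b - 1) cannot wrap
// for any realistic primitive count.
inline uint64_t divRoundUp64(uint64_t a, uint64_t b) { return (a + b - 1) / b; }

// Spread the blocks needed for numPrims over a 2D grid so that neither
// dimension exceeds maxGridDim (65535 is an assumption, see lead-in).
inline LaunchDims computeBoundsLaunchDims(uint64_t numPrims,
                                          uint32_t blockSize = 128,
                                          uint32_t maxGridDim = 65535)
{
  assert(blockSize > 0 && maxGridDim > 0);
  uint64_t numBlocks = divRoundUp64(numPrims, blockSize);
  if (numBlocks == 0) numBlocks = 1; // keep the launch valid for empty input
  uint64_t gridY = divRoundUp64(numBlocks, maxGridDim);
  if (gridY > maxGridDim)
    throw std::runtime_error("too many prims for a 2D launch grid");
  uint64_t gridX = divRoundUp64(numBlocks, gridY);
  return { (uint32_t)gridX, (uint32_t)gridY, blockSize };
}

// Linear primitive index as the bounds kernel would have to compute it
// from its block/thread coordinates once y is used as well.
inline uint64_t linearPrimIndex(uint32_t blockIdxX, uint32_t blockIdxY,
                                uint32_t gridDimX, uint32_t threadIdxX,
                                uint32_t blockSize)
{
  uint64_t blockID = (uint64_t)blockIdxY * gridDimX + blockIdxX;
  return blockID * blockSize + threadIdxX;
}
```

Because the grid is rounded up in both dimensions, it can contain more threads than there are prims; the bounds kernel would need to exit early for any linear index that is not below numPrims.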
It would be awesome if owl could support DDS file format :)
currently need to write vec3f x = vec3f(vec4f(f4array[idx])) .... fix this.
Right now the bounds kernels for all prims in one geom are run in parallel, but different geoms in a group are run sequentially (so only one kernel is in flight at any time). This works great when all prims are in a few geoms, but creates giant build times for models with large numbers of independent geometries (i.e., when each stream line is its own geom).
... in readme.me.
Note: make sure optix 7 SDK and TBB are handled, in particular for windows
without this performance is bad when using multi-device
When using the high-level OWL API, it's currently unclear how to implement the OptiX denoiser, as users do not have direct access to the OptiX context required for optixDenoiserCreate.
I could see adding support for the denoiser in one of two ways: either (1) add a function that returns the current OptiX context handle to the end user, or (2) implement the OptiX denoiser within OWL and expose it through some abstraction API.
When I add_subdirectory(owl), I was finding that my RelWithDebug and other build types disappeared. It took me a while to find, but it seems that the configure_build_type.cmake is sneakily deleting them.
This was a pretty subtle issue I had a hard time fixing. I'd propose we remove that configure file, since OWL shouldn't be in control of parent project CMAKE_CONFIGURATION_TYPES or CMAKE_BUILD_TYPE.
peer access is required for nvlink; should enable by default.
ideally have c99 version of at least one example; minimal solution is to at least include the header file, create a context, and exit.
Currently, OWL does not support lists of buffers or lists of textures. Users instead must extract device pointers from OWLBuffers on a device by device basis, uploading these pointers to an OWLBuffer that's device-specific.
This usually results in applications sacrificing multi-GPU during prototyping, and it can quickly become too difficult to maintain multi-GPU compatibility long term. However, these lists of buffers and lists of texture handles are really important, eg for scenes with emissive triangles and/or textures accessed outside the closest hit.
At the same time, arbitrary buffers of pointers complicate matters under the hood for OWL. Buffers of buffer pointers might result in some pointers in that buffer to themselves point to buffers of pointers. Yuck!
Instead, I'd propose we add support for 1D groups of buffer pointers and groups of textures. These groups would act like flattened lists, with no possible multi-dimensional indirection. This way, OWL could under the hood create N flattened item lists, one per device, each flattened list containing only items allocated on the corresponding device.
If we could get that working, then it would be much easier for OWL end users to maintain multi-GPU compatibility. Later down the line, OWLBufferGroups and OWLTextureGroups could additionally contain hints about NVLink stuff.
at the very least, specify a flag that allows users to enable this.
all gdt:: namespace and gdt/ include dirs to be replaced with owl-common; purge no-longer-needed components. Goal is for apps to be able to use both gdt and owl without any naming or namespace conflicts.
OptiX comes with certain limits on the number of prims per GAS and the number of instances per IAS; going over these limits apparently triggers errors in debug mode, but apparently just creates wrong results in release mode. Should add a test for those values, and trigger an owl error if inputs exceed these limits.
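A minimal sketch of such a check follows. The `AccelLimits` struct and the `withinLimit`/`checkLimit`/`checkGeomGroup`/`checkInstanceGroup` helpers are hypothetical names used for illustration only. In a real build the two limit values would presumably be queried from the device via `optixDeviceContextGetProperty` (with `OPTIX_DEVICE_PROPERTY_LIMIT_MAX_PRIMITIVES_PER_GAS` and `OPTIX_DEVICE_PROPERTY_LIMIT_MAX_INSTANCES_PER_IAS`) rather than passed in by hand.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical container for the two limits; values are assumed to come
// from a device query and are NOT hard-coded OptiX constants.
struct AccelLimits {
  std::size_t maxPrimsPerGAS;
  std::size_t maxInstancesPerIAS;
};

inline bool withinLimit(std::size_t count, std::size_t limit)
{
  return count <= limit;
}

// Throw a descriptive error instead of silently building a broken accel.
inline void checkLimit(std::size_t count, std::size_t limit,
                       const std::string &what)
{
  if (!withinLimit(count, limit))
    throw std::runtime_error(what + ": " + std::to_string(count) +
                             " exceeds device limit of " +
                             std::to_string(limit));
}

// To be called before building a geometry acceleration structure.
inline void checkGeomGroup(std::size_t numPrims, const AccelLimits &limits)
{
  checkLimit(numPrims, limits.maxPrimsPerGAS, "primitives per GAS");
}

// To be called before building an instance acceleration structure.
inline void checkInstanceGroup(std::size_t numInstances,
                               const AccelLimits &limits)
{
  checkLimit(numInstances, limits.maxInstancesPerIAS, "instances per IAS");
}
```

Running the check in both debug and release builds would turn the silent wrong results described above into an explicit error.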
We already have a sample of instancing for the ll layer (sample 07 sierpinski), but none yet for the node graph layer.
TODO: take samples/ll/s07-sierpinski, and make an ng version in samples/ng/s07-sierpinski. All required API functionality should already be there, and the device code should be exactly the same, but it needs to get done.
right now we only have host pinned buffer (always on host) and device buffer (always on device).
for large models, it would be good to have managed memory as well - in theory that should allow paging.
right now miss progs always get used in the order they are created; better would be to let user set that explicitly.
needed to replicate HPG 2019 RTX tetmesh point query functionality
Just committed a "hotfix" for the missing owlInstanceGroupCreate() and owlInstanceGroupSetTransform() to unblock a project that needs those... but that functionality isn't clean enough yet, and needs work.
Notes for now:
More of a feature request than an issue.
I'm trying to convert my owlDeviceBuffers into owlManagedMemoryBuffers, to enable device-side memory oversubscription. However, since owlBufferUpload isn't implemented for owlManagedMemoryBuffers, I need to refactor my code to instead memcpy into a pointer through owlBufferGetPointer(). owlBufferGetPointer then wants me to specify which device the buffer belongs to, but managed memory and host memory don't belong to a particular device, so this is somewhat confusing.
If owl were to implement the same owlBufferUpload API for both managed memory buffers and host mapped buffers, it would be easier to refactor code to make the switch back and forth.
we have owlParamsLaunch, but owlLaunchParamsSet ... make consistent.
most types have release functions, module does not.
not critical - contextdestroy already releases the module - but nice to have.
TBB works on windows, so this shouldn't be forced to false on Win32. Maybe this could be changed to an optional dependency for all platforms?
instance groups don't free the temp memory they use for building the instances, leading to a memory leak if those instances get rebuilt every frame.
probably best solution: make destructor of DeviceMemory free the memory it allocated.
reproducer: see iw/group-rebuild-test branch.
The current API allows the user to specify a max instancing depth, and then sets OptiX up accordingly ... but does not actually check whether the created graph of group nodes adheres to the specified limit.
Should add a check, at least at group creation time, that tracks for each created group how deep the tree below it is, and whether that matches the context's configured depth.
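The creation-time check described above could be tracked incrementally, as in the following sketch. `GroupNode` and `createGroup` are hypothetical stand-ins rather than OWL's actual node classes, and the depth convention (a group without child groups has depth 0, each instancing level adds 1) is an assumption made for the example.

```cpp
#include <algorithm>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a group node in the node graph.
struct GroupNode {
  std::vector<std::shared_ptr<GroupNode>> children; // empty for geometry groups
  int depth = 0; // number of instancing levels below this node
};

// Create a group from already-created children and verify its depth.
// Children must exist before their parent, so each child's depth is
// already known and no graph traversal is needed here.
inline std::shared_ptr<GroupNode>
createGroup(std::vector<std::shared_ptr<GroupNode>> children,
            int maxInstancingDepth)
{
  auto node = std::make_shared<GroupNode>();
  for (const auto &child : children)
    node->depth = std::max(node->depth, child->depth + 1);
  if (node->depth > maxInstancingDepth)
    throw std::runtime_error("group depth " + std::to_string(node->depth) +
                             " exceeds configured max instancing depth " +
                             std::to_string(maxInstancingDepth));
  node->children = std::move(children);
  return node;
}
```

This only covers children passed at creation time; if children can be replaced later, the same depth update would have to be repeated on every such change and propagated to the parents.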
For a recursion depth of 6, the owl prime sample generates 16M triangles and runs through.
For depth 7 and above (where it should produce 64M+ triangles), I currently get this:
error (/home/iwald/Projects/owl/owl/ll/TrianglesGeomGroup.cpp: line 227): an illegal memory access was encountered
For simple use cases, it'll be common to only change raygen values; it doesn't make sense to force a complete SBT rebuild. Todo: add a function that allows rebuilding only the raygen and/or miss program(s).
Simplest way to reproduce is to run the interactive sample like this: CUDA_VISIBLE_DEVICES=1 ./int01-simpleTriangles. This Invalid Device Ordinal error occurs in groupBuildAccel in DeviceGroup.cpp.
I don't seem to run into the same issue with CUDA_VISIBLE_DEVICES=1 ./sample01-simpleTriangles. However, I do see the above exception with ViSII
Running into the above exception when using managed memory on my local environment. It seems we can fix this issue by checking if this cudaMemAdvise is supported by the current device, and fall back to not advising if it's not supported.
i.e., have one function that creates the group (and that also specifies the number of time steps for motion blur), then another function to set all children, one to set all transforms, etc.
make cmake try to find TBB, and add single-threaded fallback if not found.
just too much trouble asking users to install tbb on windows.
In ViSII, I link to GLFW statically before adding the owl subdirectory to my project.
Before, I disabled OWL from also linking in GLFW statically through the build samples option.
Now that that's commented out, I cannot disable OWL from linking in GLFW statically, and it's breaking my parent project's build.
starting with something around driver 440.40, the sierpinski sample no longer works with rec depth > 2. Already investigating if this is a bug in owl, or whether the driver changed how multi-level instancing needs to be used.
current use of multi-device in even the intro samples is confusing. suggest changing all samples to use only one GPU, then have a separate sample that shows just multi-device.