Comments (15)
Please, provide more information. Potential API changes would be especially helpful.
I thought it may be useful to have an ability to Map/Unmap ranges
I definitely misunderstand something, but current API allows to map a range:
void* (NRI_CALL *MapBuffer)(NRI_NAME_REF(Buffer) buffer, uint64_t offset, uint64_t size);
Is your concern is that it's only 1 range, but you want to get multiple? If yes, why not Map / Unmap several times? Or just not Map a bigger range?
you have two big vertex and index buffers and you only need to update some region of them at once)
Why is Map range -> Modify -> Unmap
not enough? BTW, IMO, better use UPLOAD
heap only for streaming and not for rendering (except minor lightweight passes). I.e.
A (resource for streaming in UPLOAD heap, ring buffer)
Map range in A with proper offset to avoid touching data referenced by GPU
Update
Unmap (persistent mapping is nice, but only if you don't care about D3D11)
Copy to a destination resource on DEVICE
from nri.
Is your concern is that it's only 1 range, but you want to get multiple? If yes, why not Map / Unmap several times? Or just not Map a bigger range?
Is this the correct use of this function? What about a views in with different offsets in the buffer/texture? I know that D3D12 and D3D11 supports multiple subranges update in one call to reduce GAPI call count.
Why is Map range -> Modify -> Unmap not enough? BTW, IMO, better use UPLOAD heap only for streaming and not for rendering (except minor lightweight passes). I.e.
I'm using DEVICE
heap for each buffer that is not updated every frame, the problem is that I can only update 1 range in helper function UploadData
. So I can't update subranges on device without copying unnecessary data.
from nri.
It seems to me that you have a resource in DEVICE
heap, which has several "assigned" SRVs (logical resources), which in order you want to update via a single Map / Unmap
call, right? If yes, bad usage. In my comment above I explained good practice,
UploadData
can be improved. Which API do you expect to see?
from nri.
which has several "assigned" SRVs (logical resources)
Not SRVs, CBVs. I'm using dynamic constant buffers for that case.
UploadData can be improved. Which API do you expect to see?
This function uses a queue to load resources and can only load one resource per desc. I think you can improve it by batch loading data into one larger HOST_UPLOAD
buffer and then updating a couple of resources at a time. Also, it is sometimes useful to schedule upload commands in a command list and then upload them all at once. Instead, right now I need to create an upload buffer for each resource I want to upload, or have a separate ring buffer that allows me to batch uploading as I said earlier.
from nri.
Ouch, I understood. The truth is that UploadData
has been designed for lazy guys. It's purpose is to upload data in the simplest potentially unoptimized way in LOADING phase. Kinda "fire and forget" way. Yes, I do agree that this function just doesn't suit for resources updating at run time.
What can we do? I think improving UploadData
itself can be a good start:
- replace
commandQueue
withcommandBuffer
- improve internal code (do more batching)
But, IMO, I'm personally planning to write my own streamer on "plain" NRI which is based on the stuff I already mentioned:
- pre-allocated ring buffer in
UPLOAD
heap - a user enqueues a request containing:
- what to upload
- destination "range" in a resource located in
DEVICE
heap
- streamer maps once and writes all requests in concatenated form
- then copies to the clients on GPU (better via COPY queue)
Not SRVs, CBVs. I'm using dynamic constant buffers for that case
Here is how I update "dynamic CB" in my project:
https://github.com/NVIDIAGameWorks/RayTracingDenoiser/blob/master/Integration/NRDIntegration.hpp#L590
from nri.
replace commandQueue with commandBuffer
I don't think this is a good solution, because you can use this function multiple times in different queues (for example: I want to transfer texture mips in the copy queue and load critical path textures in the graphics queue, but they will use the same load buffer. Oops).
I think the second solution would be preferable. Also, I would like to have a different instance of the ring buffer for each queue (because sometimes you need to load data into both queues at the same time).
Here is how I update "dynamic CB" in my project:
Thank you for sample. I'll take a look at it.
from nri.
I don't think this is a good solution, because you can use this function multiple times in different queues (for example: I want to transfer texture mips in the copy queue and load critical path textures in the graphics queue, but they will use the same load buffer. Oops).
To be honest don't see a problem. Queue
is used to create internal temp CommandBuffers
and submit work, and wait for idle. It's good for loading, not suitable for real time. I think, that if we pass CommandBuffer
instead this function will only write commands into the command buffer, but submitting the work and synchronization will move to the user side. Less convenient, but at least "non blocking". Not advertising this much, because UploadData
will be turned into something else.
from nri.
...but they will use the same load buffer
For clarity, UploadData
blocks execution, it "waits for idle" under the hood. Upload buffer is internal and different for each call of UploadData
.
from nri.
For clarity,
UploadData
blocks execution, it "waits for idle" under the hood. Upload buffer is internal and different for each call ofUploadData
.
That's what I don't like about it. I would prefer more manual management.
from nri.
OK, the intermediate conclusion:
UploadData
behaves how it has been designedUploadData
!=StreamData
(which currently doesn't exist)- I suggest you just take the contents of
UploadData
, modify according to your usage pattern and... later we can decide is there a sharable code, which can be organized intoStreamData
or is there a code blocks which can be used to improveUploadData
Deal? ;)
P.S. I made the following fixes so far in UploadData
:
- only 1 command allocator is needed even in case of mGPU
- removed unnecessary reference stored in the helper class
- NRI: removed
nodeMask
fromCreateCommandAllocator
(just not needed in any API)
from nri.
Deal? ;)
OK, why not. I'll get on it as soon as I get my hands on it.
from nri.
So I wrote snippet for this case, I think it might be helpful: https://gist.github.com/vertver/6f522aa83f09776d0ece7c11ea124835
from nri.
Thanks. I need to meditate on the code...
from nri.
Your snippet is useful, but I don't think it needs to be a part of NRI. Data streaming can be organized in one way or another, I would prefer to let people decide how to better solve this task on a higher level (application level). Even in your case, overflow is not handled: it might not be needed, or might be... Moreover, streaming
functionality doesn't fit into HelperInterface
, because it assumes "state" which need to be kept for some time, i.e. an object is needed. Probably, in the nearest future I will agree to add a standalone extension for data streaming, but not now. I see benefits though... Feel free to file a new issue "RFE: add streaming functionality for lazy guys" :)
Have we exhausted this issue? Can it be closed?
from nri.
Closing this one. Created a new issue #34 for adding "streaming" functionality.
from nri.
Related Issues (20)
- Wrong barriers for readback buffer HOT 3
- Unknown interface HelperInterface
- [RFE] SwapChain: explicitly expose "waitable" (currently always on in D3D) and add VK support HOT 1
- [RFE] Work graphs support
- [RFE] Shared resources via NT handle
- [RFE] Add support for `D3D12_QUERY_HEAP_TYPE_COPY_QUEUE_TIMESTAMP` HOT 1
- [RFE] Add "low latency" extension HOT 1
- Add non shader visible descriptor pool HOT 11
- Bug: X11 and Wayland should be treated mutually exclusive during swapchain creation HOT 2
- [RFE/VK] Allow disabling bound checks via `VK_EXT_robustness2` HOT 1
- overlapping incompatibility with VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT | VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT
- [RFE] Instancing problem in GPU-driven rendering HOT 62
- clang-cl compile error HOT 6
- [BUG] PSL tier is needed instead of `isProgrammableSampleLocationsSupported` HOT 2
- [RFE] Draw Indirect Count HOT 2
- [RFE] Add ability to disable NVAPI and AMD AGS HOT 1
- Expose `AreEnhancedBarriersSupported` in device's desc HOT 2
- [RFE/D3D12] `drawID` emulation
- [RFE] Hook up `DirectSR` (and future VK ext)
- [RFE] Add support for queue priority
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from nri.