
Comments (3)

krOoze commented on May 29, 2024

Best questions are the self-answering ones. :p

I don't necessarily claim that my system is particularly scalable. Frankly, with low-level APIs one should be careful of a "one-size-fits-all" mentality. E.g., a sufficiently advanced engine might want to work with stuff across two frames (at the cost of latency), or it might have nontrivial synchronization with stuff not immediately related to rendering (sim, input, streaming, ...). If one wants the ideal approach for a particular use case, one has to think about it a bit holistically and build a system that fits that use case specifically.

Fence on the submit has the nice property that it covers everything on the queue before it, plus it covers all the swapchain image acquires by inference, through the semaphores the submits take as parameters. I find it more deterministic and less brainfucky.

Fence on acquire has the undesirable property that it covers only the acquire. By inference you could also conclude that work on image K has been done. But you do not even know upfront which K you will get from the acquire, so this is somewhat unpredictable. Fence-on-submit, by contrast, is predictably round-robin (frame 0, frame 1, frame 0, frame 1, ...), and I found dealing with the swapchain image index more convenient that way, because it can be more isolated from other stuff.
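
A toy model of that round-robin property, in Python rather than actual Vulkan calls. The names `FrameSlot` and `run_frames` are made up for illustration, the acquired image order is invented, and the model assumes two frames in flight with GPU work that finishes instantly. The boolean flag only stands in for a fence passed to the submit.

```python
from dataclasses import dataclass

FRAMES_IN_FLIGHT = 2  # assumption: two frame slots (frame 0, frame 1, ...)


@dataclass
class FrameSlot:
    """Per-frame resources; the flag stands in for a fence given to the submit."""
    index: int
    fence_signaled: bool = True  # starts signaled so the first wait does not block


def run_frames(acquired_images, frames_in_flight=FRAMES_IN_FLIGHT):
    """Return (slot index, image index) pairs for a sequence of acquires."""
    slots = [FrameSlot(i) for i in range(frames_in_flight)]
    log = []
    for frame_number, image_index in enumerate(acquired_images):
        # The slot is known before the acquire; the image index is not.
        slot = slots[frame_number % frames_in_flight]
        # Stand-in for waiting on and resetting the slot's fence.
        if not slot.fence_signaled:
            raise RuntimeError("slot still in use by the GPU")
        slot.fence_signaled = False
        # Stand-in for recording and submitting with the slot's fence.
        log.append((slot.index, image_index))
        slot.fence_signaled = True  # toy model: the GPU finishes immediately
    return log


# The image order is whatever the presentation engine hands out (made up here).
for slot_index, image_index in run_frames([2, 0, 1, 0, 2, 1]):
    print(f"frame slot {slot_index} -> swapchain image {image_index}")
```

The slot sequence stays 0, 1, 0, 1, ... no matter which image indices come back, so per-frame resources can be indexed by slot and the image index only matters where the swapchain image itself is touched.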

With extensions there are, or will be, a few more options. Notably, with VK_KHR_present_wait it could be possible to build a different system relying on that. I think this would be somewhat comparable to how modern Direct3D does it with GetFrameLatencyWaitableObject.

from hello_triangle.

plasmacel commented on May 29, 2024

I also found a now-offline blog post by Timothy Lottes (for some reason he completely disappeared from the internet) exploring Vulkan swapchain optimization and synchronization, where he claims:

Using the acquire fence to stall command buffer recording is a recipe for failure.

                   [__scanoutA__][__scanoutB__][____miss____]
        [__gpuA__][__gpuB__]     |      [__gpuA__] <- fail
[cpuA_][cpuB_]                   [cpuA_]
                                 |
                                 |<-- acquire fence

An optimal double-buffered FIFO/FIFO_RELAXED setup with wait-on-submit, by contrast, is:

  • 2-deep swapchain with FIFO (or better FIFO_RELAXED) for v-sync
  • 2-frame-deep buffering of a frame's command buffers
  • Use at least one beginning-of-frame fence to stop begin/reset from happening during prior usage
      • Fence is set to trigger after last command buffer of that frame is done
  • Use at least 2 command buffers in the frame
      • One for all work done before writing into the swapchain (pre-acquire) ... [gpuA] below
      • A second for the pass which writes into the swapchain image (post-acquire) ... [A] below
  • The post-acquire command buffer waits on the acquire semaphore
  • The post-acquire command buffer signals the semaphore which present waits on
                          |<- Command buffer [gpuA] is free to start after [B] (no semaphore)
                          |
                          |    >||<- Command buffer [A] writes into swap (waits on acquire semaphore)
                          |     ||
                   [__scanoutA__][__scanoutB__][__scanoutA__][__scanoutB__]
        [gpuA][A][gpuB][B][gpuA] [A][gpuB]     [B][gpuA]     [A][gpuB]
[cpuA_][cpuB_]   [cpuA_]  [cpuB_]   [cpuA_]       [cpuB_]       [cpuA_]
                ||
               >||<- prior A frame finish triggers CPU to start on frame A again

It also shows that with wait-on-submit (your choice) you can get better CPU/GPU utilization and fewer v-sync misses.
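
The per-frame submission layout from the quoted list could be sketched roughly as below. This is a plain-Python description of the plan, not Vulkan code: `Submit`, `plan_frame`, and all the string names are hypothetical, and the fence is attached to the last submit on the assumption that this is what "trigger after last command buffer of that frame is done" amounts to.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Submit:
    """One queue submission, described by name only (hypothetical model)."""
    command_buffer: str
    wait_semaphores: List[str] = field(default_factory=list)
    signal_semaphores: List[str] = field(default_factory=list)
    fence: Optional[str] = None


def plan_frame(slot: int) -> List[Submit]:
    """Build the two submissions for one frame slot, in submission order."""
    # Pre-acquire work: no semaphore wait, so it can start on the GPU
    # as soon as the previous frame's work is out of the way.
    pre = Submit(command_buffer=f"pre_acquire[{slot}]")
    # Post-acquire pass: the only part that writes into the swapchain image.
    post = Submit(
        command_buffer=f"post_acquire[{slot}]",
        wait_semaphores=[f"acquire_sem[{slot}]"],    # signaled by the acquire
        signal_semaphores=[f"present_sem[{slot}]"],  # waited on by the present
        fence=f"frame_fence[{slot}]",                # CPU waits here before reusing the slot
    )
    return [pre, post]


for slot in range(2):  # 2-frame-deep buffering
    for submit in plan_frame(slot):
        print(submit)
```

Keeping the pre-acquire command buffer free of semaphore waits is what lets [gpuA] start right after [B] in the diagram, while only the short [A] pass has to wait for the swapchain image.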

Reference
https://web.archive.org/web/20180203060107/https://timothylottes.github.io/20180202.html


krOoze commented on May 29, 2024

Interesting. Though what I think applies at the low level is: think for yourself. I can think of at least two historical occasions when everyone was doing things the wrong way in Vulkan (including, and especially, the vkcube example). There is certainly always a possibility that you simply know better than everyone if you try something new and counterintuitive. The only thing that is necessary is that the code is valid as far as the spec goes. Everything else is an engineering choice or tradeoff. Another thing that may be problematic is that drivers sometimes have awkward and weird WSI behavior. That ideally needs to be weeded out, for the sake of explicitness and for the benefit of all.

