SwiftFrameGraph

Note: The current version of SwiftFrameGraph on master does not have a working Vulkan backend. If you're only targeting Metal, I'd suggest using the master branch, since it contains significant fixes and API improvements. Otherwise, the last working version supporting Vulkan is on the Vulkan branch.

At some point in the hopefully not-too-distant future, the Vulkan backend will be updated with the API changes and fixes from the Metal backend.

What is this?

In short: this is a way to code against a higher-level, Swift-native, reduced-friction version of the Metal rendering API and have your rendering code run in a fairly efficient manner, cross-platform on both Metal and Vulkan.

This is the base rendering system for a game I'm working on with a few other people called Interdimensional Llama. More specifically, it's a platform-agnostic abstraction over a rendering API, combined with backends for Metal and Vulkan. Its design is heavily inspired by Metal; in fact, it started off as a direct overlay over Metal, and gradually diverged. As such, Apple's Metal documentation is probably the best general reference for RenderAPI, since it's what we referred to.

The Metal backend has received the most attention, and should be reasonably optimised and stable. The Vulkan backend works for our use cases but could have a few bugs and inefficiencies.

This project does not handle cross-compilation of shaders. For that, you'll want to use other tools; in our case, we've been writing our shaders once for each of Vulkan and Metal.

This early demo (which is not at all representative of the game or art style) is an example of something that was made using this framework.

What's the motivation?

Recently there's been growing interest in render-graph-based APIs for rendering, which make it simpler to compose multiple render passes and build frames. To my knowledge, this originated in the Frostbite engine, as described in this talk, although many others have contributed to the idea since.

A key contribution of render graphs is that per-frame resource management is done automatically; rather than manually managing buffers and buffer pools, you can just request a new buffer each frame. Originally, our implementation was fairly closely based on the described design; however, we decided to take things one step further.

A major annoyance with APIs such as Vulkan is that you have to be explicit about everything: you need to describe how a buffer will be used, for example, or specify every barrier that needs to be inserted. With Metal, the overhead is a little lower, but as soon as you want to do manual resource tracking, use resource heaps, or use other such advanced features, you run into the same problem.

The main idea behind this framework is that if we execute in a deferred mode, we can infer a lot of things we'd otherwise need to specify: we can record a frame, tracking how each resource is used, and then execute it. Furthermore, we can infer how a resource is used from shader reflection – we can tell whether an image is sampled, written to, or read from in a shader, for example. By deferred, I mean that each render pass effectively runs twice: the first time, the resource tracking code plans out what will be needed for the frame, and the second time the commands are actually submitted to the underlying API (Metal or Vulkan).
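
To make that two-phase execution concrete, here's a minimal, self-contained sketch of the idea. Every type and name below is a hypothetical illustration, not SwiftFrameGraph's actual internals:

enum FramePhase {
    case trackUsages   // first run: plan the frame
    case emitCommands  // second run: issue the real API calls
}

final class SketchEncoder {
    var phase: FramePhase = .trackUsages
    private(set) var usedResources: Set<String> = []

    func draw(using resource: String) {
        switch self.phase {
        case .trackUsages:
            self.usedResources.insert(resource) // record the usage; nothing executes yet
        case .emitCommands:
            print("encoding draw that uses \(resource)") // stand-in for a Metal/Vulkan call
        }
    }
}

func executeFrame(passes: [(SketchEncoder) -> Void]) {
    let encoder = SketchEncoder()
    // Phase 1: run every pass once to discover which resources it touches.
    encoder.phase = .trackUsages
    for pass in passes { pass(encoder) }
    // ...a real backend would now allocate resources, insert barriers,
    // and cull passes whose outputs are never read...
    // Phase 2: run every pass again, this time emitting actual commands.
    encoder.phase = .emitCommands
    for pass in passes { pass(encoder) }
}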

In practice, this deferred model has some really neat effects. For example, our engine uses clustered shading and performs light culling on the CPU in a CPURenderPass. During some experiments, I commented out the lighting code in the shader that reads from the clustered shading buffers. The shader compiler optimised out the use of those resources, and when the FrameGraph executed it could see that the output of the light culling pass was never used, so it prevented that pass from being executed at all.

Another thing a deferred setup means is that per-frame resources can be handled extremely easily. Want a buffer with some data in it?

// Describe and create a per-frame buffer, filling it with the vertex data.
let vertexBufferDescriptor = BufferDescriptor(length: renderData.vertexBuffer.count * MemoryLayout<ImDrawVert>.size)
let vertexBuffer = Buffer(descriptor: vertexBufferDescriptor, bytes: renderData.vertexBuffer.baseAddress!)
// Bind it as with any other buffer; the FrameGraph handles allocation and disposal.
renderEncoder.setVertexBuffer(vertexBuffer, offset: 0, index: 0)

That will automatically get allocated with the correct flags and disposed without any GPU overhead (if you're curious how, take a look at the Metal and Vulkan backends, and in particular ResourceRegistry.swift). Dependency ordering is determined by the order that render passes are added to the FrameGraph.
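
As a quick sketch of that ordering rule, using the FrameGraph.addPass call shown later in this README with two hypothetical passes:

// shadowPass renders a shadow map that scenePass then samples. Because
// shadowPass is added first, the FrameGraph schedules it before scenePass,
// and the backend takes care of the required synchronisation.
FrameGraph.addPass(shadowPass)
FrameGraph.addPass(scenePass)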

This per-frame resource model is significantly simpler than the two-stage setup and resource handles used in e.g. Frostbite's design: you treat resources as simple objects, creating them when you need them and letting them be deallocated once you're done.

Persistent resources are a little trickier. To create one, pass .persistent as a flag in the Buffer or Texture constructor, and make sure that the usageHint in the descriptor matches how the resource is going to be used. If you create or dispose of persistent resources frequently the code will still work, but you'll lose out on a lot of the benefits and incur a large overhead.
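
For example, a sketch of creating a persistent buffer might look like the following. The exact flags parameter spelling and the .shaderRead usage option are assumptions based on the description above, so check the Buffer and BufferDescriptor definitions for the real API:

// A sketch of a persistent uniform buffer; `flags: .persistent` and
// `usageHint = .shaderRead` are assumptions, not verified API.
var uniformsDescriptor = BufferDescriptor(length: MemoryLayout<ViewRenderUniforms>.size)
uniformsDescriptor.usageHint = .shaderRead // must match how the buffer is actually used
let viewUniformsBuffer = Buffer(descriptor: uniformsDescriptor, flags: .persistent)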

Where can I see how it works?

A good place to start would be the main render pass execution in CommandRecorder.swift.

With regard to the backends, take a look at the Vulkan and Metal backend implementations.

How practical is this to use in my personal projects?

Honestly? Probably not very, although it'll work if you're determined and willing to dig around the code base. Almost all of the documentation is contained within my and the other authors' heads, and there are a few edge cases and pieces of hidden functionality. While we'd like to properly document this project, it's a fairly low priority for us; we understand it, and as its primary users that's enough for making our game. Long-term, however, we'd definitely like this code base to live up to the standards of a high-quality open source project.

We do plan to release a more full-featured example project, including a Dear ImGui render pass and a debug drawing tool. The problem is that that code is fairly tightly intertwined with our engine, and we don't want to open-source the full engine; as such, it might be a while before we have time to release it.

With all that said, this is open source so others can use it! We'd welcome input to make it better or easier to use.

What does a Render Pass look like?

Here's an example of a draw render pass from our engine to draw debug shapes (wireframe outlines, points, and lines). It hasn't been cleaned up at all, so forgive the slight messiness.

final class DebugDrawPass : DrawRenderPass {

    static let pointLineVertexDescriptor : VertexDescriptor = {
        var descriptor = VertexDescriptor()

        // position
        descriptor.attributes[0].bufferIndex = 0
        descriptor.attributes[0].offset = 0
        descriptor.attributes[0].format = .float3

        // color
        descriptor.attributes[1].bufferIndex = 0
        descriptor.attributes[1].offset = 3 * MemoryLayout<Float>.size
        descriptor.attributes[1].format = .float4

        // point size
        descriptor.attributes[2].bufferIndex = 0
        descriptor.attributes[2].offset = 7 * MemoryLayout<Float>.size
        descriptor.attributes[2].format = .float


        descriptor.layouts[0].stepFunction = .perVertex
        descriptor.layouts[0].stepRate = 1
        descriptor.layouts[0].stride = MemoryLayout<DebugDraw.DebugDrawVertex>.size

        return descriptor
    }()

    static let depthStencilNoDepth : DepthStencilDescriptor = {
        var depthStencilDescriptor = DepthStencilDescriptor()
        depthStencilDescriptor.depthCompareFunction = .always
        depthStencilDescriptor.isDepthWriteEnabled = false
        return depthStencilDescriptor
    }()

    static let depthStencilWithDepth : DepthStencilDescriptor = {
        var depthStencilDescriptor = DepthStencilDescriptor()
        depthStencilDescriptor.depthCompareFunction = .greater
        depthStencilDescriptor.isDepthWriteEnabled = true
        return depthStencilDescriptor
    }()

    static let pipelineDescriptor : RenderPipelineDescriptor = {
        var descriptor = RenderPipelineDescriptor(identifier: ScreenRenderTargetIndex.self)

        var blendDescriptor = BlendDescriptor()

        blendDescriptor.alphaBlendOperation = .add
        blendDescriptor.rgbBlendOperation = .add
        blendDescriptor.sourceRGBBlendFactor = .sourceAlpha
        blendDescriptor.sourceAlphaBlendFactor = .sourceAlpha
        blendDescriptor.destinationRGBBlendFactor = .oneMinusSourceAlpha
        blendDescriptor.destinationAlphaBlendFactor = .oneMinusSourceAlpha

        descriptor[blendStateFor: ScreenRenderTargetIndex.display] = blendDescriptor

        descriptor.vertexDescriptor = DebugDrawPass.pointLineVertexDescriptor

        descriptor.vertexFunction = "debugDrawVertexLinePoint"
        descriptor.fragmentFunction = "debugDrawFragmentLinePoint"

        return descriptor
    }()


    let renderTargetDescriptor: RenderTargetDescriptor

    var name: String = "Debug Draw"

    let renderData : DebugDraw.RenderData
    let viewUniforms : ViewRenderUniforms

    let outputTexture: Texture

    init(renderData: DebugDraw.RenderData, outputTexture: Texture, viewUniforms: ViewRenderUniforms) {
        self.renderData = renderData
        self.viewUniforms = viewUniforms

        var renderTargetDesc = RenderTargetDescriptor(identifierType: ScreenRenderTargetIndex.self)
        renderTargetDesc[ScreenRenderTargetIndex.display] = RenderTargetColorAttachmentDescriptor(texture: outputTexture)
        renderTargetDesc[ScreenRenderTargetIndex.display]!.clearColor = ClearColor(red: 0.2, green: 0.6, blue: 0.9, alpha: 1.0)
        self.renderTargetDescriptor = renderTargetDesc
    }

    func execute(renderCommandEncoder: RenderCommandEncoder) {

        if self.renderData.vertexBuffer.isEmpty {
            return //early out
        }

        let vertexBuffer = Buffer(descriptor: BufferDescriptor(length: MemoryLayout<DebugDraw.DebugDrawVertex>.size * self.renderData.vertexBuffer.count), bytes: self.renderData.vertexBuffer.buffer)
        let indexBuffer = Buffer(descriptor: BufferDescriptor(length: MemoryLayout<UInt16>.size * self.renderData.indexBuffer.count), bytes: self.renderData.indexBuffer.buffer)

        renderCommandEncoder.setRenderPipelineState(DebugDrawPass.pipelineDescriptor)
        renderCommandEncoder.setCullMode(.none)
        renderCommandEncoder.setVertexBuffer(vertexBuffer, offset: 0, index: 0)
        renderCommandEncoder.setValue(self.viewUniforms, key: "drawVertexUniforms")

        var vertexBufferOffset = 0
        var indexBufferOffset = 0

        for drawCommand in self.renderData.commands {
            renderCommandEncoder.setVertexBufferOffset(vertexBufferOffset, index: 0)

            var primitiveType : PrimitiveType? = nil

            switch drawCommand.type {
                case .point:
                    primitiveType = .point
                case .line:
                    primitiveType = .line
                case .triangle:
                    primitiveType = .triangle
            }

            if let primitiveType = primitiveType {
                renderCommandEncoder.setDepthStencilState(drawCommand.depthEnabled ? DebugDrawPass.depthStencilWithDepth : DebugDrawPass.depthStencilNoDepth)

                if primitiveType == .point {
                    renderCommandEncoder.drawPrimitives(type: primitiveType, vertexStart: 0, vertexCount: drawCommand.vertexCount)
                } else {
                    renderCommandEncoder.drawIndexedPrimitives(type: primitiveType, indexCount: drawCommand.indexCount!, indexType: .uint16, indexBuffer: indexBuffer, indexBufferOffset: indexBufferOffset)
                    indexBufferOffset += drawCommand.indexCount! * MemoryLayout<UInt16>.size
                }

                vertexBufferOffset += drawCommand.vertexCount * MemoryLayout<DebugDraw.DebugDrawVertex>.size
            }
        }
    }
}

Every frame, you'd call something like this:

let debugDrawPass = DebugDrawPass(renderData: debugDrawData, outputTexture: texture, viewUniforms: viewUniforms)
FrameGraph.addPass(debugDrawPass)

and then execute the FrameGraph using a call like:

FrameGraph.execute(backend: self.backend)

where self.backend is an instance of either the Metal or Vulkan backends.

To get something to display on screen, you'd pass in a Texture from a window (take a look at the Windowing subdirectory) as the outputTexture. Otherwise, you could create or pass in a texture such as:

let textureDescriptor = TextureDescriptor.texture2DDescriptor(pixelFormat: .rgba16Float, width: Int(self.drawableSize.width), height: Int(self.drawableSize.height), mipmapped: false)
let emissiveTexture = Texture(descriptor: textureDescriptor)
let debugDrawPass = DebugDrawPass(renderData: debugDrawData, outputTexture: emissiveTexture, viewUniforms: viewUniforms)

How do I build it?

macOS

Using a recent Swift toolchain, run ./build_metal_macos.sh from the cloned directory to generate an Xcode project, and then build the frameworks from within Xcode.

Linux

Untested, but using a recent Swift toolchain and running swift build in the cloned directory should be enough.

Windows

Swift on Windows is a very early work in progress, and while I can promise you this works, I can't really tell you how to build it. I will say that we cross-compile our Windows binaries from Ubuntu, since Swift Package Manager is a long way from supporting Windows.

I've also included our Foundation overlays for Windows. If they're of interest to you, make use of them however you please.

Why Swift?

I think Swift's a great language, and it's what we wanted to use when making our engine. In particular, the use case this was built for doesn't require much interoperability with existing C++ code, which enabled us to use a more modern and ergonomic language.

What about multithreading?

Interesting question. In theory, there's nothing that prevents render passes from being recorded in parallel, although the current implementation of the resource usage tracking isn't thread-safe. As for executing the render passes in the backend: the current paradigm is to iterate through a list of commands that need to be executed, materialising resources late and disposing of them early. If render passes were executed out of order, we'd need to make sure that each pass's resources are available when it needs them while keeping the advantage of limited resource lifetimes. That isn't at all impossible, but it's a significant engineering effort and isn't a high priority for us.

Why are you doing something in a particular way?

It's probably the first thing we thought of, although in some cases there's a very precise (and probably poorly documented) reason why something is the way it is. We're by no means experts; the very first graphics code I wrote for Windows was getting ImGui to show using this API and Vulkan!

What about (some other question here)?

Feel free to post an issue. If you're genuinely curious about getting something working with this, we'd be more than happy to help.

License

See the MIT license in LICENSE. Other licenses may apply to included libraries.
