compushady's Introduction

compushady

Python module for easily running Compute Shaders

compushadyMandlebrot

Join the Discord server for support: https://discord.gg/2WvdpkYXHW

d3d12 (Windows), vulkan (Linux, Mac and Windows), and metal (Mac) are supported.

You can write shaders in HLSL and they will be compiled into the appropriate format (DXIL, DXBC, SPIR-V, MSL ...) automatically (using the DXC compiler included in the module as well as SPIRV-Cross).

The Mandelbrot animated GIF shown above was generated by compushady itself: https://github.com/rdeioris/compushady/blob/main/examples/mandlebrot_gif.py

If you are looking for something cool, try this naive pong implementation: https://github.com/rdeioris/compushady/blob/main/examples/pong.py

Python 3.6 is the minimum supported version, and you need a system with at least one GPU.

If you want to run compushady on a Raspberry Pi 4, see: https://github.com/rdeioris/compushady/blob/main/README.md#raspberrypi-4-vulkan-support

Consider sponsoring the project by becoming a patron: https://www.patreon.com/rdeioris

Quickstart

pip install compushady

Note for Linux: if you are building from source, make sure the Vulkan and X11 headers are installed (libvulkan-dev and libx11-dev on a Debian-based distribution).

Note for Mac: the Vulkan headers are searched for in /usr/local or through the VULKAN_SDK environment variable; if they are not found, only the metal backend is built.

Note for Windows: the Vulkan headers are searched for through the VULKAN_SDK environment variable, which the LunarG installer sets automatically. If no Vulkan SDK is found, only the d3d12 backend is built.

Enumerate compute devices

import compushady

for device in compushady.get_discovered_devices():
    name = device.name
    video_memory_in_mb = device.dedicated_video_memory // 1024 // 1024
    print('Name: {0} Dedicated Memory: {1} MB'.format(name, video_memory_in_mb))

Upload data to a Texture

A Texture is an object that lives in GPU memory. To upload data into it you need a so-called 'staging buffer': a block of memory allocated in system RAM that the GPU can map. The GPU then copies data from that buffer into the texture memory. To wrap up: to upload data to the GPU, you first map a memory area in system RAM (a buffer) and then ask the GPU to copy it into its own memory (this assumes a discrete GPU with on-board memory).

from compushady import Buffer, Texture2D, HEAP_UPLOAD
from compushady.formats import R8G8B8A8_UINT

# creates an 8x8 texture in GPU memory with the classic RGBA 8 bit format
texture = Texture2D(8, 8, R8G8B8A8_UINT)  
# creates a staging buffer with the right size and in memory optimized for uploading data
staging_buffer = Buffer(texture.size, HEAP_UPLOAD)
# upload a bunch of pixels data into the staging_buffer
staging_buffer.upload(b'\xff\x00\x00\xff') # first pixel as red
# copy from the staging_buffer to the texture
staging_buffer.copy_to(texture)

Reading back from GPU memory to system memory

Now that your data is in GPU memory, you can manipulate it with a compute shader. Before seeing that, we need to learn how to copy data back from the texture memory to system RAM. We need a buffer again (this time a readback one):

from compushady import HEAP_READBACK, Buffer, Texture2D, HEAP_UPLOAD
from compushady.formats import R8G8B8A8_UINT

# creates an 8x8 texture in GPU memory with the classic RGBA 8 bit format
texture = Texture2D(8, 8, R8G8B8A8_UINT)  
# creates a staging buffer with the right size and in memory optimized for uploading data
staging_buffer = Buffer(texture.size, HEAP_UPLOAD)
# upload a bunch of pixels data into the staging_buffer
staging_buffer.upload(b'\xff\x00\x00\xff') # first pixel as red
# copy from the staging_buffer to the texture
staging_buffer.copy_to(texture)

# do something with the texture...

# prepare the readback buffer
readback_buffer = Buffer(texture.size, HEAP_READBACK)
# copy from texture to the readback buffer
texture.copy_to(readback_buffer)

# get the data as a python bytes object (just the first 4 bytes)
print(readback_buffer.readback(4))

Your first compute shader

We are going to run code on the GPU! We will start with simple logic: we will just swap the red channel with the green one. To do this we need to write an HLSL shader that takes our texture as an input/output object:

from compushady import HEAP_READBACK, Buffer, Texture2D, HEAP_UPLOAD, Compute
from compushady.formats import R8G8B8A8_UINT
from compushady.shaders import hlsl

# creates an 8x8 texture in GPU memory with the classic RGBA 8 bit format
texture = Texture2D(8, 8, R8G8B8A8_UINT)
# creates a staging buffer with the right size and in memory optimized for uploading data
staging_buffer = Buffer(texture.size, HEAP_UPLOAD)
# upload a bunch of pixels data into the staging_buffer
staging_buffer.upload(b'\xff\x00\x00\xff')  # first pixel as red
# copy from the staging_buffer to the texture
staging_buffer.copy_to(texture)

# do something with the texture...

shader = """
RWTexture2D<uint4> texture : register(u0);
[numthreads(2, 2, 1)]
void main(int3 tid : SV_DispatchThreadID)
{
    uint4 color = texture[tid.xy];
    uint red = color.r;
    color.r = color.g;
    color.g = red;
    texture[tid.xy] = color;
}
"""
compute = Compute(hlsl.compile(shader), uav=[texture])
compute.dispatch(texture.width // 2, texture.height // 2, 1)

# prepare the readback buffer
readback_buffer = Buffer(texture.size, HEAP_READBACK)
# copy from texture to the readback buffer
texture.copy_to(readback_buffer)

# get the data as a python bytes object (just the first 8 bytes)
print(readback_buffer.readback(8))
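
To check the result, the same swap can be computed on the CPU and compared with the bytes returned by readback(). The sketch below is plain Python and does not touch the GPU; the helper name swap_red_green is made up for this example and is not part of compushady.

```python
def swap_red_green(pixels: bytes) -> bytes:
    """CPU reference for the shader above: swap R and G in RGBA8 pixel data."""
    if len(pixels) % 4 != 0:
        raise ValueError("expected a whole number of 4-byte RGBA pixels")
    out = bytearray(pixels)
    # every pixel is 4 bytes in R, G, B, A order
    out[0::4] = pixels[1::4]  # new red   = old green
    out[1::4] = pixels[0::4]  # new green = old red
    return bytes(out)


# the red pixel uploaded above should come back as a green one
expected_first_pixel = swap_red_green(b'\xff\x00\x00\xff')
print(expected_first_pixel)
```

Comparing expected_first_pixel with readback_buffer.readback(4) is a quick way to confirm that the shader actually ran.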

API

This section covers the compushady API in detail (class by class)

compushady.Device

This class represents a compute device (a GPU generally) on your system.

There can be multiple devices on a system. By default compushady always chooses the one with the most dedicated memory, but you are free to specify a device whenever you create a resource.

As already seen, you can get the list of devices using compushady.get_discovered_devices() or retrieve the current 'best' one with compushady.get_best_device().

A compushady.Device object has the following fields:

  • name: a string with the device description
  • dedicated_video_memory: the amount (in bytes) of on-board (GPU) memory
  • dedicated_system_memory: the amount of system memory the OS has dedicated to the GPU (generally meaningful only on Windows)
  • shared_system_memory: the amount of system memory usable by the device (GPU)
  • vendor_id: an integer representing the vendor id code
  • device_id: an integer representing the device id code
  • is_hardware: True if it is a hardware device (not an emulated one)
  • is_discrete: True if it is a discrete adapter (a dedicated GPU)

The compushady.get_current_device() function returns the currently selected GPU device. You can override it with compushady.set_current_device(index), where 'index' is the index of one of the elements returned by compushady.get_discovered_devices(). You can also change the current device from the command line using the COMPUSHADY_DEVICE environment variable:

COMPUSHADY_DEVICE=2 python3 your_gpu_app.py

compushady.Buffer

This class represents a resource accessible by the GPU that can be in system RAM or GPU dedicated memory. Buffers are generic blobs of data that you can use as a plain storage for your compute shaders or staging/readback buffers when dealing with textures.

When you create a Buffer you need to specify its size (in bytes) and, optionally, the type of memory it should use: HEAP_DEFAULT (GPU memory), HEAP_UPLOAD (system memory optimized for writing) or HEAP_READBACK (system memory optimized for reading).

import compushady

buffer_in_gpu = compushady.Buffer(64)
buffer_in_gpu2 = compushady.Buffer(128, compushady.HEAP_DEFAULT)
staging_buffer = compushady.Buffer(64, compushady.HEAP_UPLOAD)
readback_buffer = compushady.Buffer(256, compushady.HEAP_READBACK)

Buffers created with HEAP_UPLOAD expose the upload(data, offset=0) and upload2d(data, row_pitch, height, bytes_per_pixel) methods.

Buffers created with HEAP_READBACK expose the readback(size=0, offset=0), readback2d(row_pitch, height, bytes_per_pixel) and readback_to_buffer(buffer, offset=0) methods.

Buffers expose the size property returning the size in bytes.

Buffers can also be structured or formatted:

This is an HLSL shader using a StructuredBuffer object

struct Data
{
    uint field0;
    float field1;
};
RWStructuredBuffer<Data> buffer: register(u0);
[numthreads(1, 1, 1)]
void main()
{
    buffer[0].field0 = 1;
    buffer[0].field1 = 2.0;
    buffer[1].field0 = 4;
    buffer[1].field1 = 4.0;
}

You can create a 'structured' buffer by passing the stride option with the size of the structure:

# will create a buffer of 16 bytes, divided in two structures of 8 bytes each (the struct Data)
structured_buffer = Buffer(size=16, stride=8)

Or you can use a typed buffer:

RWBuffer<float4> buffer: register(u0);
[numthreads(1, 1, 1)]
void main()
{
    buffer[0] = float4(1, 2, 3, 4);
    buffer[1] = float4(5, 6, 7, 8);
}

For which you can specify the format parameter:

# will create a buffer of 32 bytes, divided in two 16-byte blocks, each one holding four 32-bit float values
typed_buffer = Buffer(size=32, format=compushady.formats.R32G32B32A32_FLOAT)
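
Once the contents of a structured or typed buffer have been read back as a bytes object, they can be decoded with Python's standard struct module. The sketch below is pure Python and assumes little-endian data with no padding between fields (consistent with the 8-byte stride used above); the helper names are made up for this example.

```python
import struct

# struct Data { uint field0; float field1; } -> 4 + 4 = 8 bytes per element
DATA = struct.Struct('<If')


def decode_structured(raw: bytes):
    """Split raw bytes into (field0, field1) tuples, one per 8-byte stride."""
    return list(DATA.iter_unpack(raw))


def decode_float4(raw: bytes):
    """Split raw bytes into 4-float tuples (one R32G32B32A32_FLOAT element each)."""
    return list(struct.iter_unpack('<4f', raw))


# bytes laid out the way the structured shader above writes them
raw = DATA.pack(1, 2.0) + DATA.pack(4, 4.0)
print(decode_structured(raw))
```

The length of the bytes object must be a multiple of the element size (8 or 16 bytes here), otherwise iter_unpack raises struct.error.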

compushady.Texture2D

A Texture2D object is a two-dimensional (width and height) texture available in GPU memory. You can read it from your compute shader or blit it to a Swapchain. To create a Texture2D object you need to specify its width, height and pixel format.

from compushady import Texture2D
from compushady.formats import R8G8B8A8_UINT
texture = Texture2D(1024, 1024, R8G8B8A8_UINT)

Texture memory always resides on the GPU, so whenever you need to access (read or write) the pixel/texel data of a texture from your main program, you need a staging or a readback buffer.

compushady.Texture1D and compushady.Texture3D

They are exactly like Texture2D, but one-dimensional (height=1) for Texture1D and three-dimensional for Texture3D (you can see it as a group of slices, each one containing a two-dimensional texture). For Texture3D you need to specify an additional parameter (the depth) representing the number of 'slices':

from compushady import Texture3D
from compushady.formats import R8G8B8A8_UINT
texture = Texture3D(1024, 1024, 4, R8G8B8A8_UINT) # you can see it as 4 1024x1024 textures

All texture types expose the following properties:

  • width
  • height
  • depth
  • row_pitch (bytes for each line)
  • size (dimension of the texture in bytes)
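
Note that row_pitch can be larger than width multiplied by the bytes per pixel, so size is not always width x height x bytes per pixel. The sketch below assumes that each row is padded to a multiple of 256 bytes (the D3D12 texture data pitch alignment) and that the last row is left unpadded; other backends may use different rules. Under that assumption it reproduces the sizes 8, 272 and 1856 reported for R16G16B16A16_FLOAT textures in the "Working with float4s" issue further down. The helper names are made up for this example.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def padded_texture_size(width: int, height: int, bytes_per_pixel: int,
                        pitch_alignment: int = 256) -> int:
    """Bytes needed for a 2D texture copy when rows are padded.

    Assumption: every row except the last one is padded to pitch_alignment.
    """
    row_pitch = align_up(width * bytes_per_pixel, pitch_alignment)
    return row_pitch * (height - 1) + width * bytes_per_pixel


def strip_row_padding(raw: bytes, width: int, height: int,
                      bytes_per_pixel: int, row_pitch: int) -> bytes:
    """Drop the padding at the end of each row, returning tightly packed pixels."""
    row_bytes = width * bytes_per_pixel
    return b''.join(raw[y * row_pitch: y * row_pitch + row_bytes]
                    for y in range(height))


# R16G16B16A16_FLOAT uses 8 bytes per pixel (4 channels x 16 bits)
for side in (1, 2, 8):
    print(side, padded_texture_size(side, side, 8))
```

When decoding a readback into pixels, prefer the texture's own row_pitch property over a hard-coded alignment.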

compushady.Compute

To build a Compute object (the one running the compute shader), you need a shader blob (you can build it using the compushady.shaders.hlsl.compile function) and the resources (buffers and textures) you want to manage in the shader itself.

Note: while DirectX based backends need a DXIL/DXBC shader blob, on Vulkan any shader compiler able to generate SPIR-V blobs will work with compushady (you can even precompile your shaders and store the SPIR-V blobs in files that you load in compushady).

compushady uses the DirectX12 naming conventions: CBV (Constant Buffer View) for constant buffers (generally small amounts of data that do not change during the compute shader execution), SRV (Shader Resource View) for buffers and textures you need to read in the shader, and UAV (Unordered Access View) for buffers and textures that need to be written by the shader.

This is a quick Compute object implementing a copy from a texture (the SRV, filled with random data uploaded in a buffer) to another one (the UAV) doubling pixel values:

from compushady import HEAP_UPLOAD, Buffer, Texture2D, Compute
from compushady.formats import R8G8B8A8_UINT
from compushady.shaders import hlsl
import random

source_texture = Texture2D(512, 512, R8G8B8A8_UINT)
destination_texture = Texture2D(512, 512, R8G8B8A8_UINT)

staging_buffer = Buffer(source_texture.size, HEAP_UPLOAD)
staging_buffer.upload(bytes([random.randint(0, 255) for i in range(source_texture.size)]))

staging_buffer.copy_to(source_texture)

shader = """
Texture2D<uint4> source : register(t0);
RWTexture2D<uint4> destination : register(u0);

[numthreads(8,8,1)]
void main(uint3 tid : SV_DispatchThreadID)
{
    destination[tid.xy] = source[tid.xy] * 2;
}
"""

compute = Compute(hlsl.compile(shader), srv=[source_texture], uav=[destination_texture])
compute.dispatch(source_texture.width // 8, source_texture.height // 8, 1)

What is that 8 value?

A compute shader can be executed in parallel, with a level of parallelism (read: how many threads will run that code in a group) specified by the numthreads attribute in the shader. It is a three-dimensional value, so the number of threads in a group is x * y * z. In our case 64 threads can potentially run in parallel.

As we want to process 512 * 512 elements (the pixels in the texture), we want to run the compute shader only as many times as required to fill the whole texture. The number of groups to execute is given by the arguments of the dispatch method, again as a three-dimensional value.

If you set the arguments of dispatch to (1,1,1) you will only copy the top-left 8x8 block of the texture.
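
The arithmetic behind the dispatch() arguments can be sketched in plain Python. The helper group_count is made up for this example; it uses ceiling division so that sizes that are not a multiple of the group size are still fully covered (in that case some threads fall outside the texture, so the shader may need a bounds check).

```python
def group_count(size: int, threads_per_group: int) -> int:
    """Number of thread groups needed to cover `size` elements (ceiling division)."""
    return (size + threads_per_group - 1) // threads_per_group


numthreads = (8, 8, 1)        # must match [numthreads(8,8,1)] in the shader
texture_size = (512, 512, 1)  # width, height and depth of the texture

groups = tuple(group_count(s, n) for s, n in zip(texture_size, numthreads))
threads_per_group = numthreads[0] * numthreads[1] * numthreads[2]
print(groups, threads_per_group)
```

For the 512x512 texture above this gives 64 x 64 x 1 groups of 64 threads each, which matches the values passed to dispatch().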

Cool, but how can I check that everything worked?

There are many ways, but let's use a popular one: Pillow (append this code after the dispatch() call)

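from PIL import Image
from compushady import HEAP_READBACK

# the readback buffer must be able to hold the whole destination texture
readback_buffer = Buffer(destination_texture.size, HEAP_READBACK)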
destination_texture.copy_to(readback_buffer)

image = Image.frombuffer('RGBA', (destination_texture.width,
                         destination_texture.height), readback_buffer.readback())
image.show()

If everything goes well a window should open with a 512x512 image with random pixel colors.

Try experimenting with different dispatch() arguments to see how the behaviour changes.

compushady.Heap

By default, resources (Buffers, Textures) automatically allocate memory based on the heap type. If you want more control over memory allocations, you can allocate memory blocks (heaps) independently and then map resources to them (or to part of them):

from compushady import HEAP_UPLOAD, Buffer, Heap

heap_upload = Heap(HEAP_UPLOAD, 256 * 1024) # allocates 256k of memory
buffer0 = Buffer(
    size=256, heap_type=HEAP_UPLOAD, heap=heap_upload, heap_offset=64 * 1024
) # maps a 256-byte buffer on heap_upload at offset 64k
buffer1 = Buffer(
    size=256, heap_type=HEAP_UPLOAD, heap=heap_upload, heap_offset=128 * 1024
) # maps a 256-byte buffer on heap_upload at offset 128k
buffer_all = Buffer(
    size=256 * 1024, heap_type=HEAP_UPLOAD, heap=heap_upload, heap_offset=0
) # maps the whole heap in a buffer

Notes:

  • the heap type specified for the resource and the type of the heap must match
  • the size of the requested resource is always checked against the size of the heap
  • Textures only support HEAP_DEFAULT

compushady.Sampler

Samplers are used for retrieving pixels from textures using various forms of filtering and addressing.

In addition to CBV, SRV and UAV, a "samplers" parameter is available when creating a Compute object.

from compushady import Sampler, SAMPLER_ADDRESS_MODE_CLAMP, SAMPLER_FILTER_POINT

sampler = Sampler(
    address_mode_u=SAMPLER_ADDRESS_MODE_CLAMP,
    address_mode_v=SAMPLER_ADDRESS_MODE_CLAMP,
    address_mode_w=SAMPLER_ADDRESS_MODE_CLAMP,
    filter_min=SAMPLER_FILTER_POINT,
    filter_mag=SAMPLER_FILTER_POINT,
)

You can sample textures from your shader like in a common graphics shader (with the difference that you need to specify the MIP level manually)

SamplerState sampler0;
Texture2D<float4> source;
RWTexture2D<float4> target;

[numthreads(1,1,1)]
void main(int3 tid : SV_DispatchThreadID)
{
    target[tid.xy] = source.SampleLevel(sampler0, float2(float(tid.x), float(tid.y)), 0);
}

Supported addressing modes:

  • SAMPLER_ADDRESS_MODE_WRAP
  • SAMPLER_ADDRESS_MODE_MIRROR
  • SAMPLER_ADDRESS_MODE_CLAMP

Supported filters:

  • SAMPLER_FILTER_POINT
  • SAMPLER_FILTER_LINEAR

compushady.Swapchain

Although you will most likely run compushady in a headless environment, the module exposes a Swapchain object for blitting your textures to a window. To create a swapchain you need to specify a window handle to attach to (this is operating system dependent), a format (generally B8G8R8A8_UNORM) and the number of buffers (generally 3). To 'blit' a texture to the swapchain, call the compushady.Swapchain.present(texture, x=0, y=0) method (x and y specify where to blit the texture). The Swapchain always waits for VSYNC.

This is an example of doing it using the glfw module (with a swapchain with 3 buffers):

import glfw
from compushady import HEAP_UPLOAD, Buffer, Swapchain, Texture2D
from compushady.formats import B8G8R8A8_UNORM
import platform
import random

glfw.init()
# we do not want implicit OpenGL!
glfw.window_hint(glfw.CLIENT_API, glfw.NO_API)

target = Texture2D(256, 256, B8G8R8A8_UNORM)
random_buffer = Buffer(target.size, HEAP_UPLOAD)

window = glfw.create_window(
    target.width, target.height, 'Random', None, None)

if platform.system() == 'Windows':
    swapchain = Swapchain(glfw.get_win32_window(
        window), B8G8R8A8_UNORM, 3)
else:
    swapchain = Swapchain((glfw.get_x11_display(), glfw.get_x11_window(
        window)), B8G8R8A8_UNORM, 3)

while not glfw.window_should_close(window):
    glfw.poll_events()
    random_buffer.upload(bytes([random.randint(0, 255), random.randint(
        0, 255), random.randint(0, 255), 255]) * (target.size // 4))
    random_buffer.copy_to(target)
    swapchain.present(target)

swapchain = None  # this ensures the swapchain is destroyed before the window

glfw.terminate()

RaspberryPi 4 Vulkan support

Recent Mesa distributions include a Vulkan driver for the Raspberry Pi 4 (V3D).

compushady supports it (including the DXC compiler) in both 32 and 64 bit mode.

Note that float16 support is not available (so formats like R16G16B16A16_FLOAT are not going to work).

When you run pip install compushady, the module is built from source (this takes several minutes), so make sure a C/C++ compiler and the python3 headers are installed.

WIP

  • Texture mips support
  • Support for inline raytracing
  • Support for mesh shaders
  • Add naga shaders compiler for GLSL and WGSL (https://crates.io/crates/naga)

Multithreading

You can create objects in threads and run them concurrently (the backends release the GIL while the GPU tasks are running).

Backends

There are currently 3 backends for GPU access: vulkan, metal and d3d12 (older compushady versions also had a d3d11 backend, but it has been removed to simplify the code base).

There is one shader backend for HLSL (GLSL support is in progress) based on Microsoft DXC, but (on vulkan) you can use any SPIR-V blob by passing it as the first argument of the compushady.Compute initializer.

Dealing with backends differences

Bindings

In HLSL you assign resources to registers/slots using 3 main concepts (CBV, SRV and UAV) and their related registers (b0, t0, t1, u0, ...).

When converting HLSL to SPIR-V an offset is added to SRVs (1024) and UAVs (2048), so b0 will be mapped to SPIR-V binding 0, b1 to 1, t0 to 1024, t2 to 1026 and u0 to 2048. Remember this when using SPIR-V directly.
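
The offsets above can be written down as a small lookup. This is only an illustration of the rule as described here: the helper spirv_binding is made up for this example, is not part of compushady, and handles only the b, t and u register types.

```python
# offsets added when HLSL registers are converted to SPIR-V bindings
SPIRV_OFFSETS = {'b': 0, 't': 1024, 'u': 2048}  # CBV, SRV, UAV


def spirv_binding(register: str) -> int:
    """Map an HLSL register name such as 't2' to its SPIR-V binding number."""
    kind, index = register[0], int(register[1:])
    if kind not in SPIRV_OFFSETS:
        raise ValueError('unsupported register type: {0}'.format(kind))
    return SPIRV_OFFSETS[kind] + index


for register in ('b0', 'b1', 't0', 't2', 'u0'):
    print(register, spirv_binding(register))
```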

When converting HLSL to MSL, the code is first translated to SPIR-V and then remapped to MSL. The mapping differs again, to fit Metal's approach to resource binding. From compushady's point of view, Metal has only two kinds of resources: buffers and textures. During the conversion the SPIR-V bindings are discarded and a sequential model is applied instead: starting from the lowest binding id, each resource gets the next free index for its type (a Uniform resource is a buffer, a UniformConstant resource is a texture). Example: Uniform 2048 becomes buffer(1), UniformConstant 2049 becomes texture(0), and Uniform 1025 becomes buffer(0). Remember that the numerical order is always relevant!
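
The sequential model can be sketched as follows. This is an illustration of the rule as described above, not the code compushady actually runs; the function name metal_indices is made up, and the storage class for textures is written with its SPIR-V spelling, UniformConstant.

```python
def metal_indices(resources):
    """Assign Metal slots to SPIR-V resources.

    resources: iterable of (spirv_binding, storage_class) pairs.
    Returns a dict mapping each binding to 'buffer(n)' or 'texture(n)'.
    """
    kinds = {'Uniform': 'buffer', 'UniformConstant': 'texture'}
    counters = {'buffer': 0, 'texture': 0}
    mapping = {}
    # walk the bindings from the lowest id upwards, one counter per kind
    for binding, storage_class in sorted(resources):
        kind = kinds[storage_class]
        mapping[binding] = '{0}({1})'.format(kind, counters[kind])
        counters[kind] += 1
    return mapping


example = [(2048, 'Uniform'), (2049, 'UniformConstant'), (1025, 'Uniform')]
print(metal_indices(example))
```

Because the bindings are processed in numerical order, adding a resource with a lower binding id shifts the Metal index of every later resource of the same kind.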

Known Issues

  • The Metal backend does not support x and y for the Swapchain (so you will always blit the texture at 0, 0)
  • There are some alignment issues on older Metal GPUs; in general the alignment of structured/formatted buffers should be handled in a more robust way (Apple Silicon GPUs generally work without issues)

compushady's People

Contributors

pomettini, rdeioris, unbit

compushady's Issues

Unable Map() ID3D12Resource1: The GPU device instance has been suspended

Not sure how this error happened, but it started randomly and won't stop. Please help!

Error:

Traceback (most recent call last):
  File "g:\SMS\LIBRARY\Casual\Compushady\Test_001\scripts\dev_001.py", line 22, in <module>
    run(None)
  File "g:\SMS\LIBRARY\Casual\Compushady\Test_001\scripts\dev_001.py", line 12, in run
    readb = texture.read()
  File "g:\SMS\LIBRARY\Casual\Compushady\Test_001\scripts\core.py", line 49, in read
    return readback_buffer.readback(self.texture.size)
  File "C:\Users\Matthew\AppData\Local\Programs\Python\Python310\lib\site-packages\compushady\__init__.py", line 108, in readback
    return self.handle.readback(buffer_or_size, offset)
Exception: Unable Map() ID3D12Resource1: The GPU device instance has been suspended. Use GetDeviceRemovedReason to determine the appropriate action.

My Code:

Error happens in the "readb = texture.read()" line of the testing script below.
That calls a function in my wrapper class in core script below and the error happens here: "readback_buffer.readback(self.texture.size)"

testing script:

import core
import reader


def run(device):
    texture = core.GPUTexture(8, 8)

    print("DWADWAADW")

    texture.compute(reader.read_hlsl_file("shader_001.hlsl"))

    readb = texture.read() # ERROR HAPPENS HERE

    pixels = []

    for i in range(int(texture.size / 4)):
        j = i * 4
        pixels.append((readb[j + 0], readb[j + 1], readb[j + 2], readb[j + 3]))

    print(pixels)

core script

from compushady import Buffer, Texture2D, HEAP_UPLOAD, HEAP_READBACK, Compute
import compushady.formats as csformats
from compushady.shaders import hlsl
import re


class GPUTexture:
    def __init__(self, width=8, height=8) -> None:
        self.device = None
        self.format = csformats.R8G8B8A8_UINT
        self.create_texture(width, height)
        self.create_staging_buffer()
        self.write()

    @property
    def width(self):
        return self.texture.width

    @property
    def height(self):
        return self.texture.height

    @property
    def size(self):
        return self.texture.size

    def create_texture(self, width=8, height=8):
        self.texture = Texture2D(
            width, height, format=self.format, device=self.device)

    def create_staging_buffer(self):
        self.staging_buffer = self.create_buffer()

    def create_buffer(self):
        return Buffer(
            self.texture.size, HEAP_UPLOAD, format=self.format, device=self.device)

    def upload(self, data: bytes):
        return self.staging_buffer.upload(data)

    def write(self, texture=None):
        if texture is None:
            texture = self.texture
        return self.staging_buffer.copy_to(texture)

    def read(self):
        readback_buffer = self.create_buffer()
        self.texture.copy_to(readback_buffer)
        return readback_buffer.readback(self.texture.size)

    def compute(self, shader):
        compiled = hlsl.compile(shader)
        compute = Compute(compiled, uav=[
                          self.texture], device=self.device)
        numgroups = self.get_numthreads(shader)

        nx = 2
        ny = 2

        if numgroups is not None:
            nx = numgroups[0]
            ny = numgroups[1]

        compute.dispatch(self.texture.width // nx,
                         self.texture.height // ny, 1)

    def get_numthreads(self, shader):
        # Search for the numthreads attribute in the shader code
        match = re.search(r'\[numthreads\((\d+), (\d+), (\d+)\)\]', shader)

        # If the numthreads attribute is found, return it as a tuple of integers
        if match:
            return tuple(map(int, match.groups()))

        # If the numthreads attribute is not found, return None
        return None

Cannot install on Windows, `d3d12.cpp` not found

Whilst running pip install compushady
Using a Microsoft Surface Laptop Go with an Intel i5-1035G1, which supports DirectX 12.1

Collecting compushady
  Downloading compushady-0.17.1.tar.gz (67.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 67.1/67.1 MB 3.4 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: compushady
  Building wheel for compushady (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Building wheel for compushady (pyproject.toml) did not run successfully.      
  │ exit code: 1
  ╰─> [28 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build\lib.win-amd64-cpython-311
      creating build\lib.win-amd64-cpython-311\compushady
      copying compushady\config.py -> build\lib.win-amd64-cpython-311\compushady  
      copying compushady\formats.py -> build\lib.win-amd64-cpython-311\compushady 
      copying compushady\__init__.py -> build\lib.win-amd64-cpython-311\compushady
      creating build\lib.win-amd64-cpython-311\compushady\shaders
      copying compushady\shaders\hlsl.py -> build\lib.win-amd64-cpython-311\compushady\shaders
      copying compushady\shaders\msl.py -> build\lib.win-amd64-cpython-311\compushady\shaders
      copying compushady\shaders\__init__.py -> build\lib.win-amd64-cpython-311\compushady\shaders
      creating build\lib.win-amd64-cpython-311\compushady\backends
      copying compushady\backends\dxcompiler.dll -> build\lib.win-amd64-cpython-311\compushady\backends
      copying compushady\backends\dxil.dll -> build\lib.win-amd64-cpython-311\compushady\backends
      running build_ext
      building 'compushady.backends.d3d12' extension
      creating build\temp.win-amd64-cpython-311
      creating build\temp.win-amd64-cpython-311\Release
      creating build\temp.win-amd64-cpython-311\Release\compushady
      creating build\temp.win-amd64-cpython-311\Release\compushady\backends
      "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -IC:\Users\Craig\AppData\Local\Programs\Python\Python311\include -IC:\Users\Craig\AppData\Local\Programs\Python\Python311\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\winrt" "-IC:\Program 
Files (x86)\Windows Kits\10\\include\10.0.19041.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" /EHsc /Tpcompushady/backends/common.cpp /Fobuild\temp.win-amd64-cpython-311\Release\compushady/backends/common.obj
      common.cpp
      "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -IC:\Users\Craig\AppData\Local\Programs\Python\Python311\include -IC:\Users\Craig\AppData\Local\Programs\Python\Python311\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\winrt" "-IC:\Program 
Files (x86)\Windows Kits\10\\include\10.0.19041.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" /EHsc /Tpcompushady/backends/d3d12.cpp /Fobuild\temp.win-amd64-cpython-311\Release\compushady/backends/d3d12.obj
      d3d12.cpp
      c1xx: fatal error C1083: Cannot open source file: 'compushady/backends/d3d12.cpp': No such file or directory
      error: command 'C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Tools\\MSVC\\14.35.32215\\bin\\HostX86\\x64\\cl.exe' failed with exit code 2        
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for compushady
Failed to build compushady
ERROR: Could not build wheels for compushady, which is required to install pyproject.toml-based projects

SampleLevel SamplerState Issue

Hello, I'm currently working on a compute shader that includes a sampling step, and I'm unsure how (or whether) compushady supports a SamplerState for sampling the texture. When I run this code

SamplerState gSampler : register(s0);

// Function to get height from the height map at specified coordinates and level
float getH(float2 uv, int srcLevel)
{
    return heightMap.SampleLevel(gSampler, uv, srcLevel).r;
}

I will get this error in the Python output

Traceback (most recent call last):
  File "C:\Users\miguel.santiago\Desktop\PythonConeStep\compushade_06.py", line 222, in <module>
    compute = Compute(hlsl.compile(shader_source), srv=[source_texture], uav=[destination_texture])
  File "C:\Users\miguel.santiago\Desktop\PythonConeStep\myenv\lib\site-packages\compushady\__init__.py", line 222, in __init__
    uav=[resource.handle for resource in uav])
Exception: Unable to create Compute Pipeline State: The parameter is incorrect.

ID3D12Device::CreateComputePipelineState: Root Signature doesn't match Compute Shader: Shader sampler descriptor range (BaseShaderRegister=0, NumDescriptors=1, RegisterSpace=0) is not fully bound in root signature

[Windows][DX12] mandlebrot example crashes

After some seconds of execution, the example crashes with the following log:

Traceback (most recent call last):
  File ".\test_compushady.py", line 74, in <module>
    swapchain.present(target)
  File "C:\Users\franc\AppData\Local\Programs\Python\Python38\lib\site-packages\compushady\__init__.py", line 202, in present
    self.handle.present(resource.handle, x, y)
Exception: unable to Present() Swapchain: The GPU device instance has been suspended. Use GetDeviceRemovedReason to determine the appropriate action.

ID3D12CommandQueue::ExecuteCommandLists: Command lists must be successfully closed before execution.
ID3D12Device::RemoveDevice: Device removal has been triggered for the following reason (DXGI_ERROR_INVALID_CALL: There is strong evidence that the application has performed an illegal or undefined operation, and such a condition could not be returned to the application cleanly through a return code).
ID3D12GraphicsCommandList::*: This API cannot be called on a closed command list.
ID3D12GraphicsCommandList::*: This API cannot be called on a closed command list.
ID3D12GraphicsCommandList::*: This API cannot be called on a closed command list.
ID3D12CommandList::Dispatch: No pipeline state has been set in this command list.  The runtime will use a default no-op pipeline state.
ID3D12GraphicsCommandList::*: This API cannot be called on a closed command list.
ID3D12GraphicsCommandList::*: This API cannot be called on a closed command list.
ID3D12CommandQueue::ExecuteCommandLists: Command lists must be successfully closed before execution.
ID3D12GraphicsCommandList::*: This API cannot be called on a closed command list.
ID3D12GraphicsCommandList::*: This API cannot be called on a closed command list.
ID3D12GraphicsCommandList::*: This API cannot be called on a closed command list.
ID3D12GraphicsCommandList::*: This API cannot be called on a closed command list.
ID3D12CommandQueue::ExecuteCommandLists: Command lists must be successfully closed before execution.

Working with float4s

Hi again.

Currently I am attempting to use the "R16G16B16A16_FLOAT" format. I want to work with float4s instead of the uint4s in the demo. In my HLSL file I made the appropriate changes for using float4s.

However, when reading the buffer back to the CPU, I'm not getting what I would expect. With a 1x1 texture I get back a bytes object of length 8. With a 2x2 texture I get a bytes object of length 272. And with an 8x8 texture I get a bytes object of length 1856. All of these confuse me, as I thought I should be getting back a length equal to width x height x depth, where width and height come from the texture size and depth comes from the format (16 x 4 = 64 for R16G16B16A16_FLOAT).

Can you help me understand the result?

HLSL

RWTexture2D<float4> texture : register(u0);
[numthreads(8, 8, 1)]
void main(int3 id : SV_DispatchThreadID)
{
    float4 color = float4(1.0, 2.0, 3.0, 4.0);
    texture[id.xy] = color;
}

I have also made sure that the return buffer uses HEAP_READBACK in the code.
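
The three lengths reported above are consistent with each row of the readback being padded to a 256-byte pitch, with no padding after the last row. That padding is an assumption about the d3d12 backend (256 is the value of D3D12_TEXTURE_DATA_PITCH_ALIGNMENT) and is not confirmed anywhere in this thread. Note also that R16G16B16A16_FLOAT is 64 bits per pixel, which is 8 bytes, not 64. A minimal sketch of the arithmetic in plain Python, where `row_pitch`, `readback_size` and `unpack_half4_rows` are hypothetical helper names and not part of compushady:

```python
import struct

BYTES_PER_PIXEL = 8    # R16G16B16A16_FLOAT: 4 channels x 2 bytes each
PITCH_ALIGNMENT = 256  # assumed row alignment of the readback data


def row_pitch(width, bytes_per_pixel=BYTES_PER_PIXEL, alignment=PITCH_ALIGNMENT):
    """Bytes from the start of one row to the start of the next."""
    row_bytes = width * bytes_per_pixel
    return (row_bytes + alignment - 1) // alignment * alignment


def readback_size(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Total bytes: every row but the last is padded to the row pitch."""
    return (height - 1) * row_pitch(width, bytes_per_pixel) + width * bytes_per_pixel


def unpack_half4_rows(data, width, height):
    """Skip the per-row padding and decode each pixel as four half floats."""
    pitch = row_pitch(width)
    row_bytes = width * BYTES_PER_PIXEL
    rows = []
    for y in range(height):
        start = y * pitch
        # 'e' is the struct format character for a 16-bit float
        values = struct.unpack('<{}e'.format(width * 4), data[start:start + row_bytes])
        rows.append([values[i:i + 4] for i in range(0, len(values), 4)])
    return rows


for side in (1, 2, 8):
    print(side, readback_size(side, side))
# prints:
# 1 8
# 2 272
# 8 1856
```

If the texture object exposes its own row pitch, using that value instead of the hard-coded 256 would be safer, since the alignment may differ between backends.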

create_swapchain: TypeError: function takes exactly 3 arguments (5 given)

Hello,
I was trying the Swapchain example from your readme:

import glfw
from compushady import HEAP_UPLOAD, Buffer, Swapchain, Texture2D
from compushady.formats import B8G8R8A8_UNORM
import platform
import random

glfw.init()
# we do not want implicit OpenGL!
glfw.window_hint(glfw.CLIENT_API, glfw.NO_API)

target = Texture2D(256, 256, B8G8R8A8_UNORM)
random_buffer = Buffer(target.size, HEAP_UPLOAD)

window = glfw.create_window(
    target.width, target.height, 'Random', None, None)

if platform.system() == 'Windows':
    swapchain = Swapchain(glfw.get_win32_window(
        window), B8G8R8A8_UNORM, 3)
else:
    swapchain = Swapchain((glfw.get_x11_display(), glfw.get_x11_window(
        window)), B8G8R8A8_UNORM, 3)

while not glfw.window_should_close(window):
    glfw.poll_events()
    random_buffer.upload(bytes([random.randint(0, 255), random.randint(
        0, 255), random.randint(0, 255), 255]) * (target.size // 4))
    random_buffer.copy_to(target)
    swapchain.present(target)

swapchain = None  # this ensures the swapchain is destroyed before the window

glfw.terminate()

but I got this error:

  File "testing.py", line 18, in <module>
    swapchain = Swapchain(glfw.get_win32_window(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "\venv\Lib\site-packages\compushady\__init__.py", line 199, in __init__
    self.handle = self.device.create_swapchain(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: function takes exactly 3 arguments (5 given)

I can't look deep enough into your code to determine what went wrong; any help is appreciated.
I am using Python 3.11, compushady 0.17.2, Windows 64 bit.

If it is of any interest, I got here because I was looking forward to playing around with this repository but encountered this error:
https://github.com/HenBOMB/Py-Slime-Simulation/tree/main

Can't install via pip: missing metal.m file

Windows and Linux work fine; I'm only having trouble on the Mac.

Collecting compushady
  Using cached compushady-0.17.1.tar.gz (67.1 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: compushady
  Building wheel for compushady (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Building wheel for compushady (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [25 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.macosx-13.0-x86_64-cpython-312
      creating build/lib.macosx-13.0-x86_64-cpython-312/compushady
      copying compushady/config.py -> build/lib.macosx-13.0-x86_64-cpython-312/compushady
      copying compushady/__init__.py -> build/lib.macosx-13.0-x86_64-cpython-312/compushady
      copying compushady/formats.py -> build/lib.macosx-13.0-x86_64-cpython-312/compushady
      creating build/lib.macosx-13.0-x86_64-cpython-312/compushady/shaders
      copying compushady/shaders/__init__.py -> build/lib.macosx-13.0-x86_64-cpython-312/compushady/shaders
      copying compushady/shaders/msl.py -> build/lib.macosx-13.0-x86_64-cpython-312/compushady/shaders
      copying compushady/shaders/hlsl.py -> build/lib.macosx-13.0-x86_64-cpython-312/compushady/shaders
      creating build/lib.macosx-13.0-x86_64-cpython-312/compushady/backends
      copying compushady/backends/libdxcompiler.3.7.dylib -> build/lib.macosx-13.0-x86_64-cpython-312/compushady/backends
      running build_ext
      building 'compushady.backends.metal' extension
      creating build/temp.macosx-13.0-x86_64-cpython-312
      creating build/temp.macosx-13.0-x86_64-cpython-312/compushady
      creating build/temp.macosx-13.0-x86_64-cpython-312/compushady/backends
      clang -fno-strict-overflow -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -O3 -Wall -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX13.sdk -I/Users/farzon/Projects/ipython_demo/.venv/include -I/usr/local/opt/python@3.12/Frameworks/Python.framework/Versions/3.12/include/python3.12 -c compushady/backends/common.cpp -o build/temp.macosx-13.0-x86_64-cpython-312/compushady/backends/common.o -ObjC++ -Wno-unguarded-availability-new
      clang -fno-strict-overflow -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -O3 -Wall -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX13.sdk -I/Users/farzon/Projects/ipython_demo/.venv/include -I/usr/local/opt/python@3.12/Frameworks/Python.framework/Versions/3.12/include/python3.12 -c compushady/backends/metal.m -o build/temp.macosx-13.0-x86_64-cpython-312/compushady/backends/metal.o -ObjC++ -Wno-unguarded-availability-new
      clang: error: no such file or directory: 'compushady/backends/metal.m'
      clang: error: no input files
      error: command '/usr/local/bin/clang' failed with exit code 1
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for compushady
