This repo hosts the source for the DirectX Shader Compiler which is based on LLVM/Clang.

directxshadercompiler's Introduction

DirectX Shader Compiler

The DirectX Shader Compiler project includes a compiler and related tools used to compile High-Level Shader Language (HLSL) programs into DirectX Intermediate Language (DXIL) representation. Applications that make use of DirectX for graphics, games, and computation can use it to generate shader programs.

For more information, see the Wiki.

Visit the DirectX Landing Page for more resources for DirectX developers.

Features and Goals

The starting point of the project is a fork of the LLVM and Clang projects, modified to accept HLSL and emit a validated container that can be consumed by GPU drivers.

At the moment, the DirectX HLSL Compiler provides the following components:

dxc.exe, a command-line tool that can compile HLSL programs for shader model 6.0 or higher
dxcompiler.dll, a DLL providing a componentized compiler, assembler, disassembler, and validator
dxilconv.dll, a DLL providing a converter from DXBC (older shader bytecode format)
various other tools based on the above components

The Microsoft Windows SDK releases include a supported version of the compiler and validator.

The goal of the project is to allow the broader community of shader developers to contribute to the language and representation of shader programs, maintaining the principles of compatibility and supportability for the platform. It's currently in active development across two axes: language evolution (with no impact to DXIL representation), and surfacing hardware capabilities (with impact to DXIL, and thus requiring coordination with GPU implementations).

Pre-built Releases

Development kits containing only the dxc.exe driver app, the dxcompiler.dll, and the dxil.dll signing binary are available here, or in the releases tab.

SPIR-V CodeGen

As an example of community contribution, this project can also target the SPIR-V intermediate representation. Please see the doc for how HLSL features are mapped to SPIR-V, and the wiki page for how to build, use, and contribute to the SPIR-V CodeGen.

Building Sources

See the full documentation for Building and testing DXC for detailed instructions.

Running Shaders

To run shaders compiled as DXIL, you will need support from the operating system as well as from the driver for your graphics adapter. Windows 10 Creators Update is the first version to support DXIL shaders. See the Wiki for information on using experimental support or the software adapter.

Hardware Support

Hardware GPU support for DXIL is provided by the following vendors:

NVIDIA

NVIDIA's r396 drivers (r397.64 and later) provide release mode support for DXIL 1.1 and Shader Model 6.1 on Win10 1709 and later, and experimental mode support for DXIL 1.2 and Shader Model 6.2 on Win10 1803 and later. These drivers also support DXR in experimental mode.

Drivers can be downloaded from geforce.com.

AMD

AMD’s driver (Radeon Software Adrenalin Edition 18.4.1 or later) provides release mode support for DXIL 1.1 and Shader Model 6.1. Drivers can be downloaded from AMD's download site.

Intel

Intel's 15.60 drivers (15.60.0.4849 and later) support release mode for DXIL 1.0 and Shader Model 6.0 as well as release mode for DXIL 1.1 and Shader Model 6.1 (View Instancing support only).

Drivers can be downloaded from the following link: Intel Graphics Drivers

Direct access to the 15.60 driver (latest as of this update) is provided below:

Installer

Release Notes related to DXIL

Making Changes

To make contributions, see the CONTRIBUTING.md file in this project.

Documentation

You can find documentation for this project in the docs directory. These contain the original LLVM documentation files, as well as three new files worth noting:

HLSLChanges.rst: this is the starting point for how this fork diverges from the original llvm/clang sources
DXIL.rst: this file contains the specification for the DXIL format
tools/clang/docs/UsingDxc.rst: this file contains a user guide for dxc.exe

License

DirectX Shader Compiler is distributed under the terms of the University of Illinois Open Source License.

See LICENSE.txt and ThirdPartyNotices.txt for details.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

directxshadercompiler's Issues

dxcompiler produces raw bitcode module when validation is disabled

When validation is disabled (via /Vd on the dxc command line) the output is a plain llvm bitcode module instead of a dxil container. I would expect the output to be a dxil container with unsigned dxil inside.

λ dxc  t.hlsl /Fo t.fxc /Tps_5_0
λ file t.fxc
t.fxc: data
λ Powershell -Command "Get-Content t.fxc -Encoding Byte -TotalCount 4 | Format-Hex"
  
           Path:
           00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000   44 58 42 43                                      DXBC
  
 
λ dxc /Vd t.hlsl /Fo t2.fxc /Tps_5_0
λ file t2.fxc
t2.fxc: LLVM IR bitcode
λ Powershell -Command "Get-Content t2.fxc -Encoding Byte -TotalCount 4 | Format-Hex"
 
           Path:
           00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000   42 43 C0 DE                                      BCÀÞ
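
The two outputs above differ in their first four bytes, so a build script can tell them apart without the file utility. A minimal sketch, assuming only the magic values visible in the two hex dumps; the function name identify_shader_blob is made up for illustration and is not part of any DXC API:

```python
DXBC_MAGIC = b"DXBC"             # 44 58 42 43, as in the first dump
BITCODE_MAGIC = b"BC\xc0\xde"    # 42 43 C0 DE, as in the second dump

def identify_shader_blob(data: bytes) -> str:
    """Classify a compiled shader blob by its first four bytes."""
    head = data[:4]
    if head == DXBC_MAGIC:
        return "container"
    if head == BITCODE_MAGIC:
        return "raw bitcode"
    return "unknown"
```

Calling it on the contents of t.fxc and t2.fxc would be expected to give "container" and "raw bitcode" respectively, matching what file reports.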

hctstart should include python if missing

It already does the work for most other installed dependencies, but it seems python is missing.

Strange #include/file system behaviour

I'm getting some strange behaviour when #include-ing files from the file system using the dxc command-line app, and it is somewhat hard to describe.

First, the repro:

First, download this file, rename it to .zip (GitHub doesn't like zip files, apparently), and extract it somewhere. OddIncludes.txt
The repro relies on using absolute paths, so I can't just give a command line to run directly, but assuming the zip was extracted to c:\temp, run the following command (assumes dxc.exe is on the path):

dxc /T ps_6_0 c:\temp\OddIncludes\main.hlsl /I c:\temp

This should generate an error:

c:\temp\OddIncludes\main.hlsl:1:10: fatal error: 'include.hlsl' file not found
#include "include.hlsl"
         ^

This is erroneous -- include.hlsl is in the same directory as main.hlsl, so this should work fine. However, you can "fix" this error by changing the case of the directories. So this command works (note capital T):

dxc /T ps_6_0 c:\temp\OddIncludes\main.hlsl /I c:\Temp

Ok, so now the explanation. Apologies if this is very confusing! The first thing to note is that the (in this case superfluous) additional include directory specification is critical to the repro. Additional include directories are processed before any source files (this matters because the file system has lots of caching). When it tries to get the file info for this path, it ends up in DxcArgsFileSystem::CreateFileW. That code does a substring search of the path against the input filename. If it matches (in this case it does), it assumes it is the parent of the source and returns the special handle SourceParentDirHandle. This is added to the cache with the path c:\temp.

Later on, we come to look up the directory for including the file. This is also a parent of the source file, and so also matches the substring match, and so also returns SourceParentDirHandle. This is then a cache hit, so instead of returning the directory asked for, it returns c:\temp again. The include.hlsl file is not found there, giving the error.

Now, the reason changing the case "fixes" things is that all the substring comparisons in dxcompilerobj.cpp are case sensitive! So by changing the case, the substring comparisons fail, and it falls through to just looking up the file in the file system, and things work correctly.

Hopefully that made some sense! Basically, I think the logic surrounding SourceParentDirHandle is broken. I do not know the best way to fix it, though, I'm afraid.
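
The interaction described above can be reproduced in a toy model. This is only a sketch of the reported behaviour, not the real DxcArgsFileSystem code: the class ToyFileSystem, its handle_to_dir cache, and resolve_include_dir are all hypothetical names, and the cache is assumed to be keyed by handle with the first registration winning.

```python
SOURCE_PARENT_DIR_HANDLE = "SourceParentDirHandle"  # stand-in for the special handle

class ToyFileSystem:
    """Hypothetical model of the lookup described above."""

    def __init__(self, source_path):
        self.source_path = source_path
        self.handle_to_dir = {}  # cache: handle -> first directory that produced it

    def open_dir(self, path):
        # Case-sensitive substring test against the source file name.
        if path in self.source_path:
            handle = SOURCE_PARENT_DIR_HANDLE
        else:
            handle = path  # fall through to a "real" per-directory handle
        # First registration wins; later lookups with the same handle are cache hits.
        return self.handle_to_dir.setdefault(handle, path)

def resolve_include_dir(include_dir, source_path):
    fs = ToyFileSystem(source_path)
    fs.open_dir(include_dir)                 # /I directories are processed first
    parent = source_path.rsplit("\\", 1)[0]  # directory containing the source file
    return fs.open_dir(parent)               # directory searched for include.hlsl
```

With /I c:\temp the model returns c:\temp instead of c:\temp\OddIncludes; with /I c:\Temp the substring test fails for the include directory and the correct directory comes back.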

Include mechanism inherently tied to file system

The new include file handling mechanism (i.e. IDxcIncludeHandler) seems to have significantly less expressive power than the old ID3DInclude. Specifically, it receives already-resolved absolute file system paths, not the raw paths specified in the source file.

Our shaders use virtual paths that operate using our internal virtual file system, and do not necessarily correspond to real file system paths. To use the new interface we will have to somehow invent dummy include search paths and reverse-map them to virtual paths. Previously we just got the raw path and passed it to the virtual file system directly.

Another common requirement is mapping "special" include names to files that only exist in-memory (e.g. they have been generated by some internal process). I believe Unreal does this currently, for example. This could be hacked in using the current mechanism by stripping the path information and just looking at the filename, but it seems ugly and error prone to have to do it that way.
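
To make the request concrete, here is a rough sketch of the kind of resolver we would like to be able to plug in, assuming the handler were given the raw include string exactly as written in the source. VirtualIncludeResolver, register_generated, and load are hypothetical names and do not correspond to IDxcIncludeHandler or ID3DInclude methods.

```python
class VirtualIncludeResolver:
    """Hypothetical resolver that is handed the raw include string, unmodified."""

    def __init__(self, virtual_fs):
        self.virtual_fs = virtual_fs  # callable: virtual path -> source text or None
        self.in_memory = {}           # special names -> generated source text

    def register_generated(self, name, text):
        self.in_memory[name] = text

    def load(self, raw_path):
        # Generated files take priority and never touch the file system.
        if raw_path in self.in_memory:
            return self.in_memory[raw_path]
        text = self.virtual_fs(raw_path)
        if text is None:
            raise FileNotFoundError(raw_path)
        return text
```

Because the lookup key is the raw string, neither the virtual file system nor the in-memory table needs to reverse-map an absolute path.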

SV_Position must be float4

Using the min16float4 type for SV_Position results in a validation error. The legacy compiler allowed min precision with this semantic. Is the new behavior expected?

min16float4 main(min16float4 position : SV_Position, min16float2 texcoord : Texcoord) : SV_Target
{
 min16float r = texcoord.r * 0.5;
 min16float g = texcoord.g + 0.5;
 return min16float4(r, g, 0, 1);
}

Setting up continuous integration

It would be nice to have a bot to automatically check pull requests to make sure they are not breaking the master branch. Appveyor is quite popular on GitHub as a Windows continuous integration service, and it is already used by vscode. Is it possible to enable Appveyor for this project? Or are there other plans for continuous integration?

cc @dneto0 @ehsannas

Bad codegen for firstbithigh and firstbitlow

We generate the wrong code for firstbithigh and inefficient code for firstbitlow.

firstbitlow

The hlsl firstbitlow function returns the first bit set from the lsb. If no bit is set it returns -1.

The dxil FirstbitLo intrinsic returns the first bit set from the lsb. If no bit is set it returns -1.

For firstbitlow, we generate this

  %FirstbitLo = call i32 @dx.op.unaryBits.i32(i32 32, i32 %0)  ; FirstbitLo(value)
  %1 = icmp ne i32 %0, 0
  %2 = select i1 %1, i32 %FirstbitLo, i32 -1

Which seems reasonable, but the select is redundant. If the value is 0 then the select will choose -1. But if the value is 0 the FirstbitLo intrinsic will return -1. So regardless of the input value we can take the result of the intrinsic. This would match the code produced by fxc.
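
The redundancy can be checked with a small model of the behaviour described above. This is a sketch only: firstbit_lo models the stated FirstbitLo semantics on 32-bit inputs, and emitted_sequence mirrors the three IR instructions; both function names are invented for illustration.

```python
MASK32 = 0xFFFFFFFF

def firstbit_lo(value):
    """Index of the lowest set bit counted from the lsb, or -1 if no bit is set."""
    value &= MASK32
    if value == 0:
        return -1
    return (value & -value).bit_length() - 1

def emitted_sequence(value):
    """Mirrors the IR above: intrinsic call, icmp ne 0, select."""
    result = firstbit_lo(value)
    return result if (value & MASK32) != 0 else -1
```

For every input the two functions agree, which is why the select can be dropped.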

firstbithigh

The hlsl firstbithigh function changes behavior depending on the sign. For unsigned values it returns the index of the first bit set from the msb. However, the index starts from the lsb. For signed values if the value is negative it returns the index of first 0 from the msb, otherwise it returns the index of the first 1. Again all indexes relative to the lsb. If no 1 is found (or 0 for signed) then -1 is returned.

The dxil FirstbitHi intrinsic returns the first bit set from the msb. If no bit is set it returns -1. The index is relative to the msb.

The dxil FirstbitSHi intrinsic depends on the sign of the value. If the value is negative it returns the first 0 found from the msb, or -1 if no 0 is found. If the value is positive it returns the first 1 found, or -1 if no 1 is found. The index is relative to the msb.

The codegen for unsigned firstbithigh looks ok:

  %FirstbitHi = call i32 @dx.op.unaryBits.i32(i32 33, i32 %0)  ; FirstbitHi(value)
  %1 = sub i32 31, %FirstbitHi
  %2 = icmp ne i32 %0, 0
  %3 = select i1 %2, i32 %1, i32 -1

The index returned from the FirstbitHi intrinsic is offset by 31 to produce an index based on the lsb. This matches the codegen from fxc.

The codegen for signed firstbithigh looks wrong:

  %FirstbitSHi = call i32 @dx.op.unaryBits.i32(i32 34, i32 %0)  ; FirstbitSHi(value)
  %1 = sub i32 31, %FirstbitSHi
  %2 = icmp ne i32 %0, 0
  %3 = select i1 %2, i32 %1, i32 -1

The problem is that it only handles the "not-found" case for positive numbers. For a negative number with no 0 bits (i.e. 0xffffffff for int) the codegen will produce a value of 32 instead of -1 (because 31-(-1) == 32). Instead of checking the input value for 0, we should check the value returned by the intrinsic for -1 and use that comparison for the select. This would match what fxc does.
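
The difference between the current and the proposed sequence can be seen in a small model. This is a sketch under the semantics stated above, not compiler code: firstbit_shi models the FirstbitSHi description for 32-bit inputs, and current_codegen and proposed_codegen are illustrative names for the two select strategies.

```python
MASK32 = 0xFFFFFFFF

def firstbit_shi(value):
    """Model of FirstbitSHi as described: msb-relative index, -1 if not found."""
    value &= MASK32
    if value & 0x80000000:
        value = ~value & MASK32      # negative input: search for the first 0 bit
    if value == 0:
        return -1
    return 32 - value.bit_length()   # index counted from the msb

def current_codegen(value):
    """sub 31, then select on 'input != 0' (the sequence shown above)."""
    offset = 31 - firstbit_shi(value)
    return offset if (value & MASK32) != 0 else -1

def proposed_codegen(value):
    """sub 31, then select on 'intrinsic result != -1'."""
    raw = firstbit_shi(value)
    return 31 - raw if raw != -1 else -1
```

For 0xffffffff the current sequence yields 32 while the proposed one yields -1; for inputs such as 0, 1, or 0xfffffffe the two agree.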

Assert indexing vector in input struct

I'm getting an assert "Calling a function with a bad signature!" that is very confusing, but seems to be related to indexing into a vector that's inside a struct that's passed as input to the shader. I managed to boil it down to the following repro (compile with /T ps_6_0):

struct Input
{
    float2 v : TEXCOORD0;
};

float4 main(Input input) : SV_Target
{
    return input.v[0];
}

ExecutionTest::UnaryFloatOpTest#atan expects non-NaN result for NaN input

The reference values for ExecutionTest::UnaryFloatOpTest#atan expect a NaN input to produce a non-NaN output:

[ShaderOpArithTable.xml]

        <Parameter Name="Validation.Input">
          <Value>NaN</Value>
          ...
        </Parameter>
        <Parameter Name="Validation.Expected">
          <Value>0.785410</Value>
          ...
        </Parameter>

According to the DXIL spec, Atan of NaN should produce NaN.
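
For comparison, a NaN-aware check of a reference value might look like the sketch below. It uses Python's math.atan as a stand-in for the shader operation; the helper name atan_matches_expected and the 1e-4 tolerance are assumptions for illustration, not values taken from the test harness.

```python
import math

def atan_matches_expected(input_value, expected, tolerance=1e-4):
    """Compare atan(input) to a reference value; NaN only matches NaN."""
    result = math.atan(input_value)
    if math.isnan(result) or math.isnan(expected):
        return math.isnan(result) and math.isnan(expected)
    return abs(result - expected) <= tolerance
```

Under this check a NaN input paired with an expected value of 0.785410 fails, while a NaN input paired with an expected NaN passes.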

Doubts about int64 DXIL support?

Hi,
I don't know if this is the correct place to ask, but seeing that the compiler supports int64 in addition to the expected wave ops in Shader Model 6, what is the expected ETA for int64 support?
Is it a Shader Model 6.0 feature, or will it be part of Shader Model 6.1, 6.5, 7 or whatever comes next?
Also, will it be in the Creators Update coming this spring, or will it ship later, like Redstone 3 toward the end of the year?
Thanks.

License header has been rewritten on files

For example:
https://github.com/Microsoft/DirectXShaderCompiler/blob/master/utils/PerfectShuffle/PerfectShuffle.cpp

vs.

https://github.com/llvm-mirror/llvm/blob/master/utils/PerfectShuffle/PerfectShuffle.cpp

This seems like a violation of the original LLVM license.

LLVM assert when using the [unroll] attribute

I'm hitting an LLVM assert when compiling a shader that uses the parameterized version of the unroll attribute, where the loop count comes from a local constant variable. If I go past the assert, the compiler crashes shortly afterwards trying to dereference a null pointer. Here's a simple example vertex shader that triggers the assert:

float4 VSMain() : SV_Position
{
    float4 position = 0.0f;

    const uint loopCount = 4;

    [unroll(loopCount)]
    for(uint i = 0; i < loopCount; ++i)
        position.x += 1.0f;

    return position;
}

This is the callstack that I get when I hit the assert:

dxcompiler.dll!llvm_assert(const char * _Message, const char * _File, unsigned int _Line) Line 17   C++
dxcompiler.dll!llvm::isa_impl_cl<clang::VarDecl,clang::NamedDecl const * __ptr64>::doit(const clang::NamedDecl * Val) Line 95   C++
dxcompiler.dll!llvm::isa_impl_wrap<clang::VarDecl,clang::NamedDecl const * __ptr64,clang::NamedDecl const * __ptr64>::doit(const clang::NamedDecl * const & Val) Line 123   C++
dxcompiler.dll!llvm::isa_impl_wrap<clang::VarDecl,clang::NamedDecl * __ptr64 const,clang::NamedDecl const * __ptr64>::doit(clang::NamedDecl * const & Val) Line 115 C++
dxcompiler.dll!llvm::isa<clang::VarDecl,clang::NamedDecl * __ptr64>(clang::NamedDecl * const & Val) Line 135    C++
dxcompiler.dll!llvm::dyn_cast<clang::VarDecl,clang::NamedDecl>(clang::NamedDecl * Val) Line 299 C++
dxcompiler.dll!ValidateAttributeIntArg(clang::Sema & S, const clang::AttributeList & Attr, unsigned int index) Line 9398    C++
dxcompiler.dll!HandleUnrollAttribute(clang::Sema & S, const clang::AttributeList & Attr) Line 9636  C++
dxcompiler.dll!hlsl::ProcessStmtAttributeForHLSL(clang::Sema & S, clang::Stmt * St, const clang::AttributeList & A, clang::SourceRange Range, bool & Handled) Line 9969 C++
dxcompiler.dll!ProcessStmtAttribute(clang::Sema & S, clang::Stmt * St, const clang::AttributeList & A, clang::SourceRange Range) Line 212   C++
dxcompiler.dll!clang::Sema::ProcessStmtAttributes(clang::Stmt * S, clang::AttributeList * AttrList, clang::SourceRange Range) Line 229  C++
dxcompiler.dll!clang::Parser::ParseStatementOrDeclaration(llvm::SmallVector<clang::Stmt *,32> & Stmts, bool OnlyStatement, clang::SourceLocation * TrailingElseLoc) Line 116    C++
dxcompiler.dll!clang::Parser::ParseCompoundStatementBody(bool isStmtExpr) Line 1025 C++
dxcompiler.dll!clang::Parser::ParseFunctionStatementBody(clang::Decl * Decl, clang::Parser::ParseScope & BodyScope) Line 1962   C++
dxcompiler.dll!clang::Parser::ParseFunctionDefinition(clang::ParsingDeclarator & D, const clang::Parser::ParsedTemplateInfo & TemplateInfo, clang::Parser::LateParsedAttrList * LateParsedAttrs) Line 1174  C++
dxcompiler.dll!clang::Parser::ParseDeclGroup(clang::ParsingDeclSpec & DS, unsigned int Context, clang::SourceLocation * DeclEnd, clang::Parser::ForRangeInit * FRI) Line 2277   C++
dxcompiler.dll!clang::Parser::ParseDeclOrFunctionDefInternal(clang::Parser::ParsedAttributesWithRange & attrs, clang::ParsingDeclSpec & DS, clang::AccessSpecifier AS) Line 962 C++
dxcompiler.dll!clang::Parser::ParseDeclarationOrFunctionDefinition(clang::Parser::ParsedAttributesWithRange & attrs, clang::ParsingDeclSpec * DS, clang::AccessSpecifier AS) Line 978   C++
dxcompiler.dll!clang::Parser::ParseExternalDeclaration(clang::Parser::ParsedAttributesWithRange & attrs, clang::ParsingDeclSpec * DS) Line 836  C++
dxcompiler.dll!clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef> & Result) Line 606    C++
dxcompiler.dll!clang::ParseAST(clang::Sema & S, bool PrintStats, bool SkipFunctionBodies) Line 146  C++
dxcompiler.dll!clang::ASTFrontendAction::ExecuteAction() Line 556   C++
dxcompiler.dll!clang::CodeGenAction::ExecuteAction() Line 747   C++
dxcompiler.dll!clang::FrontendAction::Execute() Line 468    C++
dxcompiler.dll!DxcCompiler::Compile(IDxcBlob * pSource, const wchar_t * pSourceName, const wchar_t * pEntryPoint, const wchar_t * pTargetProfile, const wchar_t * * pArguments, unsigned int argCount, const DxcDefine * pDefines, unsigned int defineCount, IDxcIncludeHandler * pIncludeHandler, IDxcOperationResult * * ppResult) Line 2204  C++

It compiles fine if I use the un-parameterized "[unroll]" attribute, or if I remove the unroll entirely.

Suffix for half literals?

Using 'h' character as suffix for half literals produces this error:

error: invalid suffix 'h' on floating constant

Is this a bug, or is the half-precision suffix not going to be officially supported?

Cheers.

ShaderOpTest.{cpp,h} does not support hardware devices

The DXIL tests in clang-hlsl-tests.dll do not all support running on hardware (without source modifications).

ShaderOpTest.h:

 bool UseWarpDevice = true;

ShaderOpTest.cpp:

    if (m_pShaderOp->UseWarpDevice) {
      CComPtr<IDXGIAdapter> warpAdapter;
      CHECK_HR(factory->EnumWarpAdapter(IID_PPV_ARGS(&warpAdapter)));
      CHECK_HR(D3D12CreateDevice(warpAdapter, FeatureLevelRequired,
                                 IID_PPV_ARGS(&pDevice)));
    } else {
      CComPtr<IDXGIAdapter1> hardwareAdapter;
      GetHardwareAdapter(factory, m_pShaderOp->AdapterName, &hardwareAdapter);
      if (hardwareAdapter == nullptr) {
        CHECK_HR(HRESULT_FROM_WIN32(ERROR_NOT_FOUND));
      }
      CHECK_HR(D3D12CreateDevice(hardwareAdapter, FeatureLevelRequired,
                                 IID_PPV_ARGS(&pDevice)));
    }

There are no other accesses (in particular, writes) to UseWarpDevice, meaning all runs will use WARP, regardless of hardware support.

Dxil documentation updates

Three updates to the docs that would be useful:

The FMad intrinsic is not a "fused" instruction (i.e. there is a rounding step between the multiply and the add). Only the Fma instruction is a fused operation.
The Log intrinsic is log base 2 unlike the hlsl log function which is log base e.
The Exp intrinsic is base 2 (i.e. 2^x) unlike the hlsl exp function which is base e.
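
The base difference in the last two items comes down to a constant factor. A minimal sketch using Python's math module as a stand-in for the base-2 intrinsics; the function names are illustrative only:

```python
import math

def natural_log_via_log2(x):
    """Natural log built from a base-2 log: ln(x) = log2(x) * ln(2)."""
    return math.log2(x) * math.log(2.0)

def natural_exp_via_exp2(x):
    """e**x built from a base-2 exponential: e**x = 2**(x * log2(e))."""
    return 2.0 ** (x * math.log2(math.e))
```

Documenting the base matters because anyone mapping hlsl log or exp onto these intrinsics has to apply the ln(2) or log2(e) factor explicitly.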

MIT license still referenced in repo

The MIT license is still in the repo root and referenced in the readme. For consistency's sake it should be removed unless the MIT license is compatible with the existing LLVM license. It would be better overall if the project used a single license, though.

Support for sampler2D, tex2D()

The current DirectXShaderCompiler code contains references and definitions for the legacy texture sampling syntax using sampler2D, tex2D, etc.

However, the shader fails to compile if we use sampler2D / tex2D methods. Is this functionality intentionally unplugged? Can we expect this syntax to be supported by the new compiler?

FAbs and flushing denorms

ExecutionTest::UnaryFloatOpTest#FAbs is expecting denorms to be preserved:

        <Parameter Name="Validation.Input">
          ...
          <Value>-denorm</Value>
          ...
        </Parameter>
        <Parameter Name="Validation.Expected">
          ...
          <Value>denorm</Value>
          ...
        </Parameter>

The DXIL spec does not mention whether denorms must be preserved for FAbs, though most operations do specify flush to zero behavior.
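
The two candidate behaviours differ only for denormal inputs. A small sketch of each, using Python floats to hold single-precision magnitudes; the constants are the standard IEEE 754 single-precision limits, and the function names are made up for illustration:

```python
MIN_NORMAL_F32 = 2.0 ** -126       # smallest positive normal single-precision value
SMALLEST_DENORM_F32 = 2.0 ** -149  # smallest positive single-precision denormal

def fabs_preserving(x):
    """abs() that keeps denormals, as the test's expected value implies."""
    return abs(x)

def fabs_flushing(x):
    """abs() that flushes single-precision denormal inputs to zero."""
    if x != 0.0 and abs(x) < MIN_NORMAL_F32:
        return 0.0
    return abs(x)
```

For an input of -denorm the first returns denorm and the second returns 0, so the test outcome depends on which behaviour the spec intends.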

Textures inside structs

The new compiler doesn't support textures declared in structs (the stock DX one used to):

// Search defined structure for resource objects and fail
if (IsResourceInType(CGM.getContext(), constDecl->getType())) {
  DiagnosticsEngine &Diags = CGM.getDiags();
  unsigned DiagID = Diags.getCustomDiagID(
      DiagnosticsEngine::Error,
      "object types not supported in global aggregate instances, cbuffers, or tbuffers.");
  Diags.Report(constDecl->getLocation(), DiagID);
  return;
}

I wonder if you can roll out a change for this either into the dxil or into a separate branch?

Build scripts misbehave when relative paths are used in hctstart

Running hctstart with relative paths, for example:

hctstart  .  .\bin

has put my environment into a state that causes the hctbuild script to exit immediately with no output. Some quick printf debugging shows it failing on line 10, during the check for a missing source directory. Using absolute paths works as intended.

Intrinsics with different signatures share the same opcode class

While making hctdb compatible with python3 (#173), I fixed a bug in how we check for instruction classes that have different signatures. The fix triggered an assertion indicating we have a problem:

AssertionError: overload signature WavePrefixOp for instruction $oi32$oi8i8 differs from WavePrefixBitCount in i32i32i1

Indeed if I check the intrinsic definitions, they use the same class but take different operands:

        self.add_dxil_op("WavePrefixOp", next_op_idx, "WavePrefixOp", "returns the result of the operation on prior lanes", "hfd8wil", "", [
            db_dxil_param(0, "$o", "", "operation result"),
            db_dxil_param(2, "$o", "value", "input value"),
            db_dxil_param(3, "i8", "op", "0=sum,1=product", enum_name="WaveOpKind", is_const=True),
            db_dxil_param(4, "i8", "sop", "sign of operands", enum_name="SignedOpKind", is_const=True)])

        self.add_dxil_op("WavePrefixBitCount", next_op_idx, "WavePrefixOp", "returns the count of bits set to 1 on prior lanes", "v", "", [
            db_dxil_param(0, "i32", "", "operation result"),
            db_dxil_param(2, "i1", "value", "input value")])
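
The kind of consistency check that surfaces this can be sketched in a few lines. This is not the hctdb implementation: find_class_signature_conflicts is a hypothetical helper, and the signature strings are treated as opaque values copied from the assertion message above.

```python
from collections import defaultdict

def find_class_signature_conflicts(ops):
    """ops: iterable of (op_name, class_name, signature) tuples.

    Returns {class_name: {signature: [op_name, ...]}} for every class whose
    ops do not all share one signature.
    """
    by_class = defaultdict(lambda: defaultdict(list))
    for op_name, class_name, signature in ops:
        by_class[class_name][signature].append(op_name)
    return {
        cls: dict(sigs)
        for cls, sigs in by_class.items()
        if len(sigs) > 1
    }
```

Feeding it the two definitions above, both in class WavePrefixOp but with signatures $oi32$oi8i8 and i32i32i1, reports that class as conflicting.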

dxc fails to compile InterlockedAdd intrinsic

The program below compiles with fxc but fails on dxc with the error: Invalid operation on typed buffer.


// RUN: dxc /Tcs_5_1 /DDX12  %s

struct Foo {
    int a;
    int b;
    int c;
    int d;
};

Buffer<Foo> inputs : register(t1);
RWBuffer< int > g_Intensities : register(u1);

groupshared Foo sharedData;

#ifdef DX12
[RootSignature("DescriptorTable(UAV(u1, numDescriptors=1), SRV(t1), visibility=SHADER_VISIBILITY_ALL)")]
#endif
[ numthreads( 64, 2, 2 ) ]
void main( uint GI : SV_GroupIndex)
{
	sharedData = inputs[GI];
	int rtn;
	InterlockedAdd(sharedData.d, g_Intensities[GI], rtn);
	g_Intensities[GI] = rtn + sharedData.d;
}

Interest on DXIL<->SPIRV translation..

I know DXIL is still in its infancy, but it would be nice if we could get DXIL from existing SPIR-V shaders and, vice versa, SPIR-V from DXIL shaders.
Is there any interest from Microsoft in writing such a bidirectional shader translator, or is it better to ask the Khronos guys?
Thanks.

TGSM race conditions are flagged as errors

In commit #48 we implemented checking for race conditions, but these are flagged as errors. I think the old compiler flagged these as warnings. We should keep that same behavior.

Compiling the program below gives the error:

error: validation errors
t.hlsl:21:15 Race condition writing to shared memory detected, consider making this write conditional

// RUN: dxc /Tcs_5_1 /DDX12  %s

struct Foo {
    int a;
    int b;
    int c;
    int d;
};

Buffer<Foo> inputs : register(t1);
RWBuffer< int > g_Intensities : register(u1);

groupshared Foo sharedData;

#ifdef DX12
[RootSignature("DescriptorTable(UAV(u1, numDescriptors=1), SRV(t1), visibility=SHADER_VISIBILITY_ALL)")]
#endif
[ numthreads( 64, 2, 2 ) ]
void main( uint GI : SV_GroupIndex)
{
	sharedData = inputs[GI];
	int rtn;
	InterlockedAdd(sharedData.d, g_Intensities[GI], rtn);
	g_Intensities[GI] = rtn + sharedData.d;
}

ShaderOp unit tests still force Warp adapter

The RunShaderOpTest function ignores the pDevice passed in as a parameter. This causes it to create its own device based on the UseWarpDevice flag, which is also never overridden from its default value of true. So ShaderOp tests run through the ExecutionTest unit test always use WARP devices.

dxc from dxil-1.0 branch has incorrect input register mapping

For a pixel shader that reads two texture coordinates:

void ps_main
(
  noperspective float2 tex0 : TEXCOORD0,
  noperspective float4 tex1 : TEXCOORD1,
  out float4 color : SV_Target
)
{
  color = tex0.xxyy + tex1;
}

I am seeing the following input signature in DXIL when using dxc from dxil-1.0:

;
; Input signature:
;
; Name                 Index   Mask Register SysValue  Format   Used
; -------------------- ----- ------ -------- -------- ------- ------
; TEXCOORD                 1   xyzw        0     NONE   float       
; TEXCOORD                 0   xy          1     NONE   float       
;

Notice how index 1 maps to register 0, and index 0 maps to register 1.

If I compile the same shader using the DXIL 0.7 front-end, I see:

;
; Input signature:
;
; Name                 Index   Mask Register SysValue  Format   Used
; -------------------- ----- ------ -------- -------- ------- ------
; TEXCOORD                 0   xy          0     NONE   float       
; TEXCOORD                 1   xyzw        1     NONE   float       
;

which is what I would expect.

Multithreaded shader compilation

Is it expected behavior that calling IDxcCompiler::Compile from multiple threads on the same IDxcCompiler object results in a crash?

Stack trace:

dxcompiler.dll!llvm::PassRegistry::registerPass(const llvm::PassInfo & PI, bool ShouldFree) Line 82
dxcompiler.dll!initializeReducibilityAnalysisPassOnce(llvm::PassRegistry & Registry) Line 235
dxcompiler.dll!llvm::initializeReducibilityAnalysisPass(llvm::PassRegistry & Registry) Line 235
dxcompiler.dll!llvm::IsReducible(const llvm::Module & M, llvm::IrreducibilityAction Action) Line 245
dxcompiler.dll!hlsl::ValidateFlowControl(hlsl::ValidationContext & ValCtx) Line 3850
dxcompiler.dll!hlsl::ValidateDxilModule(llvm::Module * pModule, llvm::Module * pDebugModule) Line 3955
dxcompiler.dll!DxcValidator::RunValidation(IDxcBlob * pShader, llvm::Module * pModule, llvm::Module * pDebugModule, hlsl::AbstractMemoryStream * pDiagStream) Line 259
dxcompiler.dll!DxcValidator::ValidateWithOptModules(IDxcBlob * pShader, unsigned int Flags, llvm::Module * pModule, llvm::Module * pDiagModule, IDxcOperationResult * * ppResult) Line 133
dxcompiler.dll!DxcCompiler::Compile(IDxcBlob * pSource, const wchar_t * pSourceName, const wchar_t * pEntryPoint, const wchar_t * pTargetProfile, const wchar_t * * pArguments, unsigned int argCount, const DxcDefine * pDefines, unsigned int defineCount, IDxcIncludeHandler * pIncludeHandler, IDxcOperationResult * * ppResult) Line 2116

Repro gist:
https://gist.github.com/michalo2882/75691893f307162f3b40548d8d87f036

Interactive Samples

Are there any samples available that not only compile shaders with the experimental compiler but also run them? A simple 'Hello Triangle ps_6_0' would suffice. It's a little hard to work through the LLVM code and the scarce docs.

constness violation with modify-assign operators

I used this online interface, which uses the latest DirectX Shader Compiler as far as I know, to verify my assumption: tryhlsl.azurewebsites.net/

The following shader code should output a compiler error, but instead the modify-assign operation is ignored by the compiler:

float4 PSMain(PSInput input) : SV_TARGET
{
    const float c = 2.0;
    c += 3.0; // ERROR (ignored by the compiler)
    return (float4)c;
}

While the code above is compiled, and the output value is a constant 2.0 (which is obviously wrong),
the following code is rejected by the compiler with an error output:

float4 PSMain(PSInput input) : SV_TARGET
{
    const float c = 2.0;
    c = c + 3.0; // ERROR (correctly rejected by the compiler)
    return (float4)c;
}

This bug is present in fxc and also in the new dxc.

D3DCompile bridge ignores include file handler

The compatibility bridge for D3DCompile (and friends) ignores the include file handler passed to it (i.e. the ID3DInclude *pInclude passed to BridgeD3DCompile). See CompileFromBlob -- the pInclude parameter is unreferenced.

This means that we can't use this layer at all, because we rely on custom include file resolution to find our source files.

Is this a deliberate omission, or is it just something that hasn't been gotten to yet?

if-condition not restricted to scalar

I'm not sure if this is already fixed, but it seems that dxc allows vector and matrix expressions in if-conditions.
However, in the old fxc this is only allowed within the ternary operator '?:' like this:

float4 PS() : SV_Target
{
    float2 a = float2(1, 2);
    float2 b = float2(2, 1);

    // Returns 'float4(3, 9, 3, 9)'
    return (a < b ? 3 : 9).xyxy;
}
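
As a rough illustration of the component-wise evaluation described above, the following C++ sketch mimics how `a < b ? 3 : 9` could be evaluated per component and then swizzled with `.xyxy`. The names `Float2`, `Float4`, `select_lt` and `xyxy` are hypothetical helpers invented for this example; they are not part of HLSL, fxc or dxc.

```cpp
// Hypothetical stand-ins for HLSL float2/float4; not real HLSL or dxc types.
struct Float2 { float x, y; };
struct Float4 { float x, y, z, w; };

// Component-wise "a < b ? t : f": each lane is selected independently.
constexpr Float2 select_lt(Float2 a, Float2 b, float t, float f) {
    return Float2{a.x < b.x ? t : f, a.y < b.y ? t : f};
}

// Equivalent of the .xyxy swizzle on a two-component vector.
constexpr Float4 xyxy(Float2 v) {
    return Float4{v.x, v.y, v.x, v.y};
}

// Same inputs as the shader above: a = (1, 2), b = (2, 1).
constexpr Float4 result =
    xyxy(select_lt(Float2{1.0f, 2.0f}, Float2{2.0f, 1.0f}, 3.0f, 9.0f));

static_assert(result.x == 3.0f && result.y == 9.0f &&
              result.z == 3.0f && result.w == 9.0f,
              "matches the float4(3, 9, 3, 9) noted in the shader comment");
```

A plain if statement can only take one branch, so it has no per-component equivalent of this selection.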

For the ternary operator this seems quite reasonable, but for a standard if-condition it doesn't make much sense and is rather confusing:

float4 PS() : SV_Target
{
    float2 a = float2(1, 2);
    float2 b = float2(2, 1);

    // This should be rejected, but is silently evaluated to 'true'
    if (a < b)
        return 3;
    else
        return 9;
}

Greetings,
Lukas

Validation error with dynamic loops, array access and swizzling

I'm hitting validation errors with a somewhat-obscure combination of dynamic looping, vector swizzles and dynamic array indexing. I managed to reduce the repro to the following:

float4 main() : SV_Target
{
    float array[2];
    float4 vec = 0;

    for (uint i = 0; i < 1; ++i)
    {
        array[i] = vec.xyzw[i];
    }

    return array[0];
}

Compile with /T ps_6_0 /Gfp. The /Gfp (prefer flow control) is essential -- it compiles fine without. Error is:

error: validation errors
Vector type '<4 x float>' is not allowed

Validation failed.

Call to f32tof16 with min16float2 value crashes compiler

It will compile with the explicit cast added.

struct PSInput
{
    float4 position : SV_POSITION;
    float4 color : COLOR;
};

float4 PSMain(PSInput input) : SV_TARGET
{
    min16float2 v = min16float2(input.color.xy);
    uint2 u = f32tof16(v); // <-- Problem is calling f32tof16 without explicitly casting v to float2
    //uint2 u = f32tof16(float2(v)); // <-- This works.
    return input.color + u.xyxy;
}

Add SPIR-V codegen

We are Google engineers working on Vulkan/SPIR-V developer tools. We would like to extend DirectXShaderCompiler so it can compile HLSL into SPIR-V for Vulkan.

HLSL is the prevalent shading language nowadays and DirectXShaderCompiler is the reference compiler for HLSL. Therefore we believe this effort will benefit the general graphics ecosystem.

The general approach is to directly translate frontend AST into SPIR-V words. Main components involved include:

SPIR-V Builder classes
SPIR-V codegen from AST
Parser support for key Vulkan concepts
- E.g., descriptor bindings and specialization constants
AST semantic analysis for Vulkan
- E.g., checking constraints and possibly transforming the AST to map to Vulkan semantics
Associated tests

This does not reflect the order of implementation. Currently we have the basic SPIR-V infrastructure (SPIR-V builder classes, ASTFrontendAction, and command line integration) ready and will upstream them soon.

For more details about the logistics and design choices, please see docs/SPIR-V.rst.

cc @dneto0 @ehsannas @jfroy

strange error output on recursive function calls

Recursive calls are not allowed in shading languages, but the compiler should report a meaningful error in that case.

I use the following example code with the online HLSL compiler (tryhlsl.azurewebsites.net):

float4 f(); // Forward declaration
float4 g() { return f(); } // Recursive call to 'f'
float4 f() { return g(); } // First call to 'g'

float4 VS() : SV_Position
{
    return f(); // First call to 'f'
}

The error output of the old HLSL compiler with target "vs_5_1" is:

D:\Windows\system32\unknown(4,8-10): error X3500: 'f': recursive functions not allowed in vs_5_1

But the error output of the new HLSL compiler with target "vs_6_0" is:

error: validation errors
Function ?f@@YA?AV?$vector@M$03@@XZ.flat with parameter is not permitted, it should be inlined
Not all elements of output SV_Position were written
Not all elements of SV_Position were written

I assume that this is not the intended output.

Crash when using "a * (b)" expressions

The following snippet crashes the compiler. The issue doesn't occur when we remove the parentheses from the multiplication expression.

RWByteAddressBuffer rawBuffer;

[numthreads(1, 1, 1)]
void main()
{
    rawBuffer.Store(4 * (0), 111);
}

IsFinite and NaN

ExecutionTest::UnaryFloatOpTest#Isfinite is expecting a NaN input to produce "1" (i.e. the value is finite):

        <Parameter Name="Validation.Input">
          <Value>NaN</Value>
          ...
        </Parameter>
        <Parameter Name="Validation.Expected">
          <Value>1</Value>
          ...
        </Parameter>

The DXIL spec does not state explicitly whether NaN is finite or not, though NaN is traditionally not a finite value: cmath reference.
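
For comparison, here is a minimal C++ sketch of the conventional classification, in which NaN and the infinities are not finite. The function `is_finite_sketch` is a hypothetical helper written with plain comparisons so that it can be checked at compile time. `std::isfinite` from `<cmath>` gives the same answers for these inputs. This only illustrates the cmath convention; it makes no claim about what the DXIL spec intends.

```cpp
#include <limits>

// Finiteness test built from comparisons only (hypothetical helper).
// NaN compares unequal to itself, so the first clause rejects it.
constexpr bool is_finite_sketch(float x) {
    return x == x                                         // false for NaN
        && x != std::numeric_limits<float>::infinity()    // false for +inf
        && x != -std::numeric_limits<float>::infinity();  // false for -inf
}

static_assert(is_finite_sketch(1.0f), "ordinary values are finite");
static_assert(!is_finite_sketch(std::numeric_limits<float>::infinity()),
              "infinity is not finite");
static_assert(!is_finite_sketch(std::numeric_limits<float>::quiet_NaN()),
              "NaN is not finite");
```

Under this convention the expected value for a NaN input would be 0 rather than 1.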

`ID3D12ShaderReflectionVariable::GetType` always returns `NULL`

Functional impact

Applications that work with d3dcompiler_47.dll and rely on being able to perform reflection on the contents of constant buffers (e.g., to find the offsets of sub-fields) do not work properly with dxcompiler.dll and may crash with a NULL pointer dereference (as they will likely assume that all valid ID3D12ShaderReflectionVariables have valid types).

Minimal repro steps

Use D3DCompile() to compile a shader with at least one variable in a constant buffer
Use D3DReflect() to get an ID3D12ShaderReflection interface
Use ID3D12ShaderReflection::GetConstantBufferByName() to get reflection info for the constant buffer
Use ID3D12ShaderReflectionConstantBuffer::GetVariableByName() to get the reflection info for the variable
Call ID3D12ShaderReflectionVariable::GetType()

Expected result

The result of ID3D12ShaderReflectionVariable::GetType() should be non-NULL if the input was a valid variable.

Actual result

The result is NULL.

Further technical details

The root cause is pretty clear, arising from the following lines in DxilContainerReflection.cpp, inside CShaderReflectionConstantBuffer::Initialize():

// TODO: create reflection type.
CShaderReflectionType *pVarType = nullptr;

The variable pVarType never gets filled in, but is used in the initialization of the CShaderReflectionVariable:

Var.Initialize(this, &VarDesc, pVarType, pDefaultValue);

The basic problem is then that the feature marked as TODO hasn't been implemented.

I have an initial implementation of the feature in place, at least as far as required for my workload. I will be working toward turning it into a pull request some time this week, unless somebody else already has an implementation that is further along.

dxc compiles invalid program

The following program has a syntax error, but dxc compiles it anyway (it will assert in a debug build). It is missing an = after the A[] part.

float main(float i : I) : SV_Target {
    float A[] {
        1, 2, 3
    };
    return A[i];
}

FXC reports an error:

error X3000: syntax error: unexpected integer constant

Validation error: Pointer type bitcast must be have same size

I'm hitting a validation error with one of our shaders. It seems to be related to asfloat and dynamic branching. I reduced the repro to the following. Compile with /T ps_6_0.

uint uintConstant;

float4 main() : SV_Target
{
    float a[2];

    a[0] = asfloat(uintConstant);
    a[1] = 0;

    float b[2];

    [loop]  // Essential to repro
    for (int i = 0; i < 2; ++i)
    {
        b[i] = a[i];
    }

    return b[0];
}

Note that I don't have a workaround for this currently, so suggestions would be welcome (obviously in this case I can remove the [loop] but the actual shader uses "prefer flow control" and removing that would be highly invasive).

dxc implements mul(float3x3, float3) as mul(transpose(float3x3), float3)

Take the following simple repro case as an example:

static const float3x3 g_mat1 = {
    1, 2, 3,
    4, 5, 6,
    7, 8, 9,
};

void ps_main(float3 tex : TEXCOORD0, out float4 color : SV_Target) {
    color = float4(mul(g_mat1, tex), 0);
}

color.x should be computed as (1*tex.x + 2*tex.y + 3*tex.z). And this is exactly what fxc does:

dp3 o0.x, l(1.000000, 2.000000, 3.000000, 0.000000), v0.xyzx

Dxc, on the other hand, appears to transpose g_mat1 and then multiply:

  %0 = call float @dx.op.loadInput.f32(i32 4, i32 0, i32 0, i8 0, i32 undef)  ; LoadInput(inputSigId,rowIndex,colIndex,gsVertexAxis)
  %1 = call float @dx.op.loadInput.f32(i32 4, i32 0, i32 0, i8 1, i32 undef)  ; LoadInput(inputSigId,rowIndex,colIndex,gsVertexAxis)
  %2 = call float @dx.op.loadInput.f32(i32 4, i32 0, i32 0, i8 2, i32 undef)  ; LoadInput(inputSigId,rowIndex,colIndex,gsVertexAxis)
  %3 = fmul fast float %1, 4.000000e+00
  %4 = fadd fast float %3, %0
  %5 = fmul fast float %2, 7.000000e+00
  %6 = fadd fast float %4, %5

This computes (tex.y*4 + tex.x + tex.z*7).
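
To make the two results concrete, the following C++ sketch computes the first component both ways for the same matrix. The types `Float3` and `Float3x3`, the helper functions, and the input `tex = (1, 10, 100)` are hypothetical choices made for this illustration; none of this is dxc or fxc code.

```cpp
// Hypothetical helper types for illustration; not HLSL or dxc code.
struct Float3 { float x, y, z; };
struct Float3x3 { Float3 r0, r1, r2; };  // three rows, as written in g_mat1

constexpr float dot(Float3 a, Float3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// First component of mul(M, v): row 0 of M dotted with v.
constexpr float mul_x(Float3x3 m, Float3 v) {
    return dot(m.r0, v);
}

// First component of mul(transpose(M), v): column 0 of M dotted with v.
constexpr float mul_transposed_x(Float3x3 m, Float3 v) {
    return dot(Float3{m.r0.x, m.r1.x, m.r2.x}, v);
}

constexpr Float3x3 g_mat1 = {{1.0f, 2.0f, 3.0f},
                             {4.0f, 5.0f, 6.0f},
                             {7.0f, 8.0f, 9.0f}};
constexpr Float3 tex = {1.0f, 10.0f, 100.0f};  // arbitrary test input

static_assert(mul_x(g_mat1, tex) == 321.0f, "1*1 + 2*10 + 3*100");
static_assert(mul_transposed_x(g_mat1, tex) == 741.0f, "1*1 + 4*10 + 7*100");
```

The first value corresponds to the dp3 emitted by fxc, the second to the fmul/fadd sequence in the DXIL above.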

Assert compiling hull shader with patch constant function

I'm getting an assert when trying to compile a hull shader with a patch constant function specified: "must have function annotation for patch constant function". Here's a minimal repro case (compile with /T hs_6_0):

[domain("tri")]
[partitioning("fractional_odd")]
[outputtopology("triangle_cw")]
[outputcontrolpoints(3)]
[patchconstantfunc("hsConstantFunc")]
float4 main() : WORLDPOS
{
    return 0;
}

struct TessFactors
{ 
    float a[3] : SV_TessFactor;
    float i : SV_InsideTessFactor;
};
TessFactors hsConstantFunc()
{
    TessFactors output;
    output.a[0] = 0;
    output.a[1] = 0;
    output.a[2] = 0;
    output.i = 0;
    return output;
}

This same shader compiles successfully with fxc.

Sub-optimal buffer stores for partial vectors

Consider the following compute shader:

struct ElementTy {
  uint4 bar;
};

RWStructuredBuffer<ElementTy> g_Buffer : register(u0);

[numthreads(32, 1, 1)]
void cs_main(uint id : SV_DispatchThreadId) {
  g_Buffer[id].bar.w = 1;
}

In particular, only one scalar component of a vector is written.

The dxc front-end implements this by reading the full value of the vector, and then storing it back with one element replaced:

define void @cs_main() {
entry:
  %g_Buffer_UAV_structbuf = call %dx.types.Handle @dx.op.createHandle(i32 57, i8 1, i32 0, i32 0, i1 false)  ; CreateHandle(resourceClass,rangeId,index,nonUniformIndex)
  %0 = call i32 @dx.op.threadId.i32(i32 93, i32 0)  ; ThreadId(component)
  %BufferLoad = call %dx.types.ResRet.i32 @dx.op.bufferLoad.i32(i32 68, %dx.types.Handle %g_Buffer_UAV_structbuf, i32 %0, i32 0)  ; BufferLoad(srv,index,wot)
  %1 = extractvalue %dx.types.ResRet.i32 %BufferLoad, 0
  %2 = extractvalue %dx.types.ResRet.i32 %BufferLoad, 1
  %3 = extractvalue %dx.types.ResRet.i32 %BufferLoad, 2
  call void @dx.op.bufferStore.i32(i32 69, %dx.types.Handle %g_Buffer_UAV_structbuf, i32 %0, i32 0, i32 %1, i32 %2, i32 %3, i32 1, i8 15)  ; BufferStore(uav,coord0,coord1,value0,value1,value2,value3,mask)
  ret void
}

This is sub-optimal because it generates more memory traffic than necessary.

In contrast, the fxc front-end contains a single scalar buffer store:

cs_5_0
dcl_globalFlags refactoringAllowed
dcl_uav_structured u0, 16
dcl_input vThreadID.x
dcl_thread_group 32, 1, 1
store_structured u0.x, vThreadID.x, l(12), l(1)
ret
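
The constants in the fxc listing can be cross-checked with a small C++ sketch. The `ElementTy` struct below is a hypothetical C++ mirror of the HLSL struct, assuming `uint4` is laid out as four tightly packed 32-bit values. `bar_component_offset` is a helper invented for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical C++ mirror of the HLSL ElementTy struct (uint4 bar).
struct ElementTy {
    std::uint32_t bar[4];  // components x, y, z, w
};

// Byte offset of component `index` of bar within one element.
constexpr std::size_t bar_component_offset(std::size_t index) {
    return offsetof(ElementTy, bar) + index * sizeof(std::uint32_t);
}

static_assert(sizeof(ElementTy) == 16,
              "element stride, as in dcl_uav_structured u0, 16");
static_assert(bar_component_offset(3) == 12,
              "byte offset of bar.w, as in l(12) of store_structured");
```

So the fxc store touches only the 4 bytes at offset 12 of the element. The DXIL version passes a mask of 15 (binary 1111), which, assuming one mask bit per component, writes all 16 bytes after loading the element first.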

Target Android/iOS?

This seems to be focused on targeting Windows platforms, but since it uses LLVM to generate the IR, is it possible to configure the compiler to target other platforms?

P.S. New to shaders but lots of mobile app development experience. Please forgive me if I'm missing something conceptually.

export keyword causes builds to fail

Using the export keyword in a shader (like this simple vertex shader):

export float4 applySomething(float4 p)
{
return p * float4(sin(p.x),0,0,1);
}

void VS(float4 p : POSITION, out float4 ps : SV_Position)
{
ps = applySomething(p);
}

produces the following errors:

test.hlsl:1:1: error: 'export' is a reserved keyword in HLSL
export float4 applySomething(float4 p)
^
test.hlsl:8:7: error: use of undeclared identifier 'applySomething'
ps = applySomething(p);
^

As far as I know, libraries are not supported at the moment (will they actually be?), but that means any shader that was using this feature (or that uses a function marked as export somewhere in the codebase) will fail compilation.

If there's no plan to support the library feature anymore, should the export keyword just be silently ignored (possibly with a warning, as is now done with legacy annotations)?

Crash when returning 64-bit types

Compile with /Tps_6_0

[RootSignature("")]
double main() : SV_Target {
    return 1;
}

Crash exists for int64_t as well. We should emit an error message about an illegal return type like fxc does.

SampleCmpLevelZero not supported in CS

A compute shader using SampleCmpLevelZero won't compile with dxc, but it's supported by FXC:

// dxc /T cs_6_0 /E main test.hlsl
Texture2D<float> csInputTexture2D : register(t0);
RWTexture2D<float4> uavOutput : register(u0);
SamplerComparisonState samplerComparisonState : register(s6);

[numthreads(8, 8, 1)]
void main(uint3 dtID : SV_DispatchThreadID)
{
	float valX = csInputTexture2D.SampleCmpLevelZero(samplerComparisonState, float2(0.5, 0.5), 0.5);
	uavOutput[dtID.xy] = float4(valX, 0, 0, 1);
}

Crash: assigning function object to integer value

The following HLSL code is obviously invalid, but the compiler should still report a comprehensible error. Instead, the new compiler crashes, whereas the old fxc reports the simple error message error X3005: 'f': identifier represents a function, not a variable:

int f() {
    return 1;
}
void main() {
    int x = f; // CRASH
}

I found this issue by testing how the new compiler parses function call expressions.
So I used the following code example, which compiles successfully with the new compiler, while the old fxc reports the same error message as in the example above:

int f() {
    return 1;
}
void main() {
    ((f))();
}

So I assume that the new compiler handles call expressions by a separation of its function object (here f or ((f))) and the call itself with optional arguments (here ()).

Am I right with this assumption? Because I guess this comes from clang where function objects can be passed around without being used as call expressions.

Crash in Create*PipelineState with DXIL in recent insider builds

Since the release of the first insider build with SM6 support, a couple of new insider builds have been released.

I do not know the exact build that broke SM6 (maybe one or two after the original support), but I am now in a state where the call to EnableExperimental feature with the SM6 GUID succeeds, yet the WARP driver crashes if I send DXIL bytecode to Create*PipelineState. With the SM6 feature turned on, the call still succeeds if I send a DXBC shader.

Updated AMD,NV,Intel drivers info?

Hi,
Both AMD and Nvidia have released new driver branches.
Do the NV 381 drivers support DXIL 1.0 instead of 0.7?
Does AMD 17.4.1 (17.10 branch) include the fixes referred to by "A later driver should have fixes for the wave routines"?

I also installed recent Intel drivers on an HD530, and none of them has Shader Model 6.0 (DXIL) or WDDM 2.2 support.
Could you post information about Intel GPU drivers with DXIL support? Are they coming soon?

Thanks.