llvm-spirv-backend's Introduction

The LLVM Compiler Infrastructure

Welcome to the LLVM project!

This repository contains the source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.

The LLVM project has multiple components. The core of the project is itself called "LLVM". This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer.

C-like languages use the Clang frontend. This component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode -- and from there into object files, using LLVM.

Other components include: the libc++ C++ standard library, the LLD linker, and more.

Getting the Source Code and Building LLVM

Consult the Getting Started with LLVM page for information on building and running LLVM.

For information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.

Getting in touch

Join the LLVM Discourse forums, Discord chat, or #llvm IRC channel on OFTC.

The LLVM project has adopted a code of conduct for participants to all modes of communication within the project.

llvm-spirv-backend's People

Contributors

akyrtzi, arsenm, chandlerc, chapuni, d0k, ddunbar, douggregor, dwblaikie, echristo, eefriedman, espindola, fhahn, isanbard, jdevlieghere, klausler, labath, lattner, lebedevri, lhames, maskray, nico, nikic, rksimon, rnk, rotateright, rui314, stoklund, tkremenek, topperc, zygoloid

llvm-spirv-backend's Issues

Missed OpExecutionMode

OpExecutionMode is missing in 3 tests: ExecutionMode.ll, preprocess-metadata.ll, transcoding/ExecutionMode_SPIR_to_SPIRV.ll.

Support llvm arithmetic/bit intrinsics

We have 4 tests with unsupported LLVM arithmetic/bit intrinsics. Most of them can be implemented by substituting an OpExtInst * instruction or another single operation:

llvm-intrinsics/abs.ll (llvm.abs.* -> OpExtInst s_abs)
llvm-intrinsics/fp-intrinsics.ll (llvm.* -> OpExtInst *)
llvm-intrinsics/fmuladd.ll (llvm.fmuladd -> OpExtInst mad)
llvm-intrinsics/ctpop.ll (llvm.ctpop -> OpBitCount)
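
The substitutions listed above could be kept in a small lookup table. The following is a minimal sketch, not the backend's actual code: the `Lowering` struct, the `Lowerings` table and `findLowering` are hypothetical names, and only the three concrete mappings named in this issue are included.

```cpp
#include <array>
#include <optional>
#include <string_view>

// Hypothetical description of how one LLVM intrinsic family could be lowered.
struct Lowering {
  std::string_view IntrinsicPrefix; // e.g. "llvm.abs."
  std::string_view SpirvOpcode;     // "OpExtInst" or a core opcode
  std::string_view ExtInstName;     // OpenCL.std instruction; empty for core opcodes
};

// Only the mappings named in this issue are listed here.
constexpr std::array<Lowering, 3> Lowerings = {{
    {"llvm.abs.", "OpExtInst", "s_abs"},
    {"llvm.fmuladd.", "OpExtInst", "mad"},
    {"llvm.ctpop.", "OpBitCount", ""},
}};

// Returns the lowering whose prefix matches the intrinsic name, if any.
std::optional<Lowering> findLowering(std::string_view Name) {
  for (const Lowering &L : Lowerings)
    if (Name.substr(0, L.IntrinsicPrefix.size()) == L.IntrinsicPrefix)
      return L;
  return std::nullopt;
}
```

Matching on a prefix lets one entry cover every overload of an intrinsic (for example llvm.abs.i32 and llvm.abs.v4i32).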

infinite recursion in SPIRV type creation

In transcoding/RecursiveType.ll and layout.ll we have recursive type definitions like this:

%struct.A = type { i32, %struct.C }
%struct.C = type { i32, %struct.B }
%struct.B = type { i32, %struct.A addrspace(4)* }
%struct.Node = type { %struct.Node addrspace(1)*, i32 }

This causes infinite recursion in SPIRVTypeRegistry.cpp during SPIR-V type creation for the structs. Stack fragment:

 #35 0x0000560c317caa7d llvm::SPIRVTypeRegistry::getOrCreateSPIRVType(llvm::Type const*, llvm::MachineIRBuilder&, AccessQualifier::AccessQualifier) /export/users/idiyachk/spirv/LLVM-SPIRV-Backend-2/llvm/lib/Target/SPIRV/SPIRVTypeRegistry.cpp:391:70
 #36 0x0000560c317ca859 llvm::SPIRVTypeRegistry::createSPIRVType(llvm::Type const*, llvm::MachineIRBuilder&, AccessQualifier::AccessQualifier) /export/users/idiyachk/spirv/LLVM-SPIRV-Backend-2/llvm/lib/Target/SPIRV/SPIRVTypeRegistry.cpp:365:53
 #37 0x0000560c317caa7d llvm::SPIRVTypeRegistry::getOrCreateSPIRVType(llvm::Type const*, llvm::MachineIRBuilder&, AccessQualifier::AccessQualifier) /export/users/idiyachk/spirv/LLVM-SPIRV-Backend-2/llvm/lib/Target/SPIRV/SPIRVTypeRegistry.cpp:391:70
 #38 0x0000560c317c9c55 llvm::SPIRVTypeRegistry::getOpTypeStruct(llvm::StructType const*, llvm::MachineIRBuilder&) /export/users/idiyachk/spirv/LLVM-SPIRV-Backend-2/llvm/lib/Target/SPIRV/SPIRVTypeRegistry.cpp:254:39
 #39 0x0000560c317ca615 llvm::SPIRVTypeRegistry::createSPIRVType(llvm::Type const*, llvm::MachineIRBuilder&, AccessQualifier::AccessQualifier) /export/users/idiyachk/spirv/LLVM-SPIRV-Backend-2/llvm/lib/Target/SPIRV/SPIRVTypeRegistry.cpp:345:29
 #40 0x0000560c317caa7d llvm::SPIRVTypeRegistry::getOrCreateSPIRVType(llvm::Type const*, llvm::MachineIRBuilder&, AccessQualifier::AccessQualifier) /export/users/idiyachk/spirv/LLVM-SPIRV-Backend-2/llvm/lib/Target/SPIRV/SPIRVTypeRegistry.cpp:391:70
 #41 0x0000560c317c9c55 llvm::SPIRVTypeRegistry::getOpTypeStruct(llvm::StructType const*, llvm::MachineIRBuilder&) /export/users/idiyachk/spirv/LLVM-SPIRV-Backend-2/llvm/lib/Target/SPIRV/SPIRVTypeRegistry.cpp:254:39
 #42 0x0000560c317ca615 llvm::SPIRVTypeRegistry::createSPIRVType(llvm::Type const*, llvm::MachineIRBuilder&, AccessQualifier::AccessQualifier) /export/users/idiyachk/spirv/LLVM-SPIRV-Backend-2/llvm/lib/Target/SPIRV/SPIRVTypeRegistry.cpp:345:29
 #43 0x0000560c317caa7d llvm::SPIRVTypeRegistry::getOrCreateSPIRVType(llvm::Type const*, llvm::MachineIRBuilder&, AccessQualifier::AccessQualifier) /export/users/idiyachk/spirv/LLVM-SPIRV-Backend-2/llvm/lib/Target/SPIRV/SPIRVTypeRegistry.cpp:391:70
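
One possible way to break such a cycle is to reserve the result id of a type before recursing into its members, and to forward-declare a pointer whose pointee is still under construction (SPIR-V provides OpTypeForwardPointer for this). The sketch below is a self-contained toy model under those assumptions; `ToyType` and `ToyRegistry` are hypothetical and do not mirror the real SPIRVTypeRegistry API, and storage classes are omitted.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

enum class Kind { Int, Struct, Pointer };

// Toy stand-in for llvm::Type; hypothetical, not the backend's data model.
struct ToyType {
  Kind K;
  std::vector<const ToyType *> Elems; // struct members, or the single pointee
};

class ToyRegistry {
public:
  std::vector<std::string> Emitted; // textual "instructions" in emission order

  int getOrCreate(const ToyType *Ty) {
    auto It = Ids.find(Ty);
    if (It != Ids.end()) {
      // Cache hit. A pointer that is still being built gets a forward declaration.
      if (Ty->K == Kind::Pointer && InProgress.count(Ty))
        forwardDeclare(It->second);
      return It->second;
    }
    const int Id = NextId++;
    Ids[Ty] = Id; // reserve the id before recursing; this is what breaks the cycle
    InProgress.insert(Ty);
    ++Depth;
    if (Ty->K == Kind::Int) {
      Emitted.push_back(def(Id) + "OpTypeInt");
    } else if (Ty->K == Kind::Struct) {
      std::string Text = def(Id) + "OpTypeStruct";
      for (const ToyType *Member : Ty->Elems)
        Text += " " + ref(getOrCreate(Member));
      Emitted.push_back(Text);
    } else {
      const ToyType *Pointee = Ty->Elems.front();
      const std::string Text =
          def(Id) + "OpTypePointer " + ref(getOrCreate(Pointee));
      if (InProgress.count(Pointee)) {
        // The pointee is an unfinished ancestor: declare now, define later.
        forwardDeclare(Id);
        Deferred.push_back(Text);
      } else {
        Emitted.push_back(Text);
      }
    }
    InProgress.erase(Ty);
    if (--Depth == 0) { // outermost call: every struct is complete now
      Emitted.insert(Emitted.end(), Deferred.begin(), Deferred.end());
      Deferred.clear();
    }
    return Id;
  }

private:
  static std::string ref(int Id) { return "%" + std::to_string(Id); }
  static std::string def(int Id) { return ref(Id) + " = "; }
  void forwardDeclare(int Id) {
    if (Forwarded.insert(Id).second)
      Emitted.push_back("OpTypeForwardPointer " + ref(Id));
  }

  std::map<const ToyType *, int> Ids;
  std::set<const ToyType *> InProgress;
  std::set<int> Forwarded;
  std::vector<std::string> Deferred;
  int NextId = 1;
  int Depth = 0;
};
```

For the A/C/B cycle above, the toy registry terminates and defines the pointer to %struct.A only after the struct itself has been emitted.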

Missed OpenCL builtin functions

There are more than 30 failures in our testing that relate to builtin functions. The error message is "LLVM ERROR: Cannot translate OpenCL built-in func:"

__spirv_FOrdGreaterThanEqual(float, float) in FOrdGreaterThanEqual_bool.ll
void* __spirv_SampledImage<ocl_image1d_ro, void*>(ocl_image1d_ro, ocl_sampler) in SampledImageRetType.ll
isgreaterequal(float, float) in FOrdGreaterThanEqual_int.ll 
__spirv_Select(bool, int, int) in TruncToBool.ll
vstorea_half4_rtp(float vector[4], unsigned long, decimal16 AS1*) in half_no_extension.ll
foo(ocl_image2d_ro) in mangled_function.ll
opencl.queue_t in opencl.queue_t.ll
signed char vector[4] __spirv_IsNan<signed char vector[4], float vector[4]>(float vector[4]) in relationals.ll
__spirv_Select(bool, int, int) in select.ll
BitReverse(int) in transcoding/BitReversePref.ll
__spirv_CreatePipeFromPipeStorage_write(__spirv_PipeStorage const AS1*) in transcoding/CreatePipeFromPipeStorage.ll
all(long vector[2]) in transcoding/OpAllAny.ll
get_fence(void AS4*) in transcoding/OpGenericPtrMemSemantics.ll
work_group_all(int) in transcoding/OpGroupAllAny.ll
opencl.event_t in transcoding/OpGroupAsyncCopy.ll
get_image_array_size in transcoding/OpImageQuerySize.ll
atomic_cmpxchg in transcoding/OpenCL/atomic_cmpxchg.cl and in transcoding/memory_access.ll
atomic_work_item_fence in transcoding/OpenCL/atomic_work_item_fence.cl
isequal(float vector[8], float vector[8]) in transcoding/isequal.ll
get_image_channel_data_type in transcoding/image_channel.ll
isfinite(float) in transcoding/relationals_float.ll
isinf(double vector[2]) in transcoding/relationals_double.ll
isfinite(decimal16) in transcoding/relationals_half.ll
__spirv_SpecConstant(int, bool) in transcoding/spec_const.ll
__spirv_SampledImage(__spirv_Image__float_1_1_0_0_0_0_0 const AS1*, __spirv_Sampler const AS1*) in transcoding/spirv-types.ll
sub_group_non_uniform_broadcast(char, unsigned int) in transcoding/sub_group_ballot.ll
sub_group_clustered_reduce_add(char, unsigned int) in transcoding/sub_group_clustered_reduce.ll
sub_group_broadcast(char, unsigned int) in transcoding/sub_group_extended_types.ll
sub_group_non_uniform_reduce_add(char) in transcoding/sub_group_non_uniform_arithmetic.ll
sub_group_elect() in transcoding/sub_group_non_uniform_vote.ll
sub_group_shuffle(char, unsigned int) in transcoding/sub_group_shuffle.ll
sub_group_shuffle_up(char, unsigned int) in transcoding/sub_group_shuffle_relative.ll
atomic_flag_test_and_set in transcoding/atomic_flag.cl

I hope these failures will be fixed by the new OpenCL builtins implementation by Zain.

LLVM ERROR: Cannot generate OpenCL type: opencl.event_t

One lit test (transcoding/OpGroupAsyncCopy.ll) and at least 4 basic CTS tests (async_copy_local_to_global, async_strided_copy_global_to_local, async_strided_copy_local_to_global, async_copy_global_to_local) fail with:
LLVM ERROR: Cannot generate OpenCL type: opencl.event_t.

Do not keep "global" SPIR-V instructions in MachineFunctions

We can probably avoid keeping copies of "global" SPIR-V instructions in each MachineFunction, at least for some of them. To do this we need to implement a special cache for such "global" instructions in SPIRVGlobalRegister. For this implementation we need to find out:

  • how to print the instructions from the cache so that they appear in MachineFunction dumps,
  • what data structures should represent cached instructions (candidates so far are MCInst or a new custom class),
  • how to safely link cached instructions with other MachineInstrs in different MachineFunctions.

I think we can do this for OpDecorate and OpName instructions first, then also support Types/Consts/GVars/Funcs and others where applicable.

Assertion "Wrong MachineOperand accessor" failed (from SPIRVOpenCLBIFs.cpp)

There are 3 tests (transcoding/OpImageSampleExplicitLod.ll, transcoding/OpImageReadMS.ll, SamplerArgNonKernel.ll) that fail with:
const llvm::ConstantInt* llvm::MachineOperand::getCImm() const: Assertion `isCImm() && "Wrong MachineOperand accessor"' failed.

All three failures come from SPIRVOpenCLBIFs.cpp, so they may be fixed by the new OpenCL builtins implementation:

#10 0x000056284e9e1959 llvm::MachineOperand::getCImm() const llvm/include/llvm/CodeGen/MachineOperand.h:541:5
#11 0x000056284e9eeb79 getLiteralValueForConstant(llvm::Register, llvm::MachineRegisterInfo const*) llvm/lib/Target/SPIRV/SPIRVOpenCLBIFs.cpp:233:59
#12 0x000056284e9ef855 genSampledReadImage(llvm::MachineIRBuilder&, llvm::Register, llvm::MachineInstr const*, llvm::SmallVectorImpl<llvm::Register> const&, llvm::SPIRVTypeRegistry*) llvm/lib/Target/SPIRV/SPIRVOpenCLBIFs.cpp:424:47

double tracking of SampledImageType in spirv-types.ll

The issue was introduced in PR #47: SampledImageType can be registered in both the DT and SpecialTypesAndConstsMap maps (see the spirv-types.ll test). As a result, fillLocalAliasTables (SPIRVGlobalTypesAndRegNumPass.cpp) processes it twice.

The issue does not cause any failures in our tests, but the behavior is logically wrong and should be fixed.

Missed OpConstantComposite support

We have this issue in at least two OpenCL CTS tests: basic/progvar_func_scope, basic/progvar_prog_scope_uninit. LLVM reports this error: LLVM ERROR: unable to write nop sequence of 36 bytes.

This happens because our writeNopData() implementation always returns false. A simple solution would be to change it to return true. However, the real issue is that the backend does not properly handle constant initializers of composites. It just emits the data in ELF style like this (from basic/progvar_func_scope):

...
        OpName %15 "test_bump.persistent"
...
        %14 = OpConstant %2 15
        %15 = OpVariable %10 CrossWorkgroup
...
        .p2align        6                               ; @test_bump.persistent
test_bump.persistent:
        .byte   97                              ; 0x61
        .zero   63
        .long   0                               ; 0x0
        .long   1                               ; 0x1
        .long   2                               ; 0x2
        .long   3                               ; 0x3
        .long   4                               ; 0x4
        .long   5                               ; 0x5
        .long   6                               ; 0x6
        .long   7                               ; 0x7
        .long   8                               ; 0x8
        .long   9                               ; 0x9
        .long   10                              ; 0xa
        .long   11                              ; 0xb
        .long   12                              ; 0xc
        .long   13                              ; 0xd
        .long   14                              ; 0xe
        .long   0                               ; 0x0

We need to implement OpConstantComposite support, as the Translator does:

...
               OpName %test_bump_persistent "test_bump.persistent"
...
   %uchar_97 = OpConstant %uchar 97
     %uint_0 = OpConstant %uint 0
     %uint_1 = OpConstant %uint 1
     %uint_2 = OpConstant %uint 2
     %uint_3 = OpConstant %uint 3
     %uint_4 = OpConstant %uint 4
     %uint_5 = OpConstant %uint 5
     %uint_6 = OpConstant %uint 6
     %uint_7 = OpConstant %uint 7
     %uint_8 = OpConstant %uint 8
     %uint_9 = OpConstant %uint 9
    %uint_10 = OpConstant %uint 10
    %uint_11 = OpConstant %uint 11
    %uint_12 = OpConstant %uint 12
    %uint_13 = OpConstant %uint 13
    %uint_14 = OpConstant %uint 14
   %uchar_98 = OpConstant %uchar 98
   %uint_100 = OpConstant %uint 100
    %v16uint = OpTypeVector %uint 16
%struct_mystruct_t = OpTypeStruct %uchar %v16uint
...
        %21 = OpConstantComposite %v16uint %uint_0 %uint_1 %uint_2 %uint_3 %uint_4 %uint_5 %uint_6 %uint_7 %uint_8 %uint_9 %uint_10 %uint_11 %uint_12 %uint_13 %uint_14 %uint_0
        %23 = OpConstantComposite %struct_mystruct_t %uchar_97 %21
%test_bump_persistent = OpVariable %_ptr_CrossWorkgroup_struct_mystruct_t CrossWorkgroup %23

[SPIR-V] Broken translation of some struct types in opaque pointer mode

While working on patch D149679, I noticed several potential issues with generating struct types in opaque pointer mode.

The problem is visible in the block_w_struct_return.ll test. The SPIR-V backend generates the following code in opaque pointer mode:

	OpCapability Kernel
	OpCapability Int8
	OpCapability GenericPointer
	OpCapability Linkage
	OpExtension "SPV_KHR_no_integer_wrap_decoration"
	%1 = OpExtInstImport "OpenCL.std"
	OpMemoryModel Physical32 OpenCL
	OpEntryPoint Kernel %29 "block_ret_struct" %27 %25
	OpExecutionMode %29 ContractionOff
	OpExecutionMode %52 ContractionOff
	OpSource Unknown 0
	OpName %28 "res"
	OpName %29 "block_ret_struct"
	OpName %27 "__block_literal_global"
	OpName %30 "res.addr"
	OpName %31 "kernelBlock"
	OpName %32 "tid"
	OpName %33 "aa"
	OpName %34 "tmp"
	OpName %25 "__spirv_BuiltInGlobalInvocationId"
	OpName %37 "call"
	OpName %45 "sub"
	OpName %49 "agg.result"
	OpName %50 ".block_descriptor"
	OpName %51 "a"
	OpName %52 "__block_ret_struct_block_invoke"
	OpName %53 ".block_descriptor.addr"
	OpName %54 "block"
	OpDecorate %27 Constant
	OpDecorate %27 Alignment 4
	OpDecorate %25 Constant
	OpDecorate %25 LinkageAttributes "__spirv_BuiltInGlobalInvocationId" Import
	OpDecorate %25 BuiltIn GlobalInvocationId
	OpDecorate %45 NoSignedWrap
	OpDecorate %49 Alignment 4
	OpDecorate %49 FuncParamAttr NoAlias
	OpDecorate %51 Alignment 4
	%2 = OpTypeInt 8 0
	%3 = OpTypePointer CrossWorkgroup %2
	%4 = OpTypeVoid
	%5 = OpTypeFunction %4 %3
	%6 = OpTypeInt 32 0
	%7 = OpTypeVector %6 3
	%8 = OpTypePointer Input %2
	%9 = OpTypePointer Function %2
	%10 = OpTypePointer Generic %2
	%11 = OpTypeStruct %6 %6 %10
	%12 = OpTypeFunction %4 %9 %10 %9
	%13 = OpConstant %6 12 
	%14 = OpConstant %6 4 
	%15 = OpTypePointer Generic %2
	%16 = OpTypePointer Function %12
	%17 = OpConstantNull %16
	%18 = OpSpecConstantOp %15 121 %17
	%19 = OpConstantComposite %11 %13 %14 %18
	%20 = OpConstant %6 0 
	%21 = OpConstant %6 4294967295 
	%22 = OpConstant %6 5 
	%23 = OpConstant %6 6 
	%24 = OpTypePointer Function %8
	%25 = OpVariable %8 Input 
	%26 = OpTypePointer CrossWorkgroup %11
	%27 = OpVariable %26 CrossWorkgroup %19
	%29 = OpFunction %4 None %5             ; -- Begin function block_ret_struct
	%28 = OpFunctionParameter %3
	%58 = OpLabel
	%30 = OpVariable %9 Function 
	%31 = OpVariable %9 Function 
	%32 = OpVariable %9 Function 
	%33 = OpVariable %9 Function 
	%34 = OpVariable %9 Function 
	OpStore %30 %28 Aligned 4
	%35 = OpPtrCastToGeneric %10 %27
	OpStore %31 %35 Aligned 4
	%36 = OpLoad %7 %25 Aligned 1
	%37 = OpCompositeExtract %6 %36 0
	OpStore %32 %37 Aligned 4
	%38 = OpLoad %3 %30 Aligned 4
	%39 = OpLoad %6 %32 Aligned 4
	%40 = OpInBoundsPtrAccessChain %3 %38 %39
	OpStore %40 %21 Aligned 4
	%41 = OpInBoundsPtrAccessChain %9 %33 %20 %20
	OpStore %41 %22 Aligned 4
	%42 = OpFunctionCall %4 %52 %34 %35 %33
	%43 = OpInBoundsPtrAccessChain %9 %34 %20 %20
	%44 = OpLoad %6 %43 Aligned 4
	%45 = OpISub %6 %44 %23
	%46 = OpLoad %3 %30 Aligned 4
	%47 = OpLoad %6 %32 Aligned 4
	%48 = OpInBoundsPtrAccessChain %3 %46 %47
	OpStore %48 %45 Aligned 4
	OpReturn
	OpFunctionEnd
                                        ; -- End function
	%52 = OpFunction %4 None %12            ; -- Begin function __block_ret_struct_block_invoke
	%49 = OpFunctionParameter %9
	%50 = OpFunctionParameter %10
	%51 = OpFunctionParameter %9
	%59 = OpLabel
	%53 = OpVariable %9 Function 
	OpStore %53 %50 Aligned 4
	%54 = OpBitcast %10 %50
	%55 = OpInBoundsPtrAccessChain %9 %51 %20 %20
	OpStore %55 %23 Aligned 4
	%56 = OpBitcast %9 %49
	%57 = OpBitcast %9 %51
	OpCopyMemorySized %56 %57 %14 Aligned 4
	OpReturn
	OpFunctionEnd
                                        ; -- End function

While the Khronos SPIR-V Translator generates:

; SPIR-V
; Version: 1.4
; Generator: Khronos LLVM/SPIR-V Translator; 14
; Bound: 79
; Schema: 0
               OpCapability Addresses
               OpCapability Linkage
               OpCapability Kernel
               OpCapability GenericPointer
               OpCapability Int8
          %1 = OpExtInstImport "OpenCL.std"
               OpMemoryModel Physical32 OpenCL
               OpEntryPoint Kernel %75 "block_ret_struct" %__block_literal_global
               OpExecutionMode %75 ContractionOff
               OpSource Unknown 0
               OpName %__block_literal_global "__block_literal_global"
               OpName %_Z13get_global_idj "_Z13get_global_idj"
               OpName %block_ret_struct "block_ret_struct"
               OpName %res "res"
               OpName %entry "entry"
               OpName %res_addr "res.addr"
               OpName %kernelBlock "kernelBlock"
               OpName %tid "tid"
               OpName %struct_A "struct.A"
               OpName %aa "aa"
               OpName %tmp "tmp"
               OpName %call "call"
               OpName %arrayidx "arrayidx"
               OpName %a "a"
               OpName %__block_ret_struct_block_invoke "__block_ret_struct_block_invoke"
               OpName %agg_result "agg.result"
               OpName %_block_descriptor ".block_descriptor"
               OpName %a_0 "a"
               OpName %a1 "a1"
               OpName %sub "sub"
               OpName %arrayidx2 "arrayidx2"
               OpName %entry_0 "entry"
               OpName %_block_descriptor_addr ".block_descriptor.addr"
               OpName %block "block"
               OpName %a1_0 "a1"
               OpName %res_0 "res"
               OpDecorate %__block_literal_global Constant
               OpDecorate %__block_literal_global Alignment 4
               OpDecorate %_Z13get_global_idj LinkageAttributes "_Z13get_global_idj" Import
               OpDecorate %block_ret_struct LinkageAttributes "block_ret_struct" Export
               OpDecorate %res_addr Alignment 4
               OpDecorate %kernelBlock Alignment 4
               OpDecorate %tid Alignment 4
               OpDecorate %aa Alignment 4
               OpDecorate %tmp Alignment 4
               OpDecorate %agg_result FuncParamAttr NoAlias
               OpDecorate %agg_result FuncParamAttr Sret
               OpDecorate %agg_result Alignment 4
               OpDecorate %a_0 FuncParamAttr ByVal
               OpDecorate %a_0 Alignment 4
               OpDecorate %sub NoSignedWrap
               OpDecorate %_block_descriptor_addr Alignment 4
       %uint = OpTypeInt 32 0
      %uchar = OpTypeInt 8 0
    %uint_12 = OpConstant %uint 12
     %uint_4 = OpConstant %uint 4
     %uint_0 = OpConstant %uint 0
%uint_4294967295 = OpConstant %uint 4294967295
     %uint_5 = OpConstant %uint 5
     %uint_6 = OpConstant %uint 6
%_ptr_Generic_uchar = OpTypePointer Generic %uchar
  %_struct_8 = OpTypeStruct %uint %uint %_ptr_Generic_uchar
%_ptr_CrossWorkgroup__struct_8 = OpTypePointer CrossWorkgroup %_struct_8
         %12 = OpTypeFunction %uint %uint
       %void = OpTypeVoid
%_ptr_CrossWorkgroup_uchar = OpTypePointer CrossWorkgroup %uchar
         %17 = OpTypeFunction %void %_ptr_CrossWorkgroup_uchar
%_ptr_Function__ptr_CrossWorkgroup_uchar = OpTypePointer Function %_ptr_CrossWorkgroup_uchar
%_ptr_Function__ptr_Generic_uchar = OpTypePointer Function %_ptr_Generic_uchar
%_ptr_Function_uint = OpTypePointer Function %uint
   %struct_A = OpTypeStruct %uint
%_ptr_Function_struct_A = OpTypePointer Function %struct_A
%_ptr_Function_uchar = OpTypePointer Function %uchar
%_ptr_Generic__struct_8 = OpTypePointer Generic %_struct_8
%_ptr_Function__ptr_Generic__struct_8 = OpTypePointer Function %_ptr_Generic__struct_8
%_ptr_CrossWorkgroup_uint = OpTypePointer CrossWorkgroup %uint
%_ptr_Function__ptr_CrossWorkgroup_uint = OpTypePointer Function %_ptr_CrossWorkgroup_uint
         %51 = OpTypeFunction %void %_ptr_Function_struct_A %_ptr_Generic__struct_8 %_ptr_Function_struct_A
          %7 = OpConstantNull %_ptr_Generic_uchar
          %9 = OpConstantComposite %_struct_8 %uint_12 %uint_4 %7
%__block_literal_global = OpVariable %_ptr_CrossWorkgroup__struct_8 CrossWorkgroup %9
%_Z13get_global_idj = OpFunction %uint None %12
         %14 = OpFunctionParameter %uint
               OpFunctionEnd
%block_ret_struct = OpFunction %void None %17
        %res = OpFunctionParameter %_ptr_CrossWorkgroup_uchar
      %entry = OpLabel
   %res_addr = OpVariable %_ptr_Function__ptr_CrossWorkgroup_uchar Function
%kernelBlock = OpVariable %_ptr_Function__ptr_Generic_uchar Function
        %tid = OpVariable %_ptr_Function_uint Function
         %aa = OpVariable %_ptr_Function_struct_A Function
        %tmp = OpVariable %_ptr_Function_struct_A Function
         %31 = OpBitcast %_ptr_Function__ptr_CrossWorkgroup_uchar %res_addr
               OpStore %31 %res Aligned 4
         %33 = OpBitcast %_ptr_Function_uchar %kernelBlock
               OpLifetimeStart %33 4
         %35 = OpPtrCastToGeneric %_ptr_Generic__struct_8 %__block_literal_global
         %37 = OpBitcast %_ptr_Function__ptr_Generic__struct_8 %kernelBlock
               OpStore %37 %35 Aligned 4
         %38 = OpBitcast %_ptr_Function_uchar %tid
               OpLifetimeStart %38 4
       %call = OpFunctionCall %uint %_Z13get_global_idj %uint_0
               OpStore %tid %call Aligned 4
         %43 = OpBitcast %_ptr_Function__ptr_CrossWorkgroup_uint %res_addr
         %44 = OpLoad %_ptr_CrossWorkgroup_uint %43 Aligned 4
         %45 = OpLoad %uint %tid Aligned 4
   %arrayidx = OpInBoundsPtrAccessChain %_ptr_CrossWorkgroup_uint %44 %45
               OpStore %arrayidx %uint_4294967295 Aligned 4
         %48 = OpBitcast %_ptr_Function_uchar %aa
               OpLifetimeStart %48 4
          %a = OpInBoundsPtrAccessChain %_ptr_Function_uint %aa %uint_0 %uint_0
               OpStore %a %uint_5 Aligned 4
         %56 = OpFunctionCall %void %__block_ret_struct_block_invoke %tmp %35 %aa
         %a1 = OpInBoundsPtrAccessChain %_ptr_Function_uint %tmp %uint_0 %uint_0
         %58 = OpLoad %uint %a1 Aligned 4
        %sub = OpISub %uint %58 %uint_6
         %61 = OpBitcast %_ptr_Function__ptr_CrossWorkgroup_uint %res_addr
         %62 = OpLoad %_ptr_CrossWorkgroup_uint %61 Aligned 4
         %63 = OpLoad %uint %tid Aligned 4
  %arrayidx2 = OpInBoundsPtrAccessChain %_ptr_CrossWorkgroup_uint %62 %63
               OpStore %arrayidx2 %sub Aligned 4
         %65 = OpBitcast %_ptr_Function_uchar %aa
               OpLifetimeStop %65 4
         %66 = OpBitcast %_ptr_Function_uchar %tid
               OpLifetimeStop %66 4
         %67 = OpBitcast %_ptr_Function_uchar %kernelBlock
               OpLifetimeStop %67 4
               OpReturn
               OpFunctionEnd
%__block_ret_struct_block_invoke = OpFunction %void None %51
 %agg_result = OpFunctionParameter %_ptr_Function_struct_A
%_block_descriptor = OpFunctionParameter %_ptr_Generic__struct_8
        %a_0 = OpFunctionParameter %_ptr_Function_struct_A
    %entry_0 = OpLabel
%_block_descriptor_addr = OpVariable %_ptr_Function__ptr_Generic_uchar Function
         %70 = OpBitcast %_ptr_Function__ptr_Generic__struct_8 %_block_descriptor_addr
               OpStore %70 %_block_descriptor Aligned 4
      %block = OpBitcast %_ptr_Generic_uchar %_block_descriptor
       %a1_0 = OpInBoundsPtrAccessChain %_ptr_Function_uint %a_0 %uint_0 %uint_0
               OpStore %a1_0 %uint_6 Aligned 4
         %73 = OpBitcast %_ptr_Function_uchar %agg_result
         %74 = OpBitcast %_ptr_Function_uchar %a_0
               OpCopyMemorySized %73 %74 %uint_4 Aligned 4
               OpReturn
               OpFunctionEnd
         %75 = OpFunction %void None %17
      %res_0 = OpFunctionParameter %_ptr_CrossWorkgroup_uchar
         %77 = OpLabel
         %78 = OpFunctionCall %void %block_ret_struct %res_0
               OpReturn
               OpFunctionEnd

Notice the missing OpTypeStruct for %struct.A = type { i32 } in the backend output. The issue will need to be narrowed down. There are also big differences between the backend and the translator in typed pointer mode on this test case.

unreferenced global variables are not processed

If a global variable is not referenced by any function in a module, llc does not process it properly and does not generate the necessary SPIR-V data. A solution would be to add this processing to the SPIRVGlobalTypesAndRegNum pass. However, there are tests (capability-integers.ll, transcoding/image_with_access_qualifiers.ll) that need the SPIRVAddRequirements pass (which precedes the GlobalTypes... pass) to see their variables in order to output the correct OpCapability instructions.

Other tests may be affected too:

transcoding/OpVariable_Initializer.ll
transcoding/global-constant-expression.ll 
transcoding/PipeStorage.ll
link-attribute.ll
transcoding/undef_initializer.ll

LLVM ERROR: unable to translate instruction: call

After 906a4a8 (#52) we have 7 compilation failures with "unable to translate instruction: call" in these tests:

linkage-types.ll
layout.ll
llvm-intrinsics/instrprof.ll
llvm-intrinsics/constrained-comparison.ll
llvm-intrinsics/constrained-arithmetic.ll
llvm-intrinsics/constrained-convert.ll
transcoding/GlobalFunAnnotate.ll

and 4 more after fixing the "Broken function" issues (#57):

transcoding/annotate_attribute.ll
transcoding/spirv-private-array-initialization.ll
llvm-intrinsics/memcpy.align.ll
transcoding/spirv-private-array-initialization.ll

In transcoding/GlobalFunAnnotate.ll, IRTranslator fails to translate the call to llvm.spv.track.constant.a23i8.a23i8(...) because of a vector arg that "takes" more than one virtual register (llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp:2398). I think similar reasons cause the failures in the other tests.

Converting registers to `s32` for `ASSIGN_TYPE` interferes with instruction legality

In the instruction selector pass (in SPIRVTargetMachine.cpp), there is a loop that ensures the virtual registers for ASSIGN_TYPE are mapped to LLT::scalar(32):

  for (auto &MBB : MF) {
    for (auto &MI : MBB) {
      if (MI.getOpcode() == SPIRV::ASSIGN_TYPE) {
        auto &SrcOp = MI.getOperand(1);
        // Retype the ASSIGN_TYPE result to s32 when the defining opcode
        // supports type folding.
        if (isTypeFoldingSupported(MRI.getVRegDef(SrcOp.getReg())->getOpcode()))
          MRI.setType(MI.getOperand(0).getReg(), LLT::scalar(32));
      }
    }
  }

The problem is that instructions which were legal before this loop become illegal after it.

Example

Consider G_BITCAST.

%363:id(<8 x s16>) = G_BITCAST %151:anyid(<4 x s32>) is legal according to SPIRVLegalizerInfo. After the loop above, the first operand of every ASSIGN_TYPE is mapped to s32, so this instruction becomes
%363:id(<8 x s16>) = G_BITCAST %151:anyid(s32)
which is illegal because getSizeInBits for the target and the source are no longer equal (128 bits versus 32 bits).
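
The size mismatch can be checked with a few lines of arithmetic. This is a toy sketch: `ToyLLT` and `isLegalBitcast` are hypothetical stand-ins for illustration, not llvm::LLT or the real SPIRVLegalizerInfo rule.

```cpp
// Toy stand-in for llvm::LLT: a scalar is modeled as a one-element vector.
struct ToyLLT {
  unsigned NumElements; // 1 for scalars
  unsigned ScalarBits;  // width of one element in bits
  constexpr unsigned sizeInBits() const { return NumElements * ScalarBits; }
};

// Assumed legality rule for this sketch: a bitcast must preserve the total bit width.
constexpr bool isLegalBitcast(ToyLLT Dst, ToyLLT Src) {
  return Dst.sizeInBits() == Src.sizeInBits();
}

// <8 x s16> from <4 x s32>: 128 bits on both sides.
static_assert(isLegalBitcast(ToyLLT{8, 16}, ToyLLT{4, 32}), "sizes match");
// <8 x s16> from s32: 128 bits versus 32 bits.
static_assert(!isLegalBitcast(ToyLLT{8, 16}, ToyLLT{1, 32}), "sizes differ");
```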

Current solution

What I have been doing is relaxing the legalizer so that it only checks that the target conforms to the SPIR-V documentation. This is not an optimal way to deal with the problem, and it feels more like a hack than an actual fix.

Would it be possible to make TableGen not require the s32 LLT? Or can we use SPIR-V types instead of LLVM types in SPIRVLegalizerInfo?

Correct OpSpecConstantOp generation (CTS basic/progvar_prog_scope_init)

Compiling OpenCL_CTS_2021/basic/progvar_prog_scope_init/OCL_asm8433e5f712585356_before_spirv_backend.ll, which has this fragment:

@a_var = addrspace(1) global [2 x <2 x i8>] [<2 x i8> <i8 1, i8 1>, <2 x i8> <i8 1, i8 1>], align 2
@p_var = addrspace(1) global <2 x i8> addrspace(1)* bitcast (i8 addrspace(1)* getelementptr (i8, i8 addrspace(1)* getelementptr inbounds ([2 x <2 x i8>], [2 x <2 x i8>] addrspace(1)* @a_var, i32 0, i32 0, i32 0), i64 2) to <2 x i8> addrspace(1)*), align 8

I get the following output:

        %23 = OpVariable %9 CrossWorkgroup %31
        %24 = OpVariable %10 CrossWorkgroup %17
        %25 = OpVariable %8 CrossWorkgroup %16
        %26 = OpVariable %8 CrossWorkgroup %29
        %30 = OpConstant %2 0 
        %29 = OpConstantComposite %3 %30 %30
        %31 = OpSpecConstantOp %10 70 %24 %18 %19
        %35 = OpConstant %2 0 
        %34 = OpConstantComposite %3 %35 %35
        %36 = OpSpecConstantOp %10 70 %24 %18 %19
        %41 = OpConstant %2 0 
        %40 = OpConstantComposite %3 %41 %41
        %42 = OpSpecConstantOp %10 70 %24 %18 %19
        %63 = OpConstant %2 0 
        %62 = OpConstantComposite %3 %63 %63
        %64 = OpSpecConstantOp %10 70 %24 %18 %19

The SPIRV translator gives this output:

          %8 = OpConstantComposite %v2uchar %uchar_1 %uchar_1
...
         %13 = OpConstantComposite %_arr_v2uchar_ulong_2 %8 %8
      %a_var = OpVariable %_ptr_CrossWorkgroup__arr_v2uchar_ulong_2 CrossWorkgroup %13
         %19 = OpSpecConstantOp %_ptr_CrossWorkgroup_uchar PtrAccessChain %a_var %uint_0 %uint_0 %ulong_2
         %20 = OpSpecConstantOp %_ptr_CrossWorkgroup_v2uchar Bitcast %19
      %p_var = OpVariable %_ptr_CrossWorkgroup__ptr_CrossWorkgroup_v2uchar CrossWorkgroup %20

So we need to correct OpSpecConstantOp generation and avoid duplicating the instruction.
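
One way to avoid the duplication is to key each OpSpecConstantOp on its result type, opcode and operands, and to reuse the result id when the same key is requested again. The following is a hedged sketch of that idea only; `ToySpecConstOpCache` is a hypothetical class and plain integers stand in for SPIR-V ids.

```cpp
#include <cstddef>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical cache: identical (type, opcode, operands) requests share one id.
class ToySpecConstOpCache {
public:
  explicit ToySpecConstOpCache(unsigned FirstFreeId) : NextId(FirstFreeId) {}

  unsigned getOrCreate(unsigned ResultType, unsigned Opcode,
                       std::vector<unsigned> Operands) {
    Key K{ResultType, Opcode, std::move(Operands)};
    // try_emplace inserts only when the key is new; otherwise it returns
    // the existing entry, so no second instruction would be created.
    auto [It, Inserted] = Cache.try_emplace(std::move(K), NextId);
    if (Inserted)
      ++NextId;
    return It->second;
  }

  std::size_t size() const { return Cache.size(); }

private:
  using Key = std::tuple<unsigned, unsigned, std::vector<unsigned>>;
  std::map<Key, unsigned> Cache;
  unsigned NextId;
};
```

With such a cache, the four identical "OpSpecConstantOp %10 70 %24 %18 %19" requests in the output above would resolve to a single id.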

Assertion "Calling a function with a bad signature!" failed (from SPIRVPreTranslationLegalizer.cpp)

After 906a4a8 (#52), 5 tests fail on the `(i >= FTy->getNumParams() || FTy->getParamType(i) == Args[i]->getType()) && "Calling a function with a bad signature!"' assertion in llvm/lib/IR/Instructions.cpp:495:

llvm-intrinsics/umul.with.overflow.ll
transcoding/SpecConstantComposite.ll
AtomicCompareExchange.ll
instructions/nested-composites.ll
instructions/call-complex-function.ll

All of them originate from SPIRVPreTranslationLegalizer.cpp.

LLVM ERROR: unable to legalize instruction (G_BITREVERSE, G_STORE, G_UITOFP)

There are 4 tests that fail with "LLVM ERROR: unable to legalize instruction":

transcoding/OpBitReverse_v2i16.ll: unable to legalize instruction: %16:anyid(<2 x s16>) = G_BITREVERSE %3:anyid (in function: testBitRev)
transcoding/OpBitReverse_i32.ll: unable to legalize instruction: %11:anyid(s32) = G_BITREVERSE %2:anyid (in function: testBitRev)
transcoding/extract_insert_value.ll: unable to legalize instruction: G_STORE %17:id(s224), %11:id(p1) :: (store 28 into %ir.0, align 4, addrspace 1) (in function: array_test)
uitofp-with-bool.ll: unable to legalize instruction: %14:anyid(s32) = G_UITOFP %10:anyid(s1) (in function: K)

AtomicCompareExchangeWeak is deprecated after v1.3

The commit a4161be "[SPIR-V] Add missing OpAtomicCompareExchangeWeak instruction" caused a SPIR-V verifier regression on one LIT test, transcoding/AtomicCompareExchangeExplicit_cl20.ll, because OpAtomicCompareExchangeWeak is missing after version 1.3.

Missed OpCapability

OpCapability is missing in 10 tests:
OpCapability Int64Atomics in capability-Int64Atomics.ll and capability-Int64Atomics-store.ll
OpCapability Int8 in capability-integers.ll
OpCapability Float16Buffer in half_extension.ll
OpCapability SampledBuffer in image_dim.ll
OpCapability DeviceEnqueue in spirv.Queue.ll
OpCapability SubgroupDispatch in transcoding/ReqdSubgroupSize.ll
OpCapability SampledBuffer in transcoding/cl-types.ll
OpCapability ImageBasic in transcoding/image_with_access_qualifiers.ll
OpCapability Float64 in transcoding/optional-core-features-multiple.ll

Missed constant declaration after DT.add in IRTranslator

Compiling transcoding/spec_const.ll with OpSpecConstant supported, we get SPIR-V code with a missing constant declaration:

(no %19 = ...)
...
        %21 = OpConstant %6 1
...
        %45 = OpSelect %2 %43 %21 %19

The constant exists after the regular IRTranslator run, and it is added to DT in addConstantsToTrack. Then the declaration is deleted (in foldConstantsIntoIntrinsics/generateAssignInstrs), but the DT entry stays. Later, when building the 0 constant in ISel, we get a reference to a register with no definition.

Perhaps we need to add constants to DT after foldConstantsIntoIntrinsics/generateAssignInstrs or even leave it till ISel.
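The failure mode can be reproduced with a toy model. The sketch below is not the real SPIRVDuplicatesTracker: ConstTracker, findLive and the plain unsigned register type are hypothetical stand-ins. It shows a defensive alternative to the reordering proposed above, where a tracked register is reused only if it still has a definition.

```cpp
#include <cassert>
#include <map>
#include <set>

using Reg = unsigned; // stand-in for llvm::Register

// Hypothetical, simplified stand-in for the duplicates tracker (DT).
struct ConstTracker {
  std::map<long, Reg> Entries; // constant value -> register that defines it

  void add(long Value, Reg R) { Entries[Value] = R; }

  // Returns true and sets Out only if the tracked register is still defined.
  // A stale entry (its definition was deleted) is dropped instead of reused.
  bool findLive(long Value, const std::set<Reg> &Defined, Reg &Out) {
    auto It = Entries.find(Value);
    if (It == Entries.end())
      return false;
    if (Defined.count(It->second) == 0) {
      Entries.erase(It);
      return false;
    }
    Out = It->second;
    return true;
  }
};
```

In the spec_const.ll case above, the entry for the 0 constant would be found stale, so ISel would build a fresh definition instead of referencing %19.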

Getting started guide and other documentation

Hi, I read that the SPIRV target is being merged into LLVM trunk. I'd be interested in using it. Is there a getting started guide for compiler developers? I wrote a C++ compiler that already supports CUDA, and SPIR-V and DXIL shaders, and it makes a lot of sense for me to target SPIR-V compute if you think it's ready.

New failing lit tests / Rebase to the upcoming LLVM 14

The repository's default branch was switched to feature/spirv-backend-llvm14 which has all the SPIRV backend related changes rebased on top of the 3e2bd82f "Revert "[OptTable] Improve error message output for grouped short options"" commit from the LLVM trunk.

Most of the problems coming from the rebase were solved in the commits referenced below:

  1. db04b750 "Fixes after rebase: Fixes to the commit 68befe3"
  2. c4557ff2 "Fixes after rebase"
  3. f7bdd8bf "Fixes after rebase: Fixes to commit d811090"
  4. 5495374d "Fixes after rebase: Fixes to the commit d35e091"
  5. 1732c070 "Fixes after rebase: Fixes to the commit 7565b9b"
  6. d796e6c7 "Fixes after rebase: Fixes to the commit 5469c98"

Additionally, the following lit tests are failing due to the rebase and are still unresolved:

LLVM :: CodeGen/SPIRV/instructions/atomic.ll
LLVM :: CodeGen/SPIRV/instructions/atomic_acqrel.ll
LLVM :: CodeGen/SPIRV/instructions/atomic_seq.ll
LLVM :: CodeGen/SPIRV/instructions/fcmp.ll
LLVM :: CodeGen/SPIRV/instructions/float-casts.ll
LLVM :: CodeGen/SPIRV/instructions/icmp.ll
LLVM :: CodeGen/SPIRV/instructions/integer-casts.ll
LLVM :: CodeGen/SPIRV/instructions/intrinsics.ll
LLVM :: CodeGen/SPIRV/instructions/ptrcmp.ll
LLVM :: CodeGen/SPIRV/instructions/select.ll

UNREACHABLE executed at SPIRVGlobalTypesAndRegNumPass.cpp:253

After 906a4a8 (#52) 5 tests fail with UNREACHABLE executed at SPIRVGlobalTypesAndRegNumPass.cpp:253:

lshr-constexpr.ll
transcoding/relationals_half.ll
transcoding/relationals_float.ll
transcoding/relationals_double.ll
constant/local-vector-matrix-constants.ll

E.g. (in transcoding/relationals_float.ll) metaRegID is missing in LocalAliasTables for the 2nd operand of "%14:id = OpVectorShuffle %3:type, %118:id, %117:id, 0".

Missed OpSwitch

OpSwitch is missing in 4 tests:

transcoding/OpSwitchEmpty.ll
transcoding/OpSwitchChar.ll
transcoding/OpSwitch64.ll
transcoding/OpSwitch32.ll

Exception in hoistGlobalOps (SPIRVGlobalTypesAndRegNumPass.cpp)

There are 3 tests that terminate with this exception (terminate called after throwing an instance of 'std::out_of_range'): transcoding/KernelArgTypeInOpString2.ll, transcoding/builtin_vars_arithmetics.ll, transcoding/builtin_vars_opt.ll.
The out-of-range error comes from a map access:

...
#11 0x00007fe6e64bc3be std::__throw_out_of_range(char const*) (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0xa13be)
#12 0x000055cfab9c1ce3 std::map<llvm::Register, llvm::Register, std::less<llvm::Register>, std::allocator<std::pair<llvm::Register const, llvm::Register> > >::at(llvm::Register const&) /usr/include/c++/9/bits/stl_map.h:540:10
#13 0x000055cfab9bf87a void hoistGlobalOps<llvm::Type>(llvm::MachineIRBuilder&, llvm::SPIRVDuplicatesTracker<llvm::Type> const*, MetaBlockType, std::map<llvm::MachineFunction*, std::map<llvm::Register, llvm::Register, std::less<llvm::Register>, std::allocator<std::pair<llvm::Register const, llvm::Register> > >, std::less<llvm::MachineFunction*>, std::allocator<std::pair<llvm::MachineFunction* const, std::map<llvm::Register, llvm::Register, std::less<llvm::Register>, std::allocator<std::pair<llvm::Register const, llvm::Register> > > > > >&) llvm/lib/Target/SPIRV/SPIRVGlobalTypesAndRegNumPass.cpp:218:22
#14 0x000055cfab9bcb33 hoistInstrsToMetablock(llvm::Module&, llvm::MachineModuleInfo&, llvm::MachineIRBuilder&, std::map<llvm::MachineFunction*, std::map<llvm::Register, llvm::Register, std::less<llvm::Register>, std::allocator<std::pair<llvm::Register const, llvm::Register> > >, std::less<llvm::MachineFunction*>, std::allocator<std::pair<llvm::MachineFunction* const, std::map<llvm::Register, llvm::Register, std::less<llvm::Register>, std::allocator<std::pair<llvm::Register const, llvm::Register> > > > > >&, SPIRVRequirementHandler&) llvm/lib/Target/SPIRV/SPIRVGlobalTypesAndRegNumPass.cpp:327:27
#15 0x000055cfab9bf1fd (anonymous namespace)::SPIRVGlobalTypesAndRegNum::runOnModule(llvm::Module&) llvm/lib/Target/SPIRV/SPIRVGlobalTypesAndRegNumPass.cpp:725:26
#16 0x000055cfac67fa01 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) llvm/lib/IR/LegacyPassManager.cpp:1702:20
...
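The trace shows std::map::at throwing because the key is absent. A minimal sketch of a non-throwing lookup follows; lookupAlias and the unsigned register type are hypothetical stand-ins, and how the real pass should handle a missing alias is left open here.

```cpp
#include <cassert>
#include <map>

using Reg = unsigned; // stand-in for llvm::Register

// Hypothetical helper: returns a pointer to the mapped register,
// or nullptr when Local has no alias, instead of throwing like map::at.
const Reg *lookupAlias(const std::map<Reg, Reg> &Aliases, Reg Local) {
  auto It = Aliases.find(Local);
  return It == Aliases.end() ? nullptr : &It->second;
}
```

A caller can then report which register is missing from the alias table rather than aborting with an uncaught exception.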

Global vars in modules without functions may be incorrectly processed

We introduced an empty kernel test_atomic_fn() in capability-integers.ll to force processing of global variables and make the test pass, since modules without functions are not processed properly (see #66). We probably need to add a fake function automatically for each module without functions.

Another test without functions is transcoding/global-constant-expression.ll. By the way, if I add an empty function to it, its compilation fails with "Assertion `!hasDefs || resType || I.getOpcode() == TargetOpcode::G_GLOBAL_VALUE' failed" in SPIRVInstructionSelector.cpp:282. The opcode is actually G_PTR_ADD, which we get after IRTranslator:

  %10:_(p1) = G_PTR_ADD %8:_, %11:anyid(s32)
  %12:_(p1) = COPY %8:_(p1)
  %14:_(p1) = G_GLOBAL_VALUE @k_var
  %22:anyid(s32) = G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.spv.const.composite), %4:anyid(s8), %4:anyid(s8)
  %23:type(s32) = OpTypeArray %16:type(s32), %18:id(s32)
  %6:anyid(s32) = ASSIGN_TYPE %22:anyid(s32), %23:type(s32)
  G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.spv.init.global), %8:_(p1), %6:anyid(s32)
  %15:anyid(s32) = G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.spv.const.composite), %10:_(p1), %12:_(p1)

OpVectorExtractDynamic usage instead of OpCompositeExtract

Using OpVectorExtractDynamic instead of OpCompositeExtract breaks the transcoding/sub_group_ballot.ll test after all the missing builtins were added. For instance, we get:

        %5 = OpTypeInt 32 0
...
        %53 = OpConstant %5 0
...
        %136 = OpGroupNonUniformBroadcast %6 %54 %135 %53
        %137 = OpVectorExtractDynamic %4 %136 %53
        %138 = OpGroupNonUniformBroadcastFirst %4 %54 %137
        OpReturn

but the following code is expected:

        %136 = OpGroupNonUniformBroadcast %6 %54 %135 %53
        %137 = OpCompositeExtract %4 %136 0
        %138 = OpGroupNonUniformBroadcastFirst %4 %54 %137
        OpReturn
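The difference between the two outputs comes down to whether the index is a compile-time constant. A rough sketch of that choice, working on text only: the Index struct and emitExtract are hypothetical names used for illustration and do not reflect how the instruction selector is organized.

```cpp
#include <cassert>
#include <string>

// Hypothetical description of an extract index.
struct Index {
  bool IsConstant;  // true when the index value is known at compile time
  unsigned Literal; // the constant value, meaningful only if IsConstant
  std::string Id;   // the <id> that holds the index, e.g. "%53"
};

// Emits OpCompositeExtract with a literal index when the index is constant,
// and OpVectorExtractDynamic with an <id> operand otherwise.
std::string emitExtract(const std::string &ResultTy, const std::string &Vec,
                        const Index &Idx) {
  if (Idx.IsConstant)
    return "OpCompositeExtract " + ResultTy + " " + Vec + " " +
           std::to_string(Idx.Literal);
  return "OpVectorExtractDynamic " + ResultTy + " " + Vec + " " + Idx.Id;
}
```

For the example above, where %53 is the constant 0, this yields "OpCompositeExtract %4 %136 0".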

Missed OpDecorate

OpDecorate is missing in 3 tests:
OpDecorate %[[PTR_ID]] MaxByteOffset 12 in transcoding/DecorationMaxByteOffset.ll
OpDecorate %[[GlobalInvocationId:[0-9]+]] BuiltIn GlobalInvocationId and others in transcoding/builtin_vars_arithmetics.ll
OpDecorate %[[#SG_MaxSize_BI:]] BuiltIn SubgroupMaxSize and others in transcoding/builtin_vars_opt.ll

Support llvm memory intrinsics

We have 5 tests with unsupported llvm memory intrinsics. The translator implements llvm.memcpy using the OpCopyMemorySized memory instruction. For the llvm.memmove intrinsic the translator also uses OpCopyMemorySized when the memory has a constant size (test memmove.ll), but it inlines the intrinsic as a loop with OpLoad/OpStore instructions when the size is not known at compile time, i.e. passed as a function argument (test dynamic-memmove.ll). For llvm.memset it uses either OpCopyMemorySized or, when the initialization value is not a constant (again passed as a function argument), generates new functions with loops containing OpStore instructions.

llvm-intrinsics/memcpy.align.ll (llvm.memcpy* -> OpCopyMemorySized)
transcoding/spirv-private-array-initialization.ll  (llvm.memcpy* -> OpCopyMemorySized)
llvm-intrinsics/memmove.ll (llvm.memmove* -> OpCopyMemorySized)
llvm-intrinsics/dynamic-memmove.ll (llvm.memmove* -> inlined loops with OpLoad/OpStore)
llvm-intrinsics/memset.ll (llvm.memset*  -> OpCopyMemorySized or OpFunction)

I think we should start with a simple llvm.memcpy* to OpCopyMemorySized conversion, then support the loop inlining. The generation of new functions is used in the translator to support some other intrinsics (llvm.usub.sat..., llvm.fshl...), and perhaps can be implemented in the backend. But I'm not sure for now that we really need to support both inlining and the generation of new functions for llvm intrinsics support in our project.
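The translator behaviour described above can be summarized as a small decision function. This is only a sketch of the description in this issue: the enum and function names are hypothetical, and it does not claim to match the translator's actual code.

```cpp
#include <cassert>

enum class Intrinsic { MemCpy, MemMove, MemSet };
enum class Lowering { CopyMemorySized, InlineLoop, HelperFunction };

// Hypothetical summary of the lowering choices described in this issue.
Lowering chooseLowering(Intrinsic I, bool SizeIsConstant, bool ValueIsConstant) {
  switch (I) {
  case Intrinsic::MemCpy:
    // llvm.memcpy* -> OpCopyMemorySized
    return Lowering::CopyMemorySized;
  case Intrinsic::MemMove:
    // Constant size -> OpCopyMemorySized, otherwise an OpLoad/OpStore loop.
    return SizeIsConstant ? Lowering::CopyMemorySized : Lowering::InlineLoop;
  case Intrinsic::MemSet:
    // Constant value -> OpCopyMemorySized, otherwise a generated function.
    return ValueIsConstant ? Lowering::CopyMemorySized
                           : Lowering::HelperFunction;
  }
  return Lowering::CopyMemorySized; // unreachable for valid enum values
}
```

Starting with only the CopyMemorySized branch would cover memcpy.align.ll, spirv-private-array-initialization.ll and memmove.ll from the list above.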

Generic issues

This is a list of tasks that it would be nice to complete before the corresponding parts of our codebase go to LLVM repository:

  1. Remove unnecessary files, classes, definitions, includes, variables, comments...
  2. Correct coding style where necessary.
    I ran clang-format and capitalized the names of variables; additional changes may be required.
  3. Move files/code to more appropriate places, give more suitable names for some files/classes/functions.
    SPIRVInstPrinter* was moved to MCTargetDesc. I'm going to rename SPIRVStrings* to SPIRVUtils* (and move some more utility code to it) and SPIRVTypeRegistry* to SPIRVGlobalRegistry* so the last one will officially support Types, Constants and Global Variables.
  4. If possible, get rid of exotic solutions in the implementation.
    We removed the abnormal changes in the target-independent part. However, there are some weird things inside the SPIRV-specific code. Currently I suppose we should avoid implementing these classes: SPIRVIRTranslator and SPIRVInstructionSelect. The functionality of the first one may be moved to a separate pass between IRTranslator and Legalizer (e.g. in addPreLegalizeMachineIR). The second one can probably be avoided by simplifying RegisterBanks.

Constant instructions and global variables in function bodies

There are tests with OpConstant and OpVariable in function bodies, e.g.:

AtomicCompareExchange_cl20.ll (OpConstant)
opencl/get_global_id.ll (OpConstant, OpVariable Input)
transcoding/OpConstantSampler.ll (OpConstant)
image.ll (OpConstant, OpConstantSampler, OpVariable Input)

According to item 9 in 2.4 Logical Layout of a Module (SPIR-V Specification Version 1.5, Revision 5) they have to be placed before all function declarations.
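A minimal sketch of the required hoisting on a toy instruction list follows. The Inst struct and both functions are hypothetical, and the rule used here (constants and non-Function-storage OpVariable are module-scope) is a simplification made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy instruction: opcode plus storage class (for OpVariable).
struct Inst {
  std::string Opcode;
  std::string StorageClass;
};

// Simplified rule: OpConstant* and non-Function OpVariable are module-scope.
bool isModuleScope(const Inst &I) {
  if (I.Opcode.rfind("OpConstant", 0) == 0) // opcode starts with "OpConstant"
    return true;
  return I.Opcode == "OpVariable" && I.StorageClass != "Function";
}

// Moves module-scope instructions from Body to the end of Globals,
// preserving the relative order of both groups.
void hoistGlobals(std::vector<Inst> &Body, std::vector<Inst> &Globals) {
  auto Mid = std::stable_partition(
      Body.begin(), Body.end(),
      [](const Inst &I) { return !isModuleScope(I); });
  Globals.insert(Globals.end(), Mid, Body.end());
  Body.erase(Mid, Body.end());
}
```

Function-local variables (storage class Function) stay in the body, while OpConstant, OpConstantSampler and OpVariable Input end up in the global section.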

Entrypoint wrapper and Function with LinkageAttributes

The translator translates LLVM spir_kernel into an entrypoint wrapper and a function with LinkageAttributes. In entry_point_func.ll test we have:

define spir_kernel void @testfunction() {
   ret void
}

and after the translator:

...
               OpEntryPoint Kernel %6 "testfunction"
...
               OpName %testfunction "testfunction"
               OpDecorate %testfunction LinkageAttributes "testfunction" Export
       %void = OpTypeVoid
          %3 = OpTypeFunction %void
%testfunction = OpFunction %void None %3
          %5 = OpLabel
               OpReturn
               OpFunctionEnd
          %6 = OpFunction %void None %3
          %7 = OpLabel
          %8 = OpFunctionCall %void %testfunction
               OpReturn
               OpFunctionEnd

We probably need to do the same.

Assertion `OldReg == R' failed in SPIRVDuplicatesTracker.h

6 tests fail after 906a4a8 (#52) on the assertion `OldReg == R' in SPIRVDuplicatesTracker.h:

transcoding/isequal.ll
transcoding/OpAllAny.ll
transcoding/OpImageQuerySize.ll
transcoding/check_ro_qualifier.ll
transcoding/OpImageSampleExplicitLod.ll
opencl/image.ll

+2 after #57 :

transcoding/builtin_vars_opt.ll
transcoding/OpImageReadMS.ll

Support opaque pointers

LLVM is switching to opaque pointers and we need to support this in the backend. The general idea is to avoid getPointerElementType() and use getNonOpaquePointerElementType() instead (https://reviews.llvm.org/D116464#inline-1188188). Currently we'll convert LLVM's opaque pointers to SPIRV's i8* pointers (or to another type, it can be customized) until we have explicit support for opaque pointers in the new SPIRV specification.

Related issue in the SPIRV translator's repository: KhronosGroup/SPIRV-LLVM-Translator#1444.

Assertion "!hasDefs || resType" failed in SPIRVInstructionSelector.cpp

There are 3 tests (transcoding/OpSwitch32.ll, transcoding/OpSwitchChar.ll, transcoding/OpSwitch64.ll) that fail with:

virtual bool {anonymous}::SPIRVInstructionSelector::select(llvm::MachineInstr&): Assertion '!hasDefs || resType' failed.

#10 0x0000563e95e2961a (anonymous namespace)::SPIRVInstructionSelector::select(llvm::MachineInstr&) llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp:249:16
#11 0x0000563e977feb24 llvm::InstructionSelect::runOnMachineFunction(llvm::MachineFunction&) llvm/lib/CodeGen/GlobalISel/InstructionSelect.cpp:136:11
#12 0x0000563e95d8b7c8 (anonymous namespace)::SPIRVInstructionSelect::runOnMachineFunction(llvm::MachineFunction&) llvm/lib/Target/SPIRV/SPIRVTargetMachine.cpp:232:59
#13 0x0000563e9651febb llvm::MachineFunctionPass::runOnFunction(llvm::Function&) llvm/lib/CodeGen/MachineFunctionPass.cpp:73:33
