cloc's People

Contributors

alienanthill, ashwinma, bensander, facao, jedwards-amd, scchan, srp1970

cloc's Issues

Using CLOC to compile OpenCL kernels to ROCm platform with Fiji GPU

Is it possible to use CLOC to compile OpenCL kernels that can be used with either a ROCm platform (with an AMD Fiji GPU) or with the older AMD Catalyst drivers on Linux?

Before learning about CLOC, I was thinking of compiling OpenCL kernels with Clang/LLVM and the libclc (http://libclc.llvm.org/) library.
But I more or less gave up on that idea after multiple attempts, as I could not find a way to use the GCN assembly generated by the Clang/LLVM AMD GCN code generator to assemble a binary that could be loaded by the OpenCL function clCreateProgramWithBinary.

Thanks in advance!

How to make pass-by-value struct in the kernel code work in CLOC 1.0.x?

Hi developer,

With the new CLOC toolchain, I am struggling to get a kernel that takes a struct as an argument to work.

Consider the following simple kernel code

typedef struct SimpleStruct
{
    int a;
    int b;
} SimpleStruct;

__kernel void struct_copy(SimpleStruct a, global SimpleStruct *b)
{
    int i = get_global_id(0);
    b[i] = a;
}

The host side may have the following kernel argument structure

typedef struct __attribute__((aligned(16))) {
    int a;
    int b;
} SimpleStruct;

typedef struct __attribute__((aligned(16))) _KernelArg {
#ifdef BRIG
    uint64_t global_offset_0;
    uint64_t global_offset_1;
    uint64_t global_offset_2;
    uint64_t printf_buffer;
    uint64_t vqueue_pointer;
    uint64_t aqlwrap_pointer;
#endif
    SimpleStruct *a;
    SimpleStruct *b;
} KernelArg;

In the older hlc 3.2, after studying the semantics of the HSAIL code, I put the pass-by-value struct in global memory and successfully copied its content to b on the GPU. It works.

However, the approach that worked with the older toolchain no longer works with the new one. Maybe the assumption I made for the older toolchain has changed. The content of b isn't correct. How do I set the content of a now? How do I make such cases work in the CLOC toolchain?

Any advice or hint would be appreciated. Thanks.
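
One way to narrow this down is to compare the two candidate host-side layouts for the first argument: a pointer to the struct (as in the host code above) versus the struct's bytes stored directly in the argument block. The following is a minimal sketch under stated assumptions: the names KernelArgByValue, KernelArgByPointer and pack_by_value are hypothetical, the BRIG-only hidden arguments are left out, and whether CLOC 1.0.x actually expects the by-value layout is not confirmed anywhere in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same fields as the kernel-side struct: two ints, no extra alignment. */
typedef struct {
    int a;
    int b;
} SimpleStruct;

/* Candidate 1 (hypothetical): the struct bytes sit directly in the argument block. */
typedef struct {
    SimpleStruct a;   /* passed by value */
    SimpleStruct *b;  /* pointer to the output buffer */
} KernelArgByValue;

/* Candidate 2: a pointer to the struct, as in the host code of this report. */
typedef struct {
    SimpleStruct *a;
    SimpleStruct *b;
} KernelArgByPointer;

/* Copies a by-value argument block into a raw buffer; returns the bytes written. */
static size_t pack_by_value(unsigned char *kernarg, size_t capacity,
                            SimpleStruct a, SimpleStruct *b)
{
    KernelArgByValue args;
    assert(kernarg != NULL && capacity >= sizeof args);
    memset(&args, 0, sizeof args);   /* zero any padding bytes */
    args.a = a;
    args.b = b;
    memcpy(kernarg, &args, sizeof args);
    return sizeof args;
}
```

Printing offsetof(KernelArgByValue, b) and offsetof(KernelArgByPointer, b) next to the argument offsets in the generated kernel code would show which layout, if either, matches.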

Is CLOC open source or will it be?

I would really like to have a look at the source.
If that is not realistic, is there another sample project for a language-to-HSAIL compiler?

snack helloworld example generate .snackwrap.c even though compiling with -c option

Hi.
A few weeks ago, I bought a Kaveri A10-7850K from AMD, which supports HSA, and built a full computer around it with DRAM, SSD, etc. I'm trying to compare the performance of HSA mode and non-HSA mode on the Kaveri.
First, I want to compile the OpenCL example in HSA mode. I installed the HSA runtime, the driver, and CLOC. To compile the helloworld example in the snack directory, I ran 'snack.sh -c hw.cl' on hw.cl. However, many lines of errors appear, such as
/tmp/snk_17760/hw.snackwrap.c:148:40: error: unknown type name 'hsa_ext_brig_module_t'
hsa_ext_brig_module_t* brig_module,
^
Even though I used the -c option, it generates .snackwrap.c. What should I do to resolve these errors?
Also, does anyone have an idea how to compare the performance of HSA mode and non-HSA mode using OpenCL code?

Segmentation fault when compiling this OpenCL kernel

I am encountering a segmentation fault when compiling this OpenCL kernel to Brig or Hsail text:
__kernel void test_kernel(__global int* pVal) {
    pVal[1] = 123;
    if (pVal[2] == 0) {
        while (pVal[0] == 0) {
            pVal[1] = 123;
        }
    }
    pVal[1] = 321;
}

Here is the cloc.sh compilation console output:
0 llc 0x0000000000b1a3af
1 llc 0x0000000000b1a724
2 libpthread.so.0 0x00007f8b73258340
3 llc 0x0000000000483c70
4 llc 0x0000000000ac2e4f
5 llc 0x0000000000ac3243
6 llc 0x0000000000ac34ff
7 llc 0x0000000000ac363c
8 llc 0x000000000042bf76
9 libc.so.6 0x00007f8b72ea1ec5 __libc_start_main + 245
10 llc 0x0000000000438861
Stack dump:
0. Program arguments: /opt/amd/cloc/bin/llc -O2 -march=hsail-64 -filetype=asm -o /tmp/cloc4242/trap.hsail /tmp/cloc4242/trap.opt.bc
1. Running pass 'Function Pass Manager' on module '/tmp/cloc4242/trap.opt.bc'.
2. Running pass 'HSAIL control dependency analysis' on function '@__OpenCL_test_kernel_kernel'
/opt/amd/cloc/bin/cloc.sh: line 333: 4262 Segmentation fault (core dumped) $HSA_LLVM_PATH/$CMD_LLC -o $TMPDIR/$FNAME.hsail $TMPDIR/$FNAME.opt.bc
ERROR: The following command failed with return code 139.
/opt/amd/cloc/bin/llc -O2 -march=hsail-64 -filetype=asm -o /tmp/cloc4242/trap.hsail /tmp/cloc4242/trap.opt.bc

Note that if the while loop is empty, or if the if condition is removed, the cloc.sh compilation runs fine.

Error when using "-g -hsail" option on OpenCL kernel

Hi,
I get an error message when I use cloc.sh with the "-g" flag to get HSAIL text from an OpenCL kernel. For example, for this kernel:

__kernel void trap()
{}

using command:

cloc.sh -hsail -g trap.cl

Then I get this message

#Info:  Version:    cloc.sh 0.9.8
#Info:  Input file: /home/user/workspace/CLWorkSpace/trap.cl
#Info:  Brig file:  /home/user/workspace/CLWorkSpace/trap.brig
#Info:  HSAIL file: /home/user/workspace/CLWorkSpace/trap.hsail
#Info:  Run date:   Mon Dec  7 11:54:20 EST 2015
#Info:  LLVM path:  /opt/amd/cloc/bin
#Step:  Compile(clc2)   cl --> bc ...
#Step:  Link(llvm-link) bc --> lnkd.bc ...
#Step:  Optimize(opt)   lnkd.bc --> opt.bc -O2 ...
#Step:  llc arch=hsail  opt.bc --> hsail ...
#Step:  HSAILasm    hsail --> trap.brig -O2 ...
Using libHSAIL options (append from env variable LIBHSAIL_OPTIONS_APPEND): '-o /home/user/workspace/CLWorkSpace/trap.brig /tmp/cloc5129/trap.hsail -g -include-source'
#Step:  HSAILasm    brig --> trap.hsail ...
Using libHSAIL options (append from env variable LIBHSAIL_OPTIONS_APPEND): '-disassemble -o /home/user/workspace/CLWorkSpace/trap.hsail /home/user/workspace/CLWorkSpace/trap.brig -g -include-source'
Invalid section size: must be a multiple of 4
ERROR:  The following command failed with return code 1.
        HSAILasm -disassemble -o /home/user/workspace/CLWorkSpace/trap.hsail /home/user/workspace/CLWorkSpace/trap.brig

No hsail file is generated.

how to disable xnack

Hi,
I'm trying to run AMDOX on V1000 (RV) and get an error:

MIOpen(OpenCL): Warning [FindRecordUnsafe] File is unreadable: /opt/rocm/miopen/share/miopen/db/gfx902-xnack_11.cd.pdb.txt
'gfx902-xnack' is not a recognized processor for this target (ignoring processor)
'gfx902-xnack' is not a recognized processor for this target (ignoring processor)
:398:5: error: Only Gfx8 and Gfx9 ISA is supported
.error "Only Gfx8 and Gfx9 ISA is supported"
^
:1038:1: error: unmatched .ifs or .elses

Any idea how to disable xnack on RV/V1000?

Thanks in advance
Best Regards
Mazda

CLOC cannot generate *.brig file.

__kernel void vec_assign(__global int *local_out)
{
    int id = get_global_id(0);
    local_out[id] = 1;
}

I am using the HLC development version.

Manual installation

Where should I place the binaries when manually installing CLOC? I'm using neither an apt-based nor an rpm-based distribution.

CLOC 1.0.14 gives wrong answer on OpenCL builtin log function with -noshared options

Hi developers,

The OpenCL log function gives wrong answers when the kernel code is compiled with the -noshared option. The same example works with CLOC version 1.0.10.

How to reproduce: compile with different CLOC versions, with and without the -noshared flag.

Expected output:
[Use CLOC 1.0.10. compiler option: cloc.sh vector_copy_codeobject.cl]

Using <Carrizo>
found device region, flags=3
in[i]: 1.0000000. log(in[i]): 0.0000000. out[i]: 0.0000000
in[i]: 2.0000000. log(in[i]): 0.6931472. out[i]: 0.6931472
in[i]: 3.0000000. log(in[i]): 1.0986123. out[i]: 1.0986123
in[i]: 4.0000000. log(in[i]): 1.3862944. out[i]: 1.3862944

[Use the latest CLOC 1.0.14, cloc.sh -noshared vector_copy_codeobject.cl]

Using <Carrizo>
found device region, flags=3
in[i]: 1.0000000. log(in[i]): 0.0000000. out[i]: 0.0000000
in[i]: 2.0000000. log(in[i]): 0.6931472. out[i]: 209364525475383353885435186370676680282084727406086750235054469216932631113440149497314782176084161203325974117941428988908447850068013206201794964536655755620876334588525370905934888334849975310646049531267822034603424551295632519500267324121017618049079602415671709738904192853945592320974745763840.0000000
in[i]: 3.0000000. log(in[i]): 1.0986123. out[i]: 0.6931472
in[i]: 4.0000000. log(in[i]): 1.3862944. out[i]: 209364525475383353885435186370676680282084727406086750235054469216932631113440149497314782176084161203325974117941428988908447850068013206201794964536655755620876334588525370905934888334849975310646049531267822034603424551295632519500267324121017618049079602415671709738904192853945592320974745763840.0000000

A log-testcase.tar.gz which I modified from vector_copy_codeobject_rocm is attached.

I haven't tested the other builtin math functions, such as sqrt, that I use in my own kernels. I'll update the issue if I find bugs in other builtin math functions.
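
To extend this to other builtins, a host-side comparison like the one below could flag wrong results automatically instead of relying on reading the printed output. This is only a sketch: count_mismatches and its tolerance parameter are hypothetical helpers, not part of the attached test case, and the reference values would have to be computed on the host (for example with the C library's log).

```c
#include <assert.h>
#include <stddef.h>

/* Counts entries where |actual - expected| exceeds tol (NaN counts as a mismatch). */
static size_t count_mismatches(const double *expected, const double *actual,
                               size_t n, double tol)
{
    size_t i, bad = 0;
    assert(expected != NULL && actual != NULL && tol >= 0.0);
    for (i = 0; i < n; ++i) {
        double diff = actual[i] - expected[i];
        if (diff < 0.0)
            diff = -diff;
        /* Written as !(diff <= tol) so that a NaN difference is also counted. */
        if (!(diff <= tol))
            ++bad;
    }
    return bad;
}
```

Applied to the two runs above with a tolerance of 1e-6, this would report no mismatches for the 1.0.10 output and three for the 1.0.14 -noshared output (entries 2, 3 and 4).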

Thanks,
Li

Error when using "-g

Hi,
When I try to use cloc.sh to get HSAIL text from a kernel, for example from this OpenCL kernel:

__kernel void trap(__global atomic_int* pVal)
{
    pVal[0] = 42;
    pVal[1] = 23;
}

using command:

cloc.sh -hsail -g trap.cl

Then I get this message

Using libHSAIL options (append from env variable LIBHSAIL_OPTIONS_APPEND): '-o /home/user/workspace/CLWorkSpace/trap.brig /tmp/cloc5058/trap.hsail -g -include-source'
Using libHSAIL options (append from env variable LIBHSAIL_OPTIONS_APPEND): '-disassemble -o /home/user/workspace/CLWorkSpace/trap.hsail /home/user/workspace/CLWorkSpace/trap.brig -g -include-source'
Invalid section size: must be a multiple of 4
ERROR:  The following command failed with return code 1.
        HSAILasm -disassemble -o /home/user/workspace/CLWorkSpace/trap.hsail /home/user/workspace/CLWorkSpace/trap.brig

No hsail file is generated.

CLOC not working with master branch of Driver and HSA-RunTime

Hello,
I have used the master branch of HSA-Driver (commit ae607cf8ae2b94eab9f3b21b0e15a523c463fcb1), HSA-Runtime (commit 8c9999d39cac965eb4c55025abd607e7b497264a) and CLOC (commit cf77785). I am able to run the samples under HSA-Runtime-AMD/sample without any issues.
But the sample under CLOC/example/hsa/vector_copy/ causes a segmentation fault. I compared the .brig file generated by CLOC with the .brig file already present under "HSA-Runtime-AMD/sample". The two appear to be different:
cas@caspc:~/HSA-Drivers-Linux-AMD/HSA-Runtime-AMD/sample$ file vector_copy.brig
vector_copy.brig: data

cas@caspc:~/ravishWork/AcceleratedEmbeddedModules_20150625/samples/reSize/bin/x86_64/Release$ file vector_copy.brig
vector_copy.brig: ELF 32-bit LSB no file type, no machine, version 1 (SYSV)

How do I get CLOC to generate the same .brig file?
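
The difference that file reports comes down to whether the .brig file starts with the ELF magic bytes (0x7F 'E' 'L' 'F'). A minimal sketch of that check is shown below; has_elf_magic and file_is_elf are hypothetical helper names, and the sketch says nothing about which of the two container formats a given HSA runtime version expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the buffer starts with the ELF magic bytes, 0 otherwise. */
static int has_elf_magic(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7F, 'E', 'L', 'F' };
    assert(buf != NULL);
    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}

/* Returns 1 for an ELF container, 0 for anything else, -1 if the file cannot be opened. */
static int file_is_elf(const char *path)
{
    unsigned char head[4];
    size_t got;
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    got = fread(head, 1, sizeof head, f);
    fclose(f);
    return has_elf_magic(head, got);
}
```

Calling file_is_elf("vector_copy.brig") on each of the two files should reproduce the distinction seen above: 1 for the file reported as ELF, 0 for the file reported as plain data.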

Device-side-enqueue throws exception during runtime.

Hi,
I have trouble running kernels which are using the OpenCL 2.0 feature enqueue_kernel. The offline compilation runs without errors, but dispatching raises a runtime exception:
kfd kfd: Invalid PPR device 0:1.0 pasid 1 address 0x0 flags 0x124

Sourcecode:

__kernel void storeGlobalId(__global size_t* output, const size_t n) {
   const size_t gid=get_global_id(0);
   if (gid<n) output[gid]=gid;
}

__kernel void enqueueStoreGlobalId(__global size_t* output, const size_t n) {
   const size_t gid=get_global_id(0);
   if (gid == 0) {
      queue_t queue = get_default_queue();
      ndrange_t nd = ndrange_1D(n,128);

      void (^storeGlobalId_block)(void) = ^ {
         storeGlobalId(output,n);
      };

      enqueue_kernel(queue, CLK_ENQUEUE_FLAGS_NO_WAIT, nd, storeGlobalId_block);
   }
}

System setup:

  • Kaveri APU (no dGPU)
  • Kernel 4.0.0-100002-generic #201511031149 SMP
  • kfd-v1.6.1 (7fb04c4 from git repo HSA-Drivers-Linux-AMD)
  • HSA-Runtime 1.0.3 (fa0ef7e from git repo HSA-Runtime-AMD)
  • CL offline compiler (CLOC) v0.9.8

Cannot install together with ROCm

Today I tried to install the CLOC 1.0 branch. I have ROCm installed via apt-get install rocm, and all the runtimes and compilers are nicely placed in /opt/rocm. CLOC is also in one of the folders. However, the cloc from ROCm cannot find LLVM or the HSA runtime.

When I then tried to install CLOC from the .deb packages, I got an error saying that hsa-runtime-dev was not installed. The new ROCKRuntime package no longer provides an installable package.

Is there a way to let CLOC work with ROCm?

CLOC generates GCN instructions, HSAILasm cannot assemble them

When compiling a program with snackhsail.sh (I do not know when it changed to snackhsail; the old snack.sh does not work properly. Is AMD giving up on HSAIL?), I noticed that some generated instructions are not compatible with the HSA toolchain. I got the following errors:

>       gcn_min_f32     $s5, $s9, $s5;
>              ^
input(124,9): Undefined instruction

and

>       gcn_divrelaxed_ftz_f32  $s4, $s5, $s4;
>                         ^
input(353,20): Undefined instruction

How does gcn_min differ from the min instruction in HSAIL, and how does gcn_divrelaxed differ from the div instruction in HSAIL? Should CLOC compile portable HSAIL programs, or does it only compile for AMD GCN devices? How can another vendor use CLOC, or even HSAIL, if it generates GCN-specific instructions?

CLOC 1.0.x addrspacecast not implemented

Hi developers,

With cloc 0.9 (which is based on LLVM 3.2), one of the examples in the AMD APP SDK, MonteCarloAsianDP_Kernels.cl, works.

However, when the new LLVM toolchain uses addrspacecast while generating the kernel code, CLOC emits an error message like error: addrspacecast not implemented.

For now, I can still use cloc -hsail MonteCarloAsianDP_Kernels.cl to fall back to the old version (hlc3.2) of the LLVM toolchain and run my application. This issue can also be found in the current LLVM trunk.

So, how can I work around this kind of problem with the new CLOC toolchain? Some benchmarks fail to compile. [EDIT: I think a viable solution may be to avoid the generic address space in this case.]

Thanks.

The aforementioned example is in the attachment.
Kernel code
LLVM Assembly code
