Comments (41)

OursDesCavernes commented on June 30, 2024

Nice, it helped a lot:

93% tests passed, 3 tests failed out of 41

Total Test time (real) =  38.24 sec

The following tests FAILED:
         36 - clblast_test_xsyrk (Failed)
         38 - clblast_test_xsyr2k (OTHER_FAULT)
         39 - clblast_test_xher2k (SEGFAULT)

I'll post the remaining errors details later.

from clblast.

CNugteren commented on June 30, 2024

Hmm, not good. So there are two types of tests: 'regular behaviour' with proper input arguments, and 'invalid buffer sizes' with unusual input arguments, such as zero-sized or too-small buffers. Apparently only the latter type fails.

I assume this is on the development branch, given that the verbose output is much more verbose than in the latest version. I re-ran the same command just now on my machine with Beignet on Linux and Intel(R) HD Graphics Skylake ULT GT2 (almost the same), and everything is fine. However, I noticed that verbose mode doesn't output extra information for the invalid-buffer-size cases. Perhaps I should add that, to get a bit more information about why things go wrong.

The first thing we should try is to find out whether clBLAS (the reference) or CLBlast crashes. Perhaps you can go to line 218 of correctness/testblas.cc and change:

auto status1 = run_reference_(args, buffers1, queue_);
into:
auto status1 = StatusCode::kSuccess;

If it still crashes then the bug is in CLBlast, otherwise it is in clBLAS (not unreasonable to think, since that library hasn't been tested on Intel/Beignet).
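
The bisection idea above can be sketched roughly as follows. This is a hypothetical harness, not the actual CLBlast test code: only `StatusCode::kSuccess` comes from the snippet above, while `Skip`, `PairStatus`, `RunBoth` and the two callables are names assumed for illustration.

```cpp
#include <functional>

// Stand-in for the library's status type; only kSuccess appears in the thread.
enum class StatusCode { kSuccess, kFailure };

// Which of the two libraries to leave out of a run (hypothetical helper type).
enum class Skip { kNone, kReference, kLibrary };

struct PairStatus {
  StatusCode reference;
  StatusCode library;
};

// Runs the reference and the library under test, optionally replacing one call
// with a constant kSuccess. If the crash disappears when one side is skipped,
// that side is the likely culprit.
PairStatus RunBoth(const std::function<StatusCode()>& run_reference,
                   const std::function<StatusCode()>& run_library,
                   const Skip skip) {
  const auto status1 =
      (skip == Skip::kReference) ? StatusCode::kSuccess : run_reference();
  const auto status2 =
      (skip == Skip::kLibrary) ? StatusCode::kSuccess : run_library();
  return PairStatus{status1, status2};
}
```

Calling `RunBoth` once with `Skip::kReference` and once with `Skip::kLibrary` gives the two experiments: whichever run no longer crashes points at the side that was skipped.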

OursDesCavernes commented on June 30, 2024

I did your test and tried both clBLAS and CLBlast alone, and they both segfault ...
I tried to skip both at the same time and I get 23 out of 40 tests failing.

43% tests passed, 23 tests failed out of 40

Total Test time (real) = 134.66 sec

The following tests FAILED:
         11 - clblast_test_xgemv (SEGFAULT)
         13 - clblast_test_xhemv (SEGFAULT)
         16 - clblast_test_xsymv (SEGFAULT)
         19 - clblast_test_xtrmv (SEGFAULT)
         20 - clblast_test_xtbmv (SEGFAULT)
         21 - clblast_test_xtpmv (SEGFAULT)
         22 - clblast_test_xger (SEGFAULT)
         23 - clblast_test_xgeru (SEGFAULT)
         24 - clblast_test_xgerc (SEGFAULT)
         25 - clblast_test_xher (SEGFAULT)
         27 - clblast_test_xher2 (SEGFAULT)
         28 - clblast_test_xhpr2 (Failed)
         29 - clblast_test_xsyr (SEGFAULT)
         31 - clblast_test_xsyr2 (SEGFAULT)
         32 - clblast_test_xspr2 (Failed)
         33 - clblast_test_xgemm (SEGFAULT)
         34 - clblast_test_xsymm (SEGFAULT)
         35 - clblast_test_xhemm (SEGFAULT)
         36 - clblast_test_xsyrk (SEGFAULT)
         37 - clblast_test_xherk (SEGFAULT)
         38 - clblast_test_xsyr2k (SEGFAULT)
         39 - clblast_test_xher2k (SEGFAULT)
         40 - clblast_test_xtrmm (SEGFAULT)

It looks like I was a little too quick to assume they all failed the same way ...
I think I'll open a new issue when this one is solved. One issue at a time!

CNugteren commented on June 30, 2024

OK, it seems there is something else causing issues. First of all I've made the invalid-buffer sizes more verbose in verbose mode. For example, the ./clblast_test_xswap -verbose command would now output:

* Running on OpenCL device 'Iris Pro'.
* Starting tests for the 'SSWAP' routine. Legend:
   : -> Test produced correct results
   . -> Test returned the correct error code
   X -> Test produced incorrect results
   / -> Test returned an incorrect error code
   \ -> Test not executed: OpenCL-kernel compilation error
   o -> Test not executed: Unsupported precision
   . -> Test not completed: Reference CBLAS doesn't output error codes
* Testing 'regular behaviour' for 'default':
   Config: n=7 incx=1 incy=1 offx=0 offy=0 -> :
   Config: n=7 incx=1 incy=2 offx=0 offy=0 -> :
   Config: n=7 incx=1 incy=7 offx=0 offy=0 -> :
   Config: n=7 incx=2 incy=1 offx=0 offy=0 -> :
   Config: n=7 incx=2 incy=2 offx=0 offy=0 -> :
   Config: n=7 incx=2 incy=7 offx=0 offy=0 -> :
   Config: n=7 incx=7 incy=1 offx=0 offy=0 -> :
   Config: n=7 incx=7 incy=2 offx=0 offy=0 -> :
   Config: n=7 incx=7 incy=7 offx=0 offy=0 -> :
   Config: n=93 incx=1 incy=1 offx=0 offy=0 -> :
   Config: n=93 incx=1 incy=2 offx=0 offy=0 -> :
   Config: n=93 incx=1 incy=7 offx=0 offy=0 -> :
   Config: n=93 incx=2 incy=1 offx=0 offy=0 -> :
   Config: n=93 incx=2 incy=2 offx=0 offy=0 -> :
   Config: n=93 incx=2 incy=7 offx=0 offy=0 -> :
   Config: n=93 incx=7 incy=1 offx=0 offy=0 -> :
   Config: n=93 incx=7 incy=2 offx=0 offy=0 -> :
   Config: n=93 incx=7 incy=7 offx=0 offy=0 -> :
   Config: n=4096 incx=1 incy=1 offx=0 offy=0 -> :
   Config: n=4096 incx=1 incy=2 offx=0 offy=0 -> :
   Config: n=4096 incx=1 incy=7 offx=0 offy=0 -> :
   Config: n=4096 incx=2 incy=1 offx=0 offy=0 -> :
   Config: n=4096 incx=2 incy=2 offx=0 offy=0 -> :
   Config: n=4096 incx=2 incy=7 offx=0 offy=0 -> :
   Config: n=4096 incx=7 incy=1 offx=0 offy=0 -> :
   Config: n=4096 incx=7 incy=2 offx=0 offy=0 -> :
   Config: n=4096 incx=7 incy=7 offx=0 offy=0 -> :
   Pass rate 100.0%: 27 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for 'default':
   Config: n=64 xsize=0 ysize=0 -> .
   Config: n=64 xsize=0 ysize=63 -> .
   Config: n=64 xsize=0 ysize=64 -> .
   Config: n=64 xsize=63 ysize=0 -> .
   Config: n=64 xsize=63 ysize=63 -> .
   Config: n=64 xsize=63 ysize=64 -> .
   Config: n=64 xsize=64 ysize=0 -> .
   Config: n=64 xsize=64 ysize=63 -> .
   Config: n=64 xsize=64 ysize=64 -> .
   Pass rate 100.0%: 9 passed / 0 skipped / 0 failed
* Completed all test-cases for this routine. Results:
   36 test(s) passed
   0 test(s) skipped
   0 test(s) failed

In the last bit, it shows that it is testing swapping of two buffers with 64 elements using smaller sized buffers. Both clBLAS and CLBlast are protected against this behaviour and return appropriate error codes.
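
As a rough illustration of that kind of protection, a size check could look like the sketch below. The status names and the function `CheckVectorBuffer` are hypothetical and are not taken from clBLAS or CLBlast; the sketch only assumes that a vector of n elements with stride inc starting at offset needs offset + (n - 1) * inc + 1 elements.

```cpp
#include <cstddef>

// Hypothetical status codes for this sketch; the real libraries define their own.
enum class BufferStatus { kSuccess, kInvalidIncrement, kInsufficientMemory };

// Checks that a buffer holding `buffer_elements` elements is large enough for
// n elements accessed with stride `inc`, starting at element `offset`.
BufferStatus CheckVectorBuffer(const std::size_t buffer_elements,
                               const std::size_t n, const std::size_t inc,
                               const std::size_t offset) {
  if (inc == 0) { return BufferStatus::kInvalidIncrement; }
  if (n == 0) { return BufferStatus::kSuccess; }  // nothing is accessed in this sketch
  // The last element touched has index offset + (n - 1) * inc.
  const std::size_t required = offset + (n - 1) * inc + 1;
  return (buffer_elements < required) ? BufferStatus::kInsufficientMemory
                                      : BufferStatus::kSuccess;
}
```

With n=64 and a stride of 1, a buffer of 0 or 63 elements is rejected before any kernel runs, while a buffer of 64 elements is accepted.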

One more thing I could think of now is that Beignet isn't happy with zero-sized buffers. Perhaps you can change line 66 of test/correctness/testblas.h from:

const std::vector<size_t> kVecSizes = {0, kBufferSize - 1, kBufferSize};

into:

const std::vector<size_t> kVecSizes = {kBufferSize - 1, kBufferSize};

Let's see if that helps for the xswap test.
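
To see what this change does to the test matrix, here is a small sketch of how the size list expands into (xsize, ysize) combinations. The value 64 for `kBufferSize` is an assumption inferred from the sizes 0, 63 and 64 in the log above, and `BufferSizeCombinations` is a hypothetical helper, not a function from the test suite.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Assumed value: the log above uses sizes 0, 63 and 64, which suggests 64.
constexpr std::size_t kBufferSize = 64;

// Builds every (xsize, ysize) combination from one list of candidate sizes.
std::vector<std::pair<std::size_t, std::size_t>> BufferSizeCombinations(
    const std::vector<std::size_t>& sizes) {
  std::vector<std::pair<std::size_t, std::size_t>> combinations;
  for (const auto x_size : sizes) {
    for (const auto y_size : sizes) {
      combinations.emplace_back(x_size, y_size);
    }
  }
  return combinations;
}
```

With three candidate sizes this gives the 9 configurations listed in the log; dropping the zero leaves 4, none of which involves a zero-sized buffer.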

For the other errors, I would first suggest testing against a CPU BLAS library, since the reference clBLAS might crash or give incorrect results in some cases on Intel GPUs. You can do this by passing -clblas 0 -cblas 1 on the command line. Perhaps I should make this the default behaviour?

OursDesCavernes commented on June 30, 2024

I just tested the dev branch and the issue looks gone:

53% tests passed, 19 tests failed out of 40

Total Test time (real) =  70.75 sec

The following tests FAILED:
         11 - clblast_test_xgemv (SEGFAULT)
         13 - clblast_test_xhemv (SEGFAULT)
         16 - clblast_test_xsymv (SEGFAULT)
         19 - clblast_test_xtrmv (SEGFAULT)
         22 - clblast_test_xger (SEGFAULT)
         23 - clblast_test_xgeru (SEGFAULT)
         24 - clblast_test_xgerc (SEGFAULT)
         25 - clblast_test_xher (SEGFAULT)
         27 - clblast_test_xher2 (SEGFAULT)
         29 - clblast_test_xsyr (SEGFAULT)
         31 - clblast_test_xsyr2 (SEGFAULT)
         33 - clblast_test_xgemm (SEGFAULT)
         34 - clblast_test_xsymm (SEGFAULT)
         35 - clblast_test_xhemm (SEGFAULT)
         36 - clblast_test_xsyrk (SEGFAULT)
         37 - clblast_test_xherk (SEGFAULT)
         38 - clblast_test_xsyr2k (SEGFAULT)
         39 - clblast_test_xher2k (SEGFAULT)
         40 - clblast_test_xtrmm (SEGFAULT)

There are 4 more tests that pass compared to the status1/status2 trick 3 days ago.

CBLAS is already the default:

$ ./clblast_test_xgemv -verbose 1 -clblas 0 -cblas 1

* Options given/available:
    -platform 0 [=default]
    -device 0 [=default]
    -full_test [false]
    -verbose [true]
    -clblas 0 [=default]
    -cblas 1 [=default]

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'SGEMV' routine. Legend:
   : -> Test produced correct results
   . -> Test returned the correct error code
   X -> Test produced incorrect results
   / -> Test returned an incorrect error code
   \ -> Test not executed: OpenCL-kernel compilation error
   o -> Test not executed: Unsupported precision
   . -> Test not completed: Reference CBLAS doesn't output error codes
* Testing 'regular behaviour' for '101 (row-major) 111 (regular)':
   Config: m=61 n=61 lda=61 incx=1 incy=1 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=1 incy=1 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=61 incx=1 incy=2 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=1 incy=2 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=61 incx=1 incy=7 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=1 incy=7 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=61 incx=2 incy=1 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=2 incy=1 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=61 incx=2 incy=2 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=2 incy=2 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=61 incx=2 incy=7 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=2 incy=7 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=61 incx=7 incy=1 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=7 incy=1 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=61 incx=7 incy=2 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=7 incy=2 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=61 incx=7 incy=7 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=7 incy=7 offa=0 offx=0 offy=0 -> :
Segmentation fault

I tried with clblas as reference:

$ ./clblast_test_xgemv -verbose 1 -clblas 1 -cblas 0

* Options given/available:
    -platform 0 [=default]
    -device 0 [=default]
    -full_test [false]
    -verbose [true]
    -clblas 1 
    -cblas 0 

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'SGEMV' routine. Legend:
   : -> Test produced correct results
   . -> Test returned the correct error code
   X -> Test produced incorrect results
   / -> Test returned an incorrect error code
   \ -> Test not executed: OpenCL-kernel compilation error
   o -> Test not executed: Unsupported precision
   . -> Test not completed: Reference CBLAS doesn't output error codes
* Testing 'regular behaviour' for '101 (row-major) 111 (regular)':
   Config: m=61 n=61 lda=61 incx=1 incy=1 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=1 incy=1 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=61 incx=1 incy=2 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=1 incy=2 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=61 incx=1 incy=7 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=1 incy=7 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=61 incx=2 incy=1 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=2 incy=1 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=61 incx=2 incy=2 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=2 incy=2 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=61 incx=2 incy=7 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=2 incy=7 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=61 incx=7 incy=1 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=7 incy=1 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=61 incx=7 incy=2 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=7 incy=2 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=61 incx=7 incy=7 offa=0 offx=0 offy=0 -> :
   Config: m=61 n=61 lda=512 incx=7 incy=7 offa=0 offx=0 offy=0 -> :
Segmentation fault

CNugteren commented on June 30, 2024

OK, I should have said this: the development branch now has the CPU BLAS as a default, so it is not testing against clBLAS anymore (unless you specify it). I don't understand why the original issue doesn't show up anymore though...

From your other tests with GEMV we can conclude that the issue is indeed in CLBlast and not in one of the reference libraries. The configuration that comes next, right after the last one in your output, is:
Config: m=61 n=512 lda=61 incx=1 incy=1 offa=0 offx=0 offy=0 -> .
The small dot at the end denotes that it is an invalid configuration, i.e. the library should return a status code instead of actually trying to run it. I think that is the common thing across all your tests: it only fails for 'invalid' configurations.
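
Why that configuration counts as invalid can be sketched with a leading-dimension check. This is an illustration under the usual BLAS convention (in row-major storage each row of an m-by-n matrix holds n elements, so lda must be at least n), not CLBlast's actual validation code; `Layout` and `IsValidLeadingDimension` are hypothetical names.

```cpp
#include <cstddef>

// Storage order of the matrix; 101 mirrors the "101 (row-major)" label in the
// test output, and 102 is the usual CBLAS value for column-major.
enum class Layout { kRowMajor = 101, kColMajor = 102 };

// Returns true if `lda` can hold an m-by-n matrix in the given layout.
bool IsValidLeadingDimension(const Layout layout, const std::size_t m,
                             const std::size_t n, const std::size_t lda) {
  // Number of elements stored contiguously per row (row-major) or per column.
  const std::size_t contiguous = (layout == Layout::kRowMajor) ? n : m;
  return lda >= contiguous;
}
```

For the row-major case m=61 n=512, lda=61 is too small while lda=512 is fine, so the first should produce an error code and the second a regular result.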

Are you on the latest Beignet by the way? Perhaps that is influencing the results as well somehow?

OursDesCavernes commented on June 30, 2024

I use Beignet 1.1.2.
What's your version?

CNugteren commented on June 30, 2024

I'm on a git version from 2 weeks back. I had to do that because my Skylake GPU is quite new. But 1.1.2 seems to be from April this year, so that's quite recent.

I'll try to think of ways to debug this properly. But for now I think you can actually use the library: it seems to crash only for invalid configurations.

OursDesCavernes commented on June 30, 2024

As some tuners fail too, maybe we can focus on that.
I'll post all failing unit tests too in case you see something obvious.

CNugteren commented on June 30, 2024

The tuners crash as well? I'll also investigate the current issue further, but I don't have time until Monday.

OursDesCavernes commented on June 30, 2024

That is weird: make alltuners fails during Xgemm and reports a segfault:

[ RUN      ] Running Xgemm
[       OK ] Completed Xgemm (2607 ms) - 10 out of 117
[ RUN      ] Running Xgemm
[       OK ] Completed Xgemm (170 ms) - 11 out of 117
[ RUN      ] Running Xgemm
[       OK ] Completed Xgemm (545 ms) - 12 out of 117
[ RUN      ] Running Xgemm
[       OK ] Completed Xgemm (2408 ms) - 13 out of 117
[ RUN      ] Running Xgemm
[       OK ] Completed Xgemm (1366 ms) - 14 out of 117
CMakeFiles/alltuners.dir/build.make:57: recipe for target 'CMakeFiles/alltuners' failed
make[3]: *** [CMakeFiles/alltuners] Segmentation fault
CMakeFiles/Makefile2:146: recipe for target 'CMakeFiles/alltuners.dir/all' failed
make[2]: *** [CMakeFiles/alltuners.dir/all] Error 2
CMakeFiles/Makefile2:153: recipe for target 'CMakeFiles/alltuners.dir/rule' failed
make[1]: *** [CMakeFiles/alltuners.dir/rule] Error 2
Makefile:186: recipe for target 'alltuners' failed
make: *** [alltuners] Error 2

but running clblast_tuner_xgemm directly works fine ...

OursDesCavernes commented on June 30, 2024

Here is the issue (with complex numbers):

$ ./clblast_tuner_xgemm -precision 3232
* Options given/available:
    -platform 0 [=default]
    -device 0 [=default]
    -precision 3232 (complex-single) 
    -m 1024 [=default]
    -n 1024 [=default]
    -k 1024 [=default]
    -alpha 2+0.5i [=default]
    -beta 2+0.5i [=default]
    -fraction 2048.000000 [=default]


[==========] Initializing on platform 0 device 0
[==========] Device name: 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile' (OpenCL 1.2 beignet 1.1.2)

[----------] Testing reference Xgemm
[ RUN      ] Running Xgemm
[       OK ] Completed Xgemm (460 ms) - 1 out of 1

[----------] Testing kernel Xgemm
[ RUN      ] Running Xgemm
[       OK ] Completed Xgemm (189 ms) - 1 out of 117
[ RUN      ] Running Xgemm
[       OK ] Completed Xgemm (1471 ms) - 2 out of 117
Segmentation fault

CNugteren commented on June 30, 2024

OK, thanks for running the tuner and showing the output. The first thing to check is whether it is a bug in the compiler, in CLTune, or in the CLBlast kernels. Because I don't know how to do that properly without being able to reproduce the errors myself, I've added a 'VERBOSE' setting to CLTune. So, could you do the following for me:

  1. Pull the latest version of the development branch of CLTune
  2. Run cmake -DVERBOSE=ON .. to enable 'VERBOSE' mode
  3. Compile and install CLTune
  4. Re-build the CLBlast tuners
  5. Re-run the tuner and post the output here

Perhaps it is not verbose enough yet, but this would be the first step I guess.

Thanks!

OursDesCavernes commented on June 30, 2024

The answer is compilation!

$ ./clblast_tuner_xgemm -precision 3232
* Options given/available:
    -platform 0 [=default]
    -device 0 [=default]
    -precision 3232 (complex-single) 
    -m 1024 [=default]
    -n 1024 [=default]
    -k 1024 [=default]
    -alpha 2+0.5i [=default]
    -beta 2+0.5i [=default]
    -fraction 2048.000000 [=default]


[==========] Initializing on platform 0 device 0
[==========] Device name: 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile' (OpenCL 1.2 beignet 1.1.2)

[----------] Testing reference Xgemm
[ VERBOSE  ] Starting compilation
[ VERBOSE  ] Finished compilation
[ VERBOSE  ] Creating a copy of the output buffer
[ VERBOSE  ] Setting kernel arguments
[ RUN      ] Running Xgemm
[ VERBOSE  ] Launching kernel (1 out of 1 for averaging)
[       OK ] Completed Xgemm (461 ms) - 1 out of 1

[----------] Testing kernel Xgemm
[ VERBOSE  ] Computing the permutations of all parameters
[ VERBOSE  ] Exploring configuration (1 out of 117)
[ VERBOSE  ] Starting compilation
[ VERBOSE  ] Finished compilation
[ VERBOSE  ] Creating a copy of the output buffer
[ VERBOSE  ] Setting kernel arguments
[ RUN      ] Running Xgemm
[ VERBOSE  ] Launching kernel (1 out of 1 for averaging)
[       OK ] Completed Xgemm (2495 ms) - 1 out of 117
[ VERBOSE  ] Exploring configuration (2 out of 117)
[ VERBOSE  ] Starting compilation
[ VERBOSE  ] Finished compilation
[ VERBOSE  ] Creating a copy of the output buffer
[ VERBOSE  ] Setting kernel arguments
[ RUN      ] Running Xgemm
[ VERBOSE  ] Launching kernel (1 out of 1 for averaging)
[       OK ] Completed Xgemm (1923 ms) - 2 out of 117
[ VERBOSE  ] Exploring configuration (3 out of 117)
[ VERBOSE  ] Starting compilation
[ VERBOSE  ] Finished compilation
[ VERBOSE  ] Creating a copy of the output buffer
[ VERBOSE  ] Setting kernel arguments
[ RUN      ] Running Xgemm
[ VERBOSE  ] Launching kernel (1 out of 1 for averaging)
[       OK ] Completed Xgemm (1126 ms) - 3 out of 117
[ VERBOSE  ] Exploring configuration (4 out of 117)
[ VERBOSE  ] Starting compilation
[ VERBOSE  ] Finished compilation
[ VERBOSE  ] Creating a copy of the output buffer
[ VERBOSE  ] Setting kernel arguments
[ RUN      ] Running Xgemm
[ VERBOSE  ] Launching kernel (1 out of 1 for averaging)
[       OK ] Completed Xgemm (3122 ms) - 4 out of 117
[ VERBOSE  ] Exploring configuration (5 out of 117)
[ VERBOSE  ] Starting compilation
[ VERBOSE  ] Finished compilation
[ VERBOSE  ] Creating a copy of the output buffer
[ VERBOSE  ] Setting kernel arguments
[ RUN      ] Running Xgemm
[ VERBOSE  ] Launching kernel (1 out of 1 for averaging)
[       OK ] Completed Xgemm (178 ms) - 5 out of 117
[ VERBOSE  ] Exploring configuration (6 out of 117)
[ VERBOSE  ] Starting compilation
Segmentation fault

OursDesCavernes commented on June 30, 2024

If it helps, here is the output of valgrind (with many memory errors omitted):

$ valgrind --tool=memcheck --show-leak-kinds=definite --error-limit=no ./clblast_tuner_xgemm -precision 3232
...
[ VERBOSE  ] Exploring configuration (4 out of 117)
[ VERBOSE  ] Starting compilation
==13154== Invalid write of size 8
==13154==    at 0x858A62E: gbe::Kernel::setSamplerSet(gbe::ir::SamplerSet*) (in /usr/lib64/beignet/libgbe.so)
==13154==    by 0x85841B1: gbe::Program::buildFromUnit(gbe::ir::Unit const&, std::string&) (in /usr/lib64/beignet/libgbe.so)
==13154==    by 0x8583F91: gbe::Program::buildFromLLVMFile(char const*, void const*, std::string&, int) (in /usr/lib64/beignet/libgbe.so)
==13154==    by 0x86FF22C: gbe::genProgramNewFromLLVM(unsigned int, char const*, void const*, void const*, char const*, unsigned long, char*, unsigned long*, int) (in /usr/lib64/beignet/libgbe.so)
==13154==    by 0x858890F: gbe::programNewFromSource(unsigned int, char const*, unsigned long, char const*, char*, unsigned long*) (in /usr/lib64/beignet/libgbe.so)
==13154==    by 0x54D27F1: cl_program_build (in /usr/lib64/beignet/libcl.so)
==13154==    by 0x54C6C0B: clBuildProgram (in /usr/lib64/beignet/libcl.so)
==13154==    by 0x529789C: cltune::TunerImpl::RunKernel(std::string const&, cltune::KernelInfo const&, unsigned long, unsigned long) (in /usr/lib64/libcltune.so)
==13154==    by 0x529989D: cltune::TunerImpl::Tune() (in /usr/lib64/libcltune.so)
==13154==    by 0x4177A2: void clblast::Tuner<clblast::TuneXgemm<std::complex<float> >, std::complex<float> >(int, char**) (in /home/thomas/src/CLBlast/build/clblast_tuner_xgemm)
==13154==    by 0x4081DC: main (in /home/thomas/src/CLBlast/build/clblast_tuner_xgemm)
==13154==  Address 0x58 is not stack'd, malloc'd or (recently) free'd
==13154== 
==13154== 
==13154== Process terminating with default action of signal 11 (SIGSEGV)
==13154==  Access not within mapped region at address 0x58
==13154==    at 0x858A62E: gbe::Kernel::setSamplerSet(gbe::ir::SamplerSet*) (in /usr/lib64/beignet/libgbe.so)
==13154==    by 0x85841B1: gbe::Program::buildFromUnit(gbe::ir::Unit const&, std::string&) (in /usr/lib64/beignet/libgbe.so)
==13154==    by 0x8583F91: gbe::Program::buildFromLLVMFile(char const*, void const*, std::string&, int) (in /usr/lib64/beignet/libgbe.so)
==13154==    by 0x86FF22C: gbe::genProgramNewFromLLVM(unsigned int, char const*, void const*, void const*, char const*, unsigned long, char*, unsigned long*, int) (in /usr/lib64/beignet/libgbe.so)
==13154==    by 0x858890F: gbe::programNewFromSource(unsigned int, char const*, unsigned long, char const*, char*, unsigned long*) (in /usr/lib64/beignet/libgbe.so)
==13154==    by 0x54D27F1: cl_program_build (in /usr/lib64/beignet/libcl.so)
==13154==    by 0x54C6C0B: clBuildProgram (in /usr/lib64/beignet/libcl.so)
==13154==    by 0x529789C: cltune::TunerImpl::RunKernel(std::string const&, cltune::KernelInfo const&, unsigned long, unsigned long) (in /usr/lib64/libcltune.so)
==13154==    by 0x529989D: cltune::TunerImpl::Tune() (in /usr/lib64/libcltune.so)
==13154==    by 0x4177A2: void clblast::Tuner<clblast::TuneXgemm<std::complex<float> >, std::complex<float> >(int, char**) (in /home/thomas/src/CLBlast/build/clblast_tuner_xgemm)
==13154==    by 0x4081DC: main (in /home/thomas/src/CLBlast/build/clblast_tuner_xgemm)
==13154==  If you believe this happened as a result of a stack
==13154==  overflow in your program's main thread (unlikely but
==13154==  possible), you can try to increase the size of the
==13154==  main thread stack using the --main-stacksize= flag.
==13154==  The main thread stack size used in this run was 8388608.
==13154== 
==13154== HEAP SUMMARY:
==13154==     in use at exit: 191,945,013 bytes in 802,573 blocks
==13154==   total heap usage: 23,654,057 allocs, 22,851,484 frees, 2,182,157,488 bytes allocated
==13154== 
==13154== LEAK SUMMARY:
==13154==    definitely lost: 10,988 bytes in 182 blocks
==13154==    indirectly lost: 6,074,124 bytes in 31,357 blocks
==13154==      possibly lost: 15,540,120 bytes in 266,188 blocks
==13154==    still reachable: 170,319,781 bytes in 504,846 blocks
==13154==         suppressed: 0 bytes in 0 blocks
==13154== Rerun with --leak-check=full to see details of leaked memory
==13154== 
==13154== For counts of detected and suppressed errors, rerun with: -v
==13154== Use --track-origins=yes to see where uninitialised values come from
==13154== ERROR SUMMARY: 4243812 errors from 245 contexts (suppressed: 0 from 0)
Segmentation fault
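
One plausible reading of the faulting address 0x58 (an assumption, not something confirmed in this thread) is a write to a member through a null object pointer: the address is then simply the member's offset inside the object. The struct below is a made-up layout used only to show the arithmetic; it is not Beignet's real `gbe::Kernel`.

```cpp
#include <cstddef>

// Hypothetical layout, NOT Beignet's real gbe::Kernel: the padding is chosen
// only so that the pointer member lands at offset 0x58, the address valgrind
// reports.
struct HypotheticalKernel {
  unsigned char earlier_members[0x58];
  void* sampler_set;  // an 8-byte pointer on a 64-bit build
};

// Address that a write to `sampler_set` would target if the object pointer
// were null: base address 0 plus the member's offset within the object.
std::size_t FaultAddressForNullObject() {
  return 0 + offsetof(HypotheticalKernel, sampler_set);
}
```

Under that assumption, an 8-byte write at such a low address would match both "Invalid write of size 8" and "Access not within mapped region".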

CNugteren commented on June 30, 2024

Yes, so we can indeed conclude this is an issue with Beignet or the Intel drivers.

What you can do is modify src/tuner_impl.cc line 254 and replace

fprintf(stdout, "%s Starting compilation\n", kMessageVerbose.c_str());

with

fprintf(stdout, "%s Starting compilation\n%s\n", kMessageVerbose.c_str(), source.c_str());

Then, copy-paste the faulty kernel and report it to the developers of Beignet, possibly with a small test program that does nothing other than compilation. Note that this kernel can be quite long for GEMM. Even in the worst case, where the kernel is not valid OpenCL, the Beignet compiler should still report an error instead of crashing with a segfault.

Before you do this, I recommend building the latest version of Beignet from the git source repository and running the included unit tests first to see whether they pass. I guess that's what the developers of Beignet will ask you to do.

Unfortunately Beignet doesn't seem mature enough yet. I've seen some issues myself on Skylake GPUs, mostly with FP16 though.

OursDesCavernes commented on June 30, 2024

I switched to beignet git HEAD and it works:

[----------] Printing best result in database format to stdout
{ "Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile", { {"MWG",64}, {"NWG",64}, {"KWG",32}, {"MDIMC",16}, {"NDIMC",16}, {"MDIMA",16}, {"NDIMB",8}, {"KWI",2}, {"VWM",4}, {"VWN",2}, {"STRM",1}, {"STRN",0}, {"SA",0}, {"SB",1}, {"PRECISION",3232} } }
[ -------> ] 121.6 ms or 17.7 GFLOPS
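
The reported throughput can be sanity-checked with a short calculation. The sketch assumes the tuner's default sizes m = n = k = 1024 shown in the earlier runs and an operation count of 2 * m * n * k, which happens to reproduce the reported figure; `GemmGflops` is a hypothetical helper, not part of the tuner.

```cpp
// Converts a GEMM run time into GFLOPS, assuming 2 * m * n * k operations.
double GemmGflops(const double m, const double n, const double k,
                  const double time_ms) {
  const double operations = 2.0 * m * n * k;
  // 1 ms = 1e-3 s and 1 GFLOP = 1e9 FLOP, hence the factor of 1e6.
  return operations / (time_ms * 1.0e6);
}
```

For 121.6 ms this gives roughly 17.66, in line with the 17.7 GFLOPS printed above.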

I think we are done with this issue, thanks a lot for your help!

CNugteren commented on June 30, 2024

OK, good to hear that a new version of Beignet helped with the tuner issues. But the original issue was with the tests, right? So I suggest that you pull the latest version of the CLBlast development branch (which includes the tuner results for your device) and re-run the tests. If that goes fine, we can close this issue.

OursDesCavernes commented on June 30, 2024

Before pulling new dev HEAD:

51% tests passed, 20 tests failed out of 41

Total Test time (real) =  25.75 sec

The following tests FAILED:
         11 - clblast_test_xgemv (SEGFAULT)
         13 - clblast_test_xhemv (SEGFAULT)
         16 - clblast_test_xsymv (SEGFAULT)
         19 - clblast_test_xtrmv (SEGFAULT)
         22 - clblast_test_xger (SEGFAULT)
         23 - clblast_test_xgeru (SEGFAULT)
         24 - clblast_test_xgerc (SEGFAULT)
         25 - clblast_test_xher (SEGFAULT)
         27 - clblast_test_xher2 (SEGFAULT)
         29 - clblast_test_xsyr (SEGFAULT)
         31 - clblast_test_xsyr2 (SEGFAULT)
         33 - clblast_test_xgemm (SEGFAULT)
         34 - clblast_test_xsymm (SEGFAULT)
         35 - clblast_test_xhemm (SEGFAULT)
         36 - clblast_test_xsyrk (SEGFAULT)
         37 - clblast_test_xherk (SEGFAULT)
         38 - clblast_test_xsyr2k (SEGFAULT)
         39 - clblast_test_xher2k (SEGFAULT)
         40 - clblast_test_xtrmm (SEGFAULT)
         41 - clblast_test_xomatcopy (SEGFAULT)

After pulling new dev HEAD (Updating 61105e3..66908ef):

51% tests passed, 20 tests failed out of 41

Total Test time (real) =  53.28 sec

The following tests FAILED:
         11 - clblast_test_xgemv (SEGFAULT)
         13 - clblast_test_xhemv (SEGFAULT)
         16 - clblast_test_xsymv (SEGFAULT)
         19 - clblast_test_xtrmv (SEGFAULT)
         22 - clblast_test_xger (SEGFAULT)
         23 - clblast_test_xgeru (SEGFAULT)
         24 - clblast_test_xgerc (SEGFAULT)
         25 - clblast_test_xher (SEGFAULT)
         27 - clblast_test_xher2 (SEGFAULT)
         29 - clblast_test_xsyr (SEGFAULT)
         31 - clblast_test_xsyr2 (SEGFAULT)
         33 - clblast_test_xgemm (SEGFAULT)
         34 - clblast_test_xsymm (SEGFAULT)
         35 - clblast_test_xhemm (SEGFAULT)
         36 - clblast_test_xsyrk (SEGFAULT)
         37 - clblast_test_xherk (SEGFAULT)
         38 - clblast_test_xsyr2k (SEGFAULT)
         39 - clblast_test_xher2k (SEGFAULT)
         40 - clblast_test_xtrmm (SEGFAULT)
         41 - clblast_test_xomatcopy (SEGFAULT)

It doesn't seem to help with the unit tests. If those tests aren't that important, maybe we could use signal handling or a child-process strategy so that a segfault in Beignet doesn't crash the whole unit test ...
What do you think?
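
A child-process strategy of that kind could be sketched as below, assuming a POSIX system. `IsolatedResult` and `RunIsolated` are hypothetical names; nothing like this is claimed to exist in the CLBlast test suite.

```cpp
#include <csignal>
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Outcome of running one test case in a child process.
struct IsolatedResult {
  bool crashed;   // true if the child was terminated by a signal
  int signal;     // signal number if crashed, otherwise 0
  int exit_code;  // exit code if the child exited normally, otherwise -1
};

// Runs `test_case` in a forked child so that a crash inside it (for example a
// segfault in the OpenCL driver) cannot take down the parent test runner.
IsolatedResult RunIsolated(const std::function<int()>& test_case) {
  const pid_t pid = fork();
  if (pid < 0) { return {false, 0, -1}; }  // fork failed, nothing was run
  if (pid == 0) {
    // Child: run the test and leave without flushing the parent's buffers.
    _exit(test_case());
  }
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) { return {false, 0, -1}; }
  if (WIFSIGNALED(status)) { return {true, WTERMSIG(status), -1}; }
  if (WIFEXITED(status)) { return {false, 0, WEXITSTATUS(status)}; }
  return {false, 0, -1};
}
```

The parent can then report a crashed case as a failure (comparing `signal` against `SIGSEGV`, for instance) and carry on with the next configuration. Whether forking is safe once an OpenCL context has been created in the parent is a separate question that this sketch does not address.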

OursDesCavernes commented on June 30, 2024

Here is the output of all failing tests in case you want to check them:
test_results.txt

CNugteren commented on June 30, 2024

Thanks for the data, I will look into it as soon as I have some time. A quick look tells me that, again, the only failures are in tests which should return an error code. So although it is not crucial, it would still be good if the error codes were returned correctly. I am also curious why this happens, since no actual OpenCL kernel should be compiled or executed in that case.

So this is on the git version of Beignet I presume, the one you used to run the tuners successfully?

CNugteren commented on June 30, 2024

I just checked it: indeed, it only crashes for tests that should return an error code.

I am still not sure what the cause of this issue is, so I added extra print statements (and std::flush) to the tests. Hopefully they will help us locate the source of the error, whether it is in the test code or in one of the tested libraries.

Could you re-run one of those failing tests after pulling in the latest changes from development? I see output like this for example for the GER tests:

./clblast_test_xger -verbose -device 1

* Options given/available:
    -platform 0 [=default]
    -device 1 
    -full_test [false]
    -verbose [true]
    -clblas 0 [=default]
    -cblas 1 [=default]

* Running on OpenCL device 'Iris Pro'.
* Starting tests for the 'SGER' routine. Legend:
   : -> Test produced correct results
   . -> Test returned the correct error code
   X -> Test produced incorrect results
   / -> Test returned an incorrect error code
   \ -> Test not executed: OpenCL-kernel compilation error
   o -> Test not executed: Unsupported precision
   . -> Test not completed: Reference CBLAS doesn't output error codes
* Testing 'regular behaviour' for '101 (row-major)':
   Testing: m=61 n=61 lda=61 incx=1 incy=1 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=1 incy=1 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=61 incx=1 incy=2 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=1 incy=2 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=61 incx=1 incy=7 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=1 incy=7 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=61 incx=2 incy=1 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=2 incy=1 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=61 incx=2 incy=2 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=2 incy=2 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=61 incx=2 incy=7 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=2 incy=7 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=61 incx=7 incy=1 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=7 incy=1 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=61 incx=7 incy=2 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=7 incy=2 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=61 incx=7 incy=7 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=7 incy=7 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=512 lda=61 incx=1 incy=1 offa=0 offx=0 offy=0 [CLBlast] -> .
   Testing: m=61 n=512 lda=512 incx=1 incy=1 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=512 lda=61 incx=1 incy=2 offa=0 offx=0 offy=0 [CLBlast] -> .
   Testing: m=61 n=512 lda=512 incx=1 incy=2 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=512 lda=61 incx=1 incy=7 offa=0 offx=0 offy=0 [CLBlast] -> .
   Testing: m=61 n=512 lda=512 incx=1 incy=7 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=512 lda=61 incx=2 incy=1 offa=0 offx=0 offy=0 [CLBlast] -> .
   Testing: m=61 n=512 lda=512 incx=2 incy=1 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=512 lda=61 incx=2 incy=2 offa=0 offx=0 offy=0 [CLBlast] -> .
   Testing: m=61 n=512 lda=512 incx=2 incy=2 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=512 lda=61 incx=2 incy=7 offa=0 offx=0 offy=0 [CLBlast] -> .
(...)

from clblast.

CNugteren avatar CNugteren commented on June 30, 2024

Is this still an issue with the newest version of Beignet?

from clblast.

OursDesCavernes avatar OursDesCavernes commented on June 30, 2024

Hi,
I just tried with the latest Beignet from Git and I no longer get any segfaults in the tuners.
I get a few errors:

[ RUN      ] Running Xger
[   FAILED ] Kernel Xger failed
[   FAILED ]   catched exception: Internal OpenCL error: -54
[  WARNING ] Results differ: L2 norm is 6.41e+06
[   FAILED ] Xger;      0 ms; WGS1 512;   WGS2 1;    WPT 4;PRECISION 3232;
[       OK ] Completed Xgemm (701 ms) - 20 out of 117
device compiler error/warning: Xgemm:(GBE): error: failed in Gen backend.

[   FAILED ] Kernel Xgemm failed
[   FAILED ]   catched exception: device compiler error/warning occurred ^^

[   FAILED ] Xgemm;      0 ms;  MWG 128;  NWG 128;   KWG 32;  MDIMC 8;  NDIMC 8; MDIMA 16; NDIMB 16;    KWI 2;    VWM 8;    VWN 4;   STRM 0;   STRN 1;     SA 1;     SB 0;PRECISION 3232;
[       OK ] Completed Xgemm (445 ms) - 35 out of 117
device compiler error/warning: Xgemm:(GBE): error: failed in Gen backend.

[   FAILED ] Kernel Xgemm failed
[   FAILED ]   catched exception: device compiler error/warning occurred ^^

[   FAILED ] Xgemm;      0 ms;  MWG 128;  NWG 128;   KWG 32;  MDIMC 8;  NDIMC 8; MDIMA 16; NDIMB 32;    KWI 2;    VWM 1;    VWN 1;   STRM 0;   STRN 1;     SA 0;     SB 0;PRECISION 3232;
[       OK ] Completed Xgemm (272 ms) - 40 out of 117
device compiler error/warning: Xgemm:(GBE): error: failed in Gen backend.

[   FAILED ] Kernel Xgemm failed
[   FAILED ]   catched exception: device compiler error/warning occurred ^^

[   FAILED ] Xgemm;      0 ms;  MWG 128;  NWG 128;   KWG 16;  MDIMC 8;  NDIMC 8; MDIMA 16; NDIMB 16;    KWI 2;    VWM 1;    VWN 8;   STRM 1;   STRN 0;     SA 1;     SB 1;PRECISION 3232;
[       OK ] Completed Xgemm (293 ms) - 76 out of 117
device compiler error/warning: Xgemm:(GBE): error: failed in Gen backend.

[   FAILED ] Kernel Xgemm failed
[   FAILED ]   catched exception: device compiler error/warning occurred ^^

[   FAILED ] Xgemm;      0 ms;  MWG 128;  NWG 128;   KWG 32;  MDIMC 8;  NDIMC 8;  MDIMA 8; NDIMB 32;    KWI 8;    VWM 4;    VWN 1;   STRM 0;   STRN 0;     SA 1;     SB 1;PRECISION 3232;
[       OK ] Completed Xgemm (327 ms) - 85 out of 117
device compiler error/warning: Xgemm:(GBE): error: failed in Gen backend.

[   FAILED ] Kernel Xgemm failed
[   FAILED ]   catched exception: device compiler error/warning occurred ^^

[   FAILED ] Xgemm;      0 ms;  MWG 128;  NWG 128;   KWG 16;  MDIMC 8;  NDIMC 8;  MDIMA 8;  NDIMB 8;    KWI 2;    VWM 8;    VWN 2;   STRM 0;   STRN 0;     SA 1;     SB 1;PRECISION 3232;

I'm not sure whether it's an issue or not. What do you think?

from clblast.

CNugteren avatar CNugteren commented on June 30, 2024

Incorrect results during tuning are automatically filtered out, so you don't have to worry. Well, as long as not all tuning results fail of course :-)
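To illustrate the filtering idea, here is a minimal sketch, not CLBlast's actual tuner code. The `TuningResult` struct and the `BestValid` function are hypothetical names made up for this example. It assumes each tuned configuration records whether it compiled and produced correct results, and it picks the fastest among the valid ones only.

```cpp
#include <string>
#include <vector>

// Hypothetical record for one tuned configuration (not a CLBlast type).
struct TuningResult {
  std::string config;  // e.g. "WGS1 512; WGS2 1; WPT 4"
  double time_ms;      // measured kernel time in milliseconds
  bool valid;          // false if compilation failed or the results differed
};

// Returns a pointer to the fastest valid configuration, or nullptr if every
// configuration failed (the one case where failures do matter).
const TuningResult* BestValid(const std::vector<TuningResult>& results) {
  const TuningResult* best = nullptr;
  for (const auto& result : results) {
    if (!result.valid) { continue; }  // failed configurations are dropped
    if (best == nullptr || result.time_ms < best->time_ms) { best = &result; }
  }
  return best;
}
```

With this shape, a configuration that hits a compiler error simply never becomes a candidate. A `nullptr` return signals that nothing usable was found.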

I also had some problems myself with Beignet; it seems it is not 100% reliable. "error: failed in Gen backend" is not the type of error you hope to see from your compiler.

What about the tests, do they work?

from clblast.

OursDesCavernes avatar OursDesCavernes commented on June 30, 2024

I'll test on the Haswell laptop ASAP.
I'm running Linux on a Broadwell laptop today:

$ clinfo 
Platform #0
  Name:                                  Intel Gen OCL Driver
  Version:                               OpenCL 1.2 beignet 1.2 (git-8bc5d28)

  Device #0
    Name:                                Intel(R) HD Graphics 5500 BroadWell U-Processor GT2
    Type:                                GPU
    Version:                             OpenCL 1.2 beignet 1.2 (git-8bc5d28)
    Global memory size:                  3 GB 888 MB 
    Local memory size:                   64 kB 
    Max work group size:                 512
    Max work item sizes:                 (512, 512, 512)

I get these results:

The following tests FAILED:
          2 - clblast_test_xscal (SEGFAULT)
         11 - clblast_test_xgemv (SEGFAULT)
         12 - clblast_test_xgbmv (OTHER_FAULT)
         13 - clblast_test_xhemv (SEGFAULT)
         16 - clblast_test_xsymv (SEGFAULT)
         17 - clblast_test_xsbmv (OTHER_FAULT)
         18 - clblast_test_xspmv (OTHER_FAULT)
         19 - clblast_test_xtrmv (SEGFAULT)
         20 - clblast_test_xtbmv (OTHER_FAULT)
         21 - clblast_test_xtpmv (OTHER_FAULT)
         22 - clblast_test_xger (SEGFAULT)
         23 - clblast_test_xgeru (SEGFAULT)
         24 - clblast_test_xgerc (SEGFAULT)
         25 - clblast_test_xher (SEGFAULT)
         27 - clblast_test_xher2 (SEGFAULT)
         29 - clblast_test_xsyr (SEGFAULT)
         31 - clblast_test_xsyr2 (SEGFAULT)
         32 - clblast_test_xspr2 (Failed)
         33 - clblast_test_xgemm (SEGFAULT)
         34 - clblast_test_xsymm (SEGFAULT)
         35 - clblast_test_xhemm (SEGFAULT)
         36 - clblast_test_xsyrk (SEGFAULT)
         37 - clblast_test_xherk (SEGFAULT)
         38 - clblast_test_xsyr2k (SEGFAULT)
         39 - clblast_test_xher2k (SEGFAULT)
         40 - clblast_test_xtrmm (SEGFAULT)
         41 - clblast_test_xomatcopy (SEGFAULT)
Errors while running CTest

There are also a lot of compiler errors in the tuners.
I'll go through the thread again. Maybe Broadwell has similar issues :(

from clblast.

CNugteren avatar CNugteren commented on June 30, 2024

Thanks for the test. Perhaps these issues are in the FP16 versions of the kernels only? What happens if you for example look at the full output of clblast_test_xscal? On my Skylake Intel GPU I also see a lot of errors with FP16 - again something that's not well implemented in Beignet. Perhaps I should disable inclusion of those tests when running make test or make alltests.

from clblast.

OursDesCavernes avatar OursDesCavernes commented on June 30, 2024

Back to Haswell (I didn't have time to run verbose tests on the Broadwell laptop)

The following tests FAILED:
         11 - clblast_test_xgemv (SEGFAULT)
         13 - clblast_test_xhemv (SEGFAULT)
         16 - clblast_test_xsymv (SEGFAULT)
         19 - clblast_test_xtrmv (SEGFAULT)
         22 - clblast_test_xger (SEGFAULT)
         23 - clblast_test_xgeru (SEGFAULT)
         24 - clblast_test_xgerc (SEGFAULT)
         25 - clblast_test_xher (SEGFAULT)
         27 - clblast_test_xher2 (SEGFAULT)
         29 - clblast_test_xsyr (SEGFAULT)
         31 - clblast_test_xsyr2 (SEGFAULT)
         33 - clblast_test_xgemm (SEGFAULT)
         34 - clblast_test_xsymm (SEGFAULT)
         35 - clblast_test_xhemm (SEGFAULT)
         36 - clblast_test_xsyrk (SEGFAULT)
         37 - clblast_test_xherk (SEGFAULT)
         38 - clblast_test_xsyr2k (SEGFAULT)
         39 - clblast_test_xher2k (SEGFAULT)
         40 - clblast_test_xtrmm (SEGFAULT)
         41 - clblast_test_xomatcopy (SEGFAULT)
Errors while running CTest

I did not rebuild anything; there are noticeably fewer errors on Haswell.
Verbose tests on the way.

from clblast.

OursDesCavernes avatar OursDesCavernes commented on June 30, 2024
$ ./clblast_test_xgemv -verbose true

* Options given/available:
    -platform 0 [=default]
    -device 0 [=default]
    -full_test [false]
    -verbose [true]
    -clblas 0 [=default]
    -cblas 1 [=default]

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'SGEMV' routine. Legend:
   : -> Test produced correct results
   . -> Test returned the correct error code
   X -> Test produced incorrect results
   / -> Test returned an incorrect error code
   \ -> Test not executed: OpenCL-kernel compilation error
   o -> Test not executed: Unsupported precision
   - -> Test not completed: Reference CBLAS doesn't output error codes
* Testing 'regular behaviour' for '101 (row-major) 111 (regular)':
   Testing: m=61 n=61 lda=61 incx=1 incy=1 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=1 incy=1 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=61 incx=1 incy=2 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=1 incy=2 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=61 incx=1 incy=7 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=1 incy=7 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=61 incx=2 incy=1 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=2 incy=1 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=61 incx=2 incy=2 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=2 incy=2 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=61 incx=2 incy=7 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=2 incy=7 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=61 incx=7 incy=1 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=7 incy=1 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=61 incx=7 incy=2 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=7 incy=2 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=61 incx=7 incy=7 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=7 incy=7 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=512 lda=61 incx=1 incy=1 offa=0 offx=0 offy=0 [CLBlast]Segmentation fault

Valgrind output at the segfault: libcl again

   Testing: m=61 n=61 lda=512 incx=7 incy=7 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=512 lda=61 incx=1 incy=1 offa=0 offx=0 offy=0 [CLBlast]==4621== Invalid read of size 8
==4621==    at 0x6C79BC5: clWaitForEvents (in /usr/lib64/beignet/libcl.so)
==4621==    by 0x435DF8: clblast::TestXgemv<float>::RunRoutine(clblast::Arguments<float> const&, clblast::Buffers<float>&, clblast::Queue&) (in /home/thomas/src/CLBlast/build/clblast_test_xgemv)
==4621==    by 0x48AEDA: clblast::TestBlas<float, float>::TestRegular(std::vector<clblast::Arguments<float>, std::allocator<clblast::Arguments<float> > >&, std::string const&) (in /home/thomas/src/CLBlast/build/clblast_test_xgemv)
==4621==    by 0x43E9D8: unsigned long clblast::RunTests<clblast::TestXgemv<float>, float, float>(int, char**, bool, std::string const&) (in /home/thomas/src/CLBlast/build/clblast_test_xgemv)
==4621==    by 0x4232B1: main (in /home/thomas/src/CLBlast/build/clblast_test_xgemv)
==4621==  Address 0x18 is not stack'd, malloc'd or (recently) free'd
==4621== 
==4621== 
==4621== Process terminating with default action of signal 11 (SIGSEGV)
==4621==  Access not within mapped region at address 0x18
==4621==    at 0x6C79BC5: clWaitForEvents (in /usr/lib64/beignet/libcl.so)
==4621==    by 0x435DF8: clblast::TestXgemv<float>::RunRoutine(clblast::Arguments<float> const&, clblast::Buffers<float>&, clblast::Queue&) (in /home/thomas/src/CLBlast/build/clblast_test_xgemv)
==4621==    by 0x48AEDA: clblast::TestBlas<float, float>::TestRegular(std::vector<clblast::Arguments<float>, std::allocator<clblast::Arguments<float> > >&, std::string const&) (in /home/thomas/src/CLBlast/build/clblast_test_xgemv)
==4621==    by 0x43E9D8: unsigned long clblast::RunTests<clblast::TestXgemv<float>, float, float>(int, char**, bool, std::string const&) (in /home/thomas/src/CLBlast/build/clblast_test_xgemv)
==4621==    by 0x4232B1: main (in /home/thomas/src/CLBlast/build/clblast_test_xgemv)
==4621==  If you believe this happened as a result of a stack
==4621==  overflow in your program's main thread (unlikely but
==4621==  possible), you can try to increase the size of the
==4621==  main thread stack using the --main-stacksize= flag.
==4621==  The main thread stack size used in this run was 8388608.
==4621== 
==4621== HEAP SUMMARY:
==4621==     in use at exit: 14,574,819 bytes in 96,493 blocks
==4621==   total heap usage: 855,203 allocs, 758,710 frees, 98,981,746 bytes allocated
==4621== 
==4621== LEAK SUMMARY:
==4621==    definitely lost: 2 bytes in 2 blocks
==4621==    indirectly lost: 0 bytes in 0 blocks
==4621==      possibly lost: 471,500 bytes in 10,317 blocks
==4621==    still reachable: 14,103,317 bytes in 86,174 blocks
==4621==         suppressed: 0 bytes in 0 blocks
==4621== Rerun with --leak-check=full to see details of leaked memory
==4621== 
==4621== For counts of detected and suppressed errors, rerun with: -v
==4621== Use --track-origins=yes to see where uninitialised values come from
==4621== ERROR SUMMARY: 254323 errors from 181 contexts (suppressed: 0 from 0)
Segmentation fault

from clblast.

CNugteren avatar CNugteren commented on June 30, 2024

Could you perhaps try again with the latest Beignet and CLBlast? CLBlast now has the tuning parameters for your devices included, perhaps that changes something. If not, please post the latest output again and I'll re-investigate what could be the cause. Thanks!

from clblast.

OursDesCavernes avatar OursDesCavernes commented on June 30, 2024

With Beignet 2c1f246 (current HEAD) and CLBlast b1929d8 (current dev HEAD): identical results.
It still crashes in libcl:

   Testing: m=61 n=61 lda=512 incx=7 incy=2 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=61 incx=7 incy=7 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=61 lda=512 incx=7 incy=7 offa=0 offx=0 offy=0 [CLBlast] [CPU BLAS] -> :
   Testing: m=61 n=512 lda=61 incx=1 incy=1 offa=0 offx=0 offy=0 [CLBlast]==30716== Invalid read of size 8
==30716==    at 0x6C802C5: clWaitForEvents (in /usr/lib64/beignet/libcl.so)
==30716==    by 0x435EA8: clblast::TestXgemv<float>::RunRoutine(clblast::Arguments<float> const&, clblast::Buffers<float>&, clblast::Queue&) (in /home/thomas/src/CLBlast/build/clblast_test_xgemv)
==30716==    by 0x48B11A: clblast::TestBlas<float, float>::TestRegular(std::vector<clblast::Arguments<float>, std::allocator<clblast::Arguments<float> > >&, std::string const&) (in /home/thomas/src/CLBlast/build/clblast_test_xgemv)
==30716==    by 0x43EA88: unsigned long clblast::RunTests<clblast::TestXgemv<float>, float, float>(int, char**, bool, std::string const&) (in /home/thomas/src/CLBlast/build/clblast_test_xgemv)
==30716==    by 0x423361: main (in /home/thomas/src/CLBlast/build/clblast_test_xgemv)
==30716==  Address 0x78 is not stack'd, malloc'd or (recently) free'd
==30716== 
==30716== 
==30716== Process terminating with default action of signal 11 (SIGSEGV)
==30716==  Access not within mapped region at address 0x78

A pointer with a value of 0x78 is obviously wrong.
Maybe the call auto status = Gemv(args.layout, ... , &event); in test/routines/level2/xgemv.hpp leaves event set to a wrong value.
Then clWaitForEvents(1, &event); would fail.

from clblast.

CNugteren avatar CNugteren commented on June 30, 2024

Indeed, you are right. Your Valgrind trace helped me locate the issue: it does crash in clWaitForEvents with an invalid event. Earlier on I had already observed that your crash happens only in particular cases:

indeed, it only crashes for tests that should return an error code.

Taking both observations together: when a CLBlast routine doesn't finish correctly (i.e. it doesn't return StatusCode::kSuccess), its event is never allocated, so waiting for it is invalid. I have now guarded all the clWaitForEvents statements in the tests and samples against this. I've also added clReleaseEvent to fix the memory leak.
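The guard described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the actual commit. `FakeEvent`, `FakeWait`, `FakeRelease`, `RunRoutine` and `WaitAndRelease` are hypothetical stand-ins so that the example compiles without OpenCL headers. In the real tests their roles are played by cl_event, clWaitForEvents, clReleaseEvent and the CLBlast routine call.

```cpp
// Stand-ins for cl_event, clWaitForEvents and clReleaseEvent (hypothetical).
struct FakeEvent {};
enum class Status { kSuccess, kInvalidBuffer };

int g_waits = 0;     // counts calls to the stand-in wait
int g_releases = 0;  // counts calls to the stand-in release

void FakeWait(FakeEvent*) { ++g_waits; }
void FakeRelease(FakeEvent* event) { delete event; ++g_releases; }

// Hypothetical routine: it creates an event only when it succeeds, mirroring
// the behaviour described above. On failure *event is left untouched.
Status RunRoutine(bool valid_arguments, FakeEvent** event) {
  if (!valid_arguments) { return Status::kInvalidBuffer; }
  *event = new FakeEvent();
  return Status::kSuccess;
}

// Waits for and releases the event only if the routine reported success.
// Returns true if the event was actually used.
bool WaitAndRelease(Status status, FakeEvent* event) {
  if (status != Status::kSuccess) { return false; }  // no event was created
  FakeWait(event);     // real code: clWaitForEvents(1, &event);
  FakeRelease(event);  // real code: clReleaseEvent(event);
  return true;
}
```

The point of the design is that the status code, not the event handle, decides whether the event may be touched. An event left unset by a failed call is never passed to the wait or release functions.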

This is fixed in the development branch in commit d595a8e. Can you test it again?

from clblast.

OursDesCavernes avatar OursDesCavernes commented on June 30, 2024

First clblast_test_xsyrk:

$ ./clblast_test_xsyrk

* Options given/available:
    -platform 0 [=default]
    -device 0 [=default]
    -full_test [false]
    -verbose [false]
    -clblas 0 [=default]
    -cblas 1 [=default]

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'SSYRK' routine. Legend:
   : -> Test produced correct results
   . -> Test returned the correct error code
   X -> Test produced incorrect results
   / -> Test returned an incorrect error code
   \ -> Test not executed: OpenCL-kernel compilation error
   o -> Test not executed: Unsupported precision
   - -> Test not completed: Reference CBLAS doesn't output error codes
* Testing 'regular behaviour' for '101 (row-major) 121 (upper) 111 (regular)':
   ::::--::-X-X---X
   Error rate 12.9%: n=64 k=7 lda=7 ldc=64 offa=0 offc=0 
   Error rate 12.9%: n=64 k=7 lda=64 ldc=64 offa=0 offc=0 
   Error rate 12.9%: n=64 k=64 lda=64 ldc=64 offa=0 offc=0 
   Pass rate  37.5%: 6 passed / 7 skipped / 3 failed
* Testing 'regular behaviour' for '101 (row-major) 122 (lower) 111 (regular)':
   ::::--::-:-:---:
   Pass rate  56.2%: 9 passed / 7 skipped / 0 failed
* Testing 'regular behaviour' for '101 (row-major) 121 (upper) 112 (transposed)':
   ::::::::---X---X
   Error rate 12.9%: n=64 k=7 lda=64 ldc=64 offa=0 offc=0 
   Error rate 12.9%: n=64 k=64 lda=64 ldc=64 offa=0 offc=0 
   Pass rate  50.0%: 8 passed / 6 skipped / 2 failed
* Testing 'regular behaviour' for '101 (row-major) 122 (lower) 112 (transposed)':
   ::::::::---:---:
   Pass rate  62.5%: 10 passed / 6 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 121 (upper) 111 (regular)':
   ::::::::---X---X
   Error rate 12.9%: n=64 k=7 lda=64 ldc=64 offa=0 offc=0 
   Error rate 12.9%: n=64 k=64 lda=64 ldc=64 offa=0 offc=0 
   Pass rate  50.0%: 8 passed / 6 skipped / 2 failed
* Testing 'regular behaviour' for '102 (col-major) 122 (lower) 111 (regular)':
   ::::::::---:---:
   Pass rate  62.5%: 10 passed / 6 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 121 (upper) 112 (transposed)':
   ::::--::-X-X---X
   Error rate 12.8%: n=64 k=7 lda=7 ldc=64 offa=0 offc=0 
   Error rate 12.9%: n=64 k=7 lda=64 ldc=64 offa=0 offc=0 
   Error rate 12.9%: n=64 k=64 lda=64 ldc=64 offa=0 offc=0 
   Pass rate  37.5%: 6 passed / 7 skipped / 3 failed
* Testing 'regular behaviour' for '102 (col-major) 122 (lower) 112 (transposed)':
   ::::--::-:-:---:
   Pass rate  56.2%: 9 passed / 7 skipped / 0 failed
* Completed all test-cases for this routine. Results:
   66 test(s) passed
   52 test(s) skipped
   10 test(s) failed

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'DSYRK' routine.
* All tests skipped: Unsupported precision

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'CSYRK' routine. Legend:
   : -> Test produced correct results
   . -> Test returned the correct error code
   X -> Test produced incorrect results
   / -> Test returned an incorrect error code
   \ -> Test not executed: OpenCL-kernel compilation error
   o -> Test not executed: Unsupported precision
   - -> Test not completed: Reference CBLAS doesn't output error codes
* Testing 'regular behaviour' for '101 (row-major) 121 (upper) 111 (regular)':
   ::::--::-:-:---:
   Pass rate  56.2%: 9 passed / 7 skipped / 0 failed
* Testing 'regular behaviour' for '101 (row-major) 122 (lower) 111 (regular)':
   ::::--::-:-:---:
   Pass rate  56.2%: 9 passed / 7 skipped / 0 failed
* Testing 'regular behaviour' for '101 (row-major) 121 (upper) 112 (transposed)':
   ::::::::---:---:
   Pass rate  62.5%: 10 passed / 6 skipped / 0 failed
* Testing 'regular behaviour' for '101 (row-major) 122 (lower) 112 (transposed)':
   ::::::::---:---:
   Pass rate  62.5%: 10 passed / 6 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 121 (upper) 111 (regular)':
   ::::::::---:---:
   Pass rate  62.5%: 10 passed / 6 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 122 (lower) 111 (regular)':
   ::::::::---:---:
   Pass rate  62.5%: 10 passed / 6 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 121 (upper) 112 (transposed)':
   ::::--::-:-:---:
   Pass rate  56.2%: 9 passed / 7 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 122 (lower) 112 (transposed)':
   ::::--::-:-:---:
   Pass rate  56.2%: 9 passed / 7 skipped / 0 failed
* Completed all test-cases for this routine. Results:
   76 test(s) passed
   52 test(s) skipped
   0 test(s) failed

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'ZSYRK' routine.
* All tests skipped: Unsupported precision

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'HSYRK' routine.
* All tests skipped: Unsupported precision

I also tried with clBLAS as the reference instead of CBLAS, with slightly different results:

$ ./clblast_test_xsyrk -clblas 1 -cblas 0

* Options given/available:
    -platform 0 [=default]
    -device 0 [=default]
    -full_test [false]
    -verbose [false]
    -clblas 1 
    -cblas 0 

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'SSYRK' routine. Legend:
   : -> Test produced correct results
   . -> Test returned the correct error code
   X -> Test produced incorrect results
   / -> Test returned an incorrect error code
   \ -> Test not executed: OpenCL-kernel compilation error
   o -> Test not executed: Unsupported precision
   - -> Test not completed: Reference CBLAS doesn't output error codes
* Testing 'regular behaviour' for '101 (row-major) 121 (upper) 111 (regular)':
   XXXX..::.X.X...X
   Error rate 57.1%: n=7 k=7 lda=7 ldc=7 offa=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=7 ldc=64 offa=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=64 ldc=7 offa=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=64 ldc=64 offa=0 offc=0 
   Error rate 12.9%: n=64 k=7 lda=7 ldc=64 offa=0 offc=0 
   Error rate 12.9%: n=64 k=7 lda=64 ldc=64 offa=0 offc=0 
   Error rate 12.9%: n=64 k=64 lda=64 ldc=64 offa=0 offc=0 
   Pass rate  56.2%: 9 passed / 0 skipped / 7 failed
* Testing 'invalid buffer sizes' for '101 (row-major) 121 (upper) 111 (regular)':
   .........
   Pass rate 100.0%: 9 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '101 (row-major) 122 (lower) 111 (regular)':
   ::::..::.:.:...:
   Pass rate 100.0%: 16 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '101 (row-major) 122 (lower) 111 (regular)':
   .........
   Pass rate 100.0%: 9 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '101 (row-major) 121 (upper) 112 (transposed)':
   ::::::::...X...X
   Error rate 12.9%: n=64 k=7 lda=64 ldc=64 offa=0 offc=0 
   Error rate 12.9%: n=64 k=64 lda=64 ldc=64 offa=0 offc=0 
   Pass rate  87.5%: 14 passed / 0 skipped / 2 failed
* Testing 'invalid buffer sizes' for '101 (row-major) 121 (upper) 112 (transposed)':
   .........
   Pass rate 100.0%: 9 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '101 (row-major) 122 (lower) 112 (transposed)':
   ::::::::...:...:
   Pass rate 100.0%: 16 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '101 (row-major) 122 (lower) 112 (transposed)':
   .........
   Pass rate 100.0%: 9 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 121 (upper) 111 (regular)':
   ::::::::...X...X
   Error rate 12.9%: n=64 k=7 lda=64 ldc=64 offa=0 offc=0 
   Error rate 12.9%: n=64 k=64 lda=64 ldc=64 offa=0 offc=0 
   Pass rate  87.5%: 14 passed / 0 skipped / 2 failed
* Testing 'invalid buffer sizes' for '102 (col-major) 121 (upper) 111 (regular)':
   .........
   Pass rate 100.0%: 9 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 122 (lower) 111 (regular)':
   ::::::::...:...:
   Pass rate 100.0%: 16 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '102 (col-major) 122 (lower) 111 (regular)':
   .........
   Pass rate 100.0%: 9 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 121 (upper) 112 (transposed)':
   XXXX..::.X.X...X
   Error rate 57.1%: n=7 k=7 lda=7 ldc=7 offa=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=7 ldc=64 offa=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=64 ldc=7 offa=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=64 ldc=64 offa=0 offc=0 
   Error rate 12.8%: n=64 k=7 lda=7 ldc=64 offa=0 offc=0 
   Error rate 12.9%: n=64 k=7 lda=64 ldc=64 offa=0 offc=0 
   Error rate 12.9%: n=64 k=64 lda=64 ldc=64 offa=0 offc=0 
   Pass rate  56.2%: 9 passed / 0 skipped / 7 failed
* Testing 'invalid buffer sizes' for '102 (col-major) 121 (upper) 112 (transposed)':
   .........
   Pass rate 100.0%: 9 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 122 (lower) 112 (transposed)':
   ::::..::.:.:...:
   Pass rate 100.0%: 16 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '102 (col-major) 122 (lower) 112 (transposed)':
   .........
   Pass rate 100.0%: 9 passed / 0 skipped / 0 failed
* Completed all test-cases for this routine. Results:
   182 test(s) passed
   0 test(s) skipped
   18 test(s) failed

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'DSYRK' routine.
* All tests skipped: Unsupported precision

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'CSYRK' routine. Legend:
   : -> Test produced correct results
   . -> Test returned the correct error code
   X -> Test produced incorrect results
   / -> Test returned an incorrect error code
   \ -> Test not executed: OpenCL-kernel compilation error
   o -> Test not executed: Unsupported precision
   - -> Test not completed: Reference CBLAS doesn't output error codes
* Testing 'regular behaviour' for '101 (row-major) 121 (upper) 111 (regular)':
   XXXX..XX.:.:...:
   Error rate 30.6%: n=7 k=7 lda=7 ldc=7 offa=0 offc=0 
   Error rate 36.7%: n=7 k=7 lda=7 ldc=64 offa=0 offc=0 
   Error rate 36.7%: n=7 k=7 lda=64 ldc=7 offa=0 offc=0 
   Error rate 18.4%: n=7 k=7 lda=64 ldc=64 offa=0 offc=0 
   Error rate 12.2%: n=7 k=64 lda=64 ldc=7 offa=0 offc=0 
   Error rate 14.3%: n=7 k=64 lda=64 ldc=64 offa=0 offc=0 
   Pass rate  62.5%: 10 passed / 0 skipped / 6 failed
* Testing 'invalid buffer sizes' for '101 (row-major) 121 (upper) 111 (regular)':
   .........
   Pass rate 100.0%: 9 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '101 (row-major) 122 (lower) 111 (regular)':
   ::::..::.:.:...:
   Pass rate 100.0%: 16 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '101 (row-major) 122 (lower) 111 (regular)':
   .........
   Pass rate 100.0%: 9 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '101 (row-major) 121 (upper) 112 (transposed)':
   ::::::::...:...:
   Pass rate 100.0%: 16 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '101 (row-major) 121 (upper) 112 (transposed)':
   .........
   Pass rate 100.0%: 9 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '101 (row-major) 122 (lower) 112 (transposed)':
   ::::::::...:...:
   Pass rate 100.0%: 16 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '101 (row-major) 122 (lower) 112 (transposed)':
   .........
   Pass rate 100.0%: 9 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 121 (upper) 111 (regular)':
   ::::::::...:...:
   Pass rate 100.0%: 16 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '102 (col-major) 121 (upper) 111 (regular)':
   .........
   Pass rate 100.0%: 9 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 122 (lower) 111 (regular)':
   ::::::::...:...:
   Pass rate 100.0%: 16 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '102 (col-major) 122 (lower) 111 (regular)':
   .........
   Pass rate 100.0%: 9 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 121 (upper) 112 (transposed)':
   XXXX..XX.:.:...:
   Error rate 34.7%: n=7 k=7 lda=7 ldc=7 offa=0 offc=0 
   Error rate 20.4%: n=7 k=7 lda=7 ldc=64 offa=0 offc=0 
   Error rate 36.7%: n=7 k=7 lda=64 ldc=7 offa=0 offc=0 
   Error rate 36.7%: n=7 k=7 lda=64 ldc=64 offa=0 offc=0 
   Error rate 36.7%: n=7 k=64 lda=64 ldc=7 offa=0 offc=0 
   Error rate 32.7%: n=7 k=64 lda=64 ldc=64 offa=0 offc=0 
   Pass rate  62.5%: 10 passed / 0 skipped / 6 failed
* Testing 'invalid buffer sizes' for '102 (col-major) 121 (upper) 112 (transposed)':
   .........
   Pass rate 100.0%: 9 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 122 (lower) 112 (transposed)':
   ::::..::.:.:...:
   Pass rate 100.0%: 16 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '102 (col-major) 122 (lower) 112 (transposed)':
   .........
   Pass rate 100.0%: 9 passed / 0 skipped / 0 failed
* Completed all test-cases for this routine. Results:
   188 test(s) passed
   0 test(s) skipped
   12 test(s) failed

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'ZSYRK' routine.
* All tests skipped: Unsupported precision

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'HSYRK' routine.
* All tests skipped: Unsupported precision

I think that's a tricky one.
The two others (clblast_test_xsyr2k and clblast_test_xher2k) are memory issues (free error and segfault); I'll try to locate them.

from clblast.

OursDesCavernes avatar OursDesCavernes commented on June 30, 2024

clblast_test_xher2k crashes in CBLAS:

   Testing: n=7 k=7 lda=64 ldb=7 ldc=7 offa=0 offb=0 offc=0 [CLBlast] [CPU BLAS]==21485== Invalid write of size 4
==21485==    at 0x6C7C15F: cblas_cher2k (in /usr/lib64/libgslcblas.so.0.0.0)
==21485==    by 0x429BBD: clblast::cblasXher2k(CBLAS_ORDER, CBLAS_UPLO, CBLAS_TRANSPOSE, unsigned long, unsigned long, std::complex<float>, std::vector<std::complex<float>, std::allocator<std::complex<float> > > const&, unsigned long, unsigned long, std::vector<std::complex<float>, std::allocator<std::complex<float> > > const&, unsigned long, unsigned long, float, std::vector<std::complex<float>, std::allocator<std::complex<float> > >&, unsigned long, unsigned long) (in /home/thomas/src/CLBlast/build/clblast_test_xher2k)
==21485==    by 0x43314C: clblast::TestXher2k<std::complex<float>, float>::RunReference2(clblast::Arguments<float> const&, clblast::Buffers<std::complex<float> >&, clblast::Queue&) (in /home/thomas/src/CLBlast/build/clblast_test_xher2k)
==21485==    by 0x47DB76: clblast::TestBlas<std::complex<float>, float>::TestRegular(std::vector<clblast::Arguments<float>, std::allocator<clblast::Arguments<float> > >&, std::string const&) (in /home/thomas/src/CLBlast/build/clblast_test_xher2k)
==21485==    by 0x437E38: unsigned long clblast::RunTests<clblast::TestXher2k<std::complex<float>, float>, std::complex<float>, float>(int, char**, bool, std::string const&) (in /home/thomas/src/CLBlast/build/clblast_test_xher2k)
==21485==    by 0x4206A1: main (in /home/thomas/src/CLBlast/build/clblast_test_xher2k)
==21485==  Address 0x98aad7c is 28 bytes after a block of size 32 in arena "client"
==21485== 
==21485== Invalid read of size 4
==21485==    at 0x6C7C191: cblas_cher2k (in /usr/lib64/libgslcblas.so.0.0.0)
==21485==    by 0x429BBD: clblast::cblasXher2k(CBLAS_ORDER, CBLAS_UPLO, CBLAS_TRANSPOSE, unsigned long, unsigned long, std::complex<float>, std::vector<std::complex<float>, std::allocator<std::complex<float> > > const&, unsigned long, unsigned long, std::vector<std::complex<float>, std::allocator<std::complex<float> > > const&, unsigned long, unsigned long, float, std::vector<std::complex<float>, std::allocator<std::complex<float> > >&, unsigned long, unsigned long) (in /home/thomas/src/CLBlast/build/clblast_test_xher2k)
==21485==    by 0x43314C: clblast::TestXher2k<std::complex<float>, float>::RunReference2(clblast::Arguments<float> const&, clblast::Buffers<std::complex<float> >&, clblast::Queue&) (in /home/thomas/src/CLBlast/build/clblast_test_xher2k)
==21485==    by 0x47DB76: clblast::TestBlas<std::complex<float>, float>::TestRegular(std::vector<clblast::Arguments<float>, std::allocator<clblast::Arguments<float> > >&, std::string const&) (in /home/thomas/src/CLBlast/build/clblast_test_xher2k)
==21485==    by 0x437E38: unsigned long clblast::RunTests<clblast::TestXher2k<std::complex<float>, float>, std::complex<float>, float>(int, char**, bool, std::string const&) (in /home/thomas/src/CLBlast/build/clblast_test_xher2k)
==21485==    by 0x4206A1: main (in /home/thomas/src/CLBlast/build/clblast_test_xher2k)
==21485==  Address 0x98aad78 is 24 bytes after a block of size 32 in arena "client"
==21485== 
==21485== Invalid write of size 4
==21485==    at 0x6C7C196: cblas_cher2k (in /usr/lib64/libgslcblas.so.0.0.0)
==21485==    by 0x429BBD: clblast::cblasXher2k(CBLAS_ORDER, CBLAS_UPLO, CBLAS_TRANSPOSE, unsigned long, unsigned long, std::complex<float>, std::vector<std::complex<float>, std::allocator<std::complex<float> > > const&, unsigned long, unsigned long, std::vector<std::complex<float>, std::allocator<std::complex<float> > > const&, unsigned long, unsigned long, float, std::vector<std::complex<float>, std::allocator<std::complex<float> > >&, unsigned long, unsigned long) (in /home/thomas/src/CLBlast/build/clblast_test_xher2k)
==21485==    by 0x43314C: clblast::TestXher2k<std::complex<float>, float>::RunReference2(clblast::Arguments<float> const&, clblast::Buffers<std::complex<float> >&, clblast::Queue&) (in /home/thomas/src/CLBlast/build/clblast_test_xher2k)
==21485==    by 0x47DB76: clblast::TestBlas<std::complex<float>, float>::TestRegular(std::vector<clblast::Arguments<float>, std::allocator<clblast::Arguments<float> > >&, std::string const&) (in /home/thomas/src/CLBlast/build/clblast_test_xher2k)
==21485==    by 0x437E38: unsigned long clblast::RunTests<clblast::TestXher2k<std::complex<float>, float>, std::complex<float>, float>(int, char**, bool, std::string const&) (in /home/thomas/src/CLBlast/build/clblast_test_xher2k)
==21485==    by 0x4206A1: main (in /home/thomas/src/CLBlast/build/clblast_test_xher2k)
==21485==  Address 0x98aad78 is 24 bytes after a block of size 32 in arena "client"
==21485== 

valgrind: m_mallocfree.c:304 (get_bszB_as_is): Assertion 'bszB_lo == bszB_hi' failed.
valgrind: Heap block lo/hi size mismatch: lo = 96, hi = 1106500021.
This is probably caused by your program erroneously writing past the
end of a heap block and corrupting heap metadata.  If you fix any
invalid writes reported by Memcheck, this assertion failure will
probably go away.  Please try that before reporting this as a bug.

clblast_test_xher2k -clblas 1 -cblas 0 passes without errors.

clblast_test_xsyr2k -clblas 1 -cblas 0 fails 20 tests without crashing.
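The valgrind trace above reports `cblas_cher2k` writing past the end of a heap block. One way to narrow down a report like this is to check, before the reference call, that every host buffer is large enough for the tested dimensions and leading dimension. The sketch below is only an illustration: `MinMatrixElements` and `FitsInBuffer` are hypothetical names, not CLBlast or CBLAS functions, and the formula assumes row-major storage without transposition. Whether an undersized buffer is the actual cause here is not established in this thread.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of CLBlast): the minimum number of elements a
// buffer must hold to store a rows x cols row-major matrix with leading
// dimension ld, starting at element `offset`. The last row only needs `cols`
// elements, so the trailing padding of that row is not required.
std::size_t MinMatrixElements(std::size_t rows, std::size_t cols,
                              std::size_t ld, std::size_t offset) {
  assert(ld >= cols);  // the leading dimension must cover one full row
  if (rows == 0 || cols == 0) { return offset; }
  return offset + (rows - 1) * ld + cols;
}

// Returns true when a buffer holding `available` elements is large enough.
bool FitsInBuffer(std::size_t available, std::size_t rows, std::size_t cols,
                  std::size_t ld, std::size_t offset) {
  return available >= MinMatrixElements(rows, cols, ld, offset);
}
```

For the failing case `n=7 k=7 lda=64`, the A matrix would need at least `6 * 64 + 7 = 391` elements under these assumptions, compared with 49 when `lda=7`.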


CNugteren commented on June 30, 2024

Good to see that most tests now pass. I'll look into the few failure cases in a couple of days. Thanks for the feedback and data!


CNugteren commented on June 30, 2024

I have just fixed a bug in the SYRK/SYR2K/HERK/HER2K routines that would occur with specific tuning parameters. Could you perhaps try the tests again and see if those are now successful?


OursDesCavernes commented on June 30, 2024

clblast_test_xsyrk now works!

95% tests passed, 2 tests failed out of 41

Total Test time (real) =  51.54 sec

The following tests FAILED:
         38 - clblast_test_xsyr2k (SEGFAULT)
         39 - clblast_test_xher2k (Failed)

./clblast_test_xher2k -clblas 1 -cblas 0 passes.
./clblast_test_xsyr2k -clblas 1 -cblas 0 fails

There is something really wrong with the CBLAS calls:
./clblast_test_xher2k -clblas 0 -cblas 1 has a bad memory crash (libc memory corruption):

valgrind: m_mallocfree.c:304 (get_bszB_as_is): Assertion 'bszB_lo == bszB_hi' failed.
valgrind: Heap block lo/hi size mismatch: lo = 128, hi = 13972549651098155320.
This is probably caused by your program erroneously writing past the
end of a heap block and corrupting heap metadata.  If you fix any
invalid writes reported by Memcheck, this assertion failure will
probably go away.  Please try that before reporting this as a bug.

./clblast_test_xsyr2k -clblas 0 -cblas 1 has test failures and then segfaults:

X:X::X:--24316-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting
--24316-- si_code=80;  Faulting address: 0x0;  sp: 0x807008dd0

valgrind: the 'impossible' happened:
   Killed by fatal signal


OursDesCavernes commented on June 30, 2024

Details for xsyr2k:

$ ./clblast_test_xsyr2k -clblas 1 -cblas 0

* Options given/available:
    -platform 0 [=default]
    -device 0 [=default]
    -full_test [false]
    -verbose [false]
    -clblas 1 
    -cblas 0 

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'SSYR2K' routine. Legend:
   : -> Test produced correct results
   . -> Test returned the correct error code
   X -> Test produced incorrect results
   / -> Test returned an incorrect error code
   \ -> Test not executed: OpenCL-kernel compilation error
   o -> Test not executed: Unsupported precision
   - -> Test not completed: Reference CBLAS doesn't output error codes
* Testing 'regular behaviour' for '101 (row-major) 121 (upper) 111 (regular)':
   XXXXXXXX......X:.:.:.:.:.......:
   Error rate 57.1%: n=7 k=7 lda=7 ldb=7 ldc=7 offa=0 offb=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=7 ldb=7 ldc=64 offa=0 offb=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=7 ldb=64 ldc=7 offa=0 offb=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=7 ldb=64 ldc=64 offa=0 offb=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=64 ldb=7 ldc=7 offa=0 offb=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=64 ldb=7 ldc=64 offa=0 offb=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=64 ldb=64 ldc=7 offa=0 offb=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=64 ldb=64 ldc=64 offa=0 offb=0 offc=0 
   Error rate 24.5%: n=7 k=64 lda=64 ldb=64 ldc=7 offa=0 offb=0 offc=0 
   Pass rate  71.9%: 23 passed / 0 skipped / 9 failed
* Testing 'invalid buffer sizes' for '101 (row-major) 121 (upper) 111 (regular)':
   ...........................
   Pass rate 100.0%: 27 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '101 (row-major) 122 (lower) 111 (regular)':
   ::::::::......::.:.:.:.:.......:
   Pass rate 100.0%: 32 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '101 (row-major) 122 (lower) 111 (regular)':
   ...........................
   Pass rate 100.0%: 27 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '101 (row-major) 121 (upper) 112 (transposed)':
   ::::::::::::::::.......:.......:
   Pass rate 100.0%: 32 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '101 (row-major) 121 (upper) 112 (transposed)':
   ...........................
   Pass rate 100.0%: 27 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '101 (row-major) 122 (lower) 112 (transposed)':
   ::::::::::::::::.......:.......:
   Pass rate 100.0%: 32 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '101 (row-major) 122 (lower) 112 (transposed)':
   ...........................
   Pass rate 100.0%: 27 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 121 (upper) 111 (regular)':
   ::::::::::::::::.......:.......:
   Pass rate 100.0%: 32 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '102 (col-major) 121 (upper) 111 (regular)':
   ...........................
   Pass rate 100.0%: 27 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 122 (lower) 111 (regular)':
   ::::::::::::::::.......:.......:
   Pass rate 100.0%: 32 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '102 (col-major) 122 (lower) 111 (regular)':
   ...........................
   Pass rate 100.0%: 27 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 121 (upper) 112 (transposed)':
   XXXXXXXX......X:.:.:.:.:.......:
   Error rate 57.1%: n=7 k=7 lda=7 ldb=7 ldc=7 offa=0 offb=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=7 ldb=7 ldc=64 offa=0 offb=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=7 ldb=64 ldc=7 offa=0 offb=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=7 ldb=64 ldc=64 offa=0 offb=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=64 ldb=7 ldc=7 offa=0 offb=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=64 ldb=7 ldc=64 offa=0 offb=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=64 ldb=64 ldc=7 offa=0 offb=0 offc=0 
   Error rate 57.1%: n=7 k=7 lda=64 ldb=64 ldc=64 offa=0 offb=0 offc=0 
   Error rate 24.5%: n=7 k=64 lda=64 ldb=64 ldc=7 offa=0 offb=0 offc=0 
   Pass rate  71.9%: 23 passed / 0 skipped / 9 failed
* Testing 'invalid buffer sizes' for '102 (col-major) 121 (upper) 112 (transposed)':
   ...........................
   Pass rate 100.0%: 27 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 122 (lower) 112 (transposed)':
   ::::::::......::.:.:.:.:.......:
   Pass rate 100.0%: 32 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '102 (col-major) 122 (lower) 112 (transposed)':
   ...........................
   Pass rate 100.0%: 27 passed / 0 skipped / 0 failed
* Completed all test-cases for this routine. Results:
   454 test(s) passed
   0 test(s) skipped
   18 test(s) failed

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'DSYR2K' routine.
* All tests skipped: Unsupported precision

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'CSYR2K' routine. Legend:
   : -> Test produced correct results
   . -> Test returned the correct error code
   X -> Test produced incorrect results
   / -> Test returned an incorrect error code
   \ -> Test not executed: OpenCL-kernel compilation error
   o -> Test not executed: Unsupported precision
   - -> Test not completed: Reference CBLAS doesn't output error codes
* Testing 'regular behaviour' for '101 (row-major) 121 (upper) 111 (regular)':
   XXXXXXXX......XX.:.:.:.:.......:
   Error rate 36.7%: n=7 k=7 lda=7 ldb=7 ldc=7 offa=0 offb=0 offc=0 
   Error rate 36.7%: n=7 k=7 lda=7 ldb=7 ldc=64 offa=0 offb=0 offc=0 
   Error rate 36.7%: n=7 k=7 lda=7 ldb=64 ldc=7 offa=0 offb=0 offc=0 
   Error rate 36.7%: n=7 k=7 lda=7 ldb=64 ldc=64 offa=0 offb=0 offc=0 
   Error rate 36.7%: n=7 k=7 lda=64 ldb=7 ldc=7 offa=0 offb=0 offc=0 
   Error rate 36.7%: n=7 k=7 lda=64 ldb=7 ldc=64 offa=0 offb=0 offc=0 
   Error rate 36.7%: n=7 k=7 lda=64 ldb=64 ldc=7 offa=0 offb=0 offc=0 
   Error rate 34.7%: n=7 k=7 lda=64 ldb=64 ldc=64 offa=0 offb=0 offc=0 
   Error rate 36.7%: n=7 k=64 lda=64 ldb=64 ldc=7 offa=0 offb=0 offc=0 
   Error rate 36.7%: n=7 k=64 lda=64 ldb=64 ldc=64 offa=0 offb=0 offc=0 
   Pass rate  68.8%: 22 passed / 0 skipped / 10 failed
* Testing 'invalid buffer sizes' for '101 (row-major) 121 (upper) 111 (regular)':
   ...........................
   Pass rate 100.0%: 27 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '101 (row-major) 122 (lower) 111 (regular)':
   ::::::::......::.:.:.:.:.......:
   Pass rate 100.0%: 32 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '101 (row-major) 122 (lower) 111 (regular)':
   ...........................
   Pass rate 100.0%: 27 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '101 (row-major) 121 (upper) 112 (transposed)':
   ::::::::::::::::.......:.......:
   Pass rate 100.0%: 32 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '101 (row-major) 121 (upper) 112 (transposed)':
   ...........................
   Pass rate 100.0%: 27 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '101 (row-major) 122 (lower) 112 (transposed)':
   ::::::::::::::::.......:.......:
   Pass rate 100.0%: 32 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '101 (row-major) 122 (lower) 112 (transposed)':
   ...........................
   Pass rate 100.0%: 27 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 121 (upper) 111 (regular)':
   ::::::::::::::::.......:.......:
   Pass rate 100.0%: 32 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '102 (col-major) 121 (upper) 111 (regular)':
   ...........................
   Pass rate 100.0%: 27 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 122 (lower) 111 (regular)':
   ::::::::::::::::.......:.......:
   Pass rate 100.0%: 32 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '102 (col-major) 122 (lower) 111 (regular)':
   ...........................
   Pass rate 100.0%: 27 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 121 (upper) 112 (transposed)':
   XXXXXXXX......XX.:.:.:.:.......:
   Error rate 36.7%: n=7 k=7 lda=7 ldb=7 ldc=7 offa=0 offb=0 offc=0 
   Error rate 36.7%: n=7 k=7 lda=7 ldb=7 ldc=64 offa=0 offb=0 offc=0 
   Error rate 36.7%: n=7 k=7 lda=7 ldb=64 ldc=7 offa=0 offb=0 offc=0 
   Error rate 36.7%: n=7 k=7 lda=7 ldb=64 ldc=64 offa=0 offb=0 offc=0 
   Error rate 36.7%: n=7 k=7 lda=64 ldb=7 ldc=7 offa=0 offb=0 offc=0 
   Error rate 36.7%: n=7 k=7 lda=64 ldb=7 ldc=64 offa=0 offb=0 offc=0 
   Error rate 36.7%: n=7 k=7 lda=64 ldb=64 ldc=7 offa=0 offb=0 offc=0 
   Error rate 36.7%: n=7 k=7 lda=64 ldb=64 ldc=64 offa=0 offb=0 offc=0 
   Error rate 36.7%: n=7 k=64 lda=64 ldb=64 ldc=7 offa=0 offb=0 offc=0 
   Error rate 36.7%: n=7 k=64 lda=64 ldb=64 ldc=64 offa=0 offb=0 offc=0 
   Pass rate  68.8%: 22 passed / 0 skipped / 10 failed
* Testing 'invalid buffer sizes' for '102 (col-major) 121 (upper) 112 (transposed)':
   ...........................
   Pass rate 100.0%: 27 passed / 0 skipped / 0 failed
* Testing 'regular behaviour' for '102 (col-major) 122 (lower) 112 (transposed)':
   ::::::::......::.:.:.:.:.......:
   Pass rate 100.0%: 32 passed / 0 skipped / 0 failed
* Testing 'invalid buffer sizes' for '102 (col-major) 122 (lower) 112 (transposed)':
   ...........................
   Pass rate 100.0%: 27 passed / 0 skipped / 0 failed
* Completed all test-cases for this routine. Results:
   452 test(s) passed
   0 test(s) skipped
   20 test(s) failed

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'ZSYR2K' routine.
* All tests skipped: Unsupported precision

* Running on OpenCL device 'Intel(R) HD Graphics Haswell Ultrabook GT2 Mobile'.
* Starting tests for the 'HSYR2K' routine.
* All tests skipped: Unsupported precision
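A note on reading the numbers in the log above: the recurring 57.1% error rate equals 28 mismatching elements out of 49, and 28 is exactly the number of elements in one triangle (diagonal included) of a 7x7 matrix. This is an interpretation rather than something the test output states. It assumes the rate is computed over n*n elements, which would fit the observation that the figure stays the same when `ldc` changes from 7 to 64. The helper names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Number of elements in one triangle (diagonal included) of an n x n matrix.
std::size_t TriangleElements(std::size_t n) { return n * (n + 1) / 2; }

// Percentage of mismatching elements.
// ASSUMPTION: the denominator is n * n; the test log does not confirm this.
double ErrorRatePercent(std::size_t mismatches, std::size_t n) {
  assert(n > 0);
  return 100.0 * static_cast<double>(mismatches) /
         static_cast<double>(n * n);
}
```

Under the same assumption, the other figures in the log would correspond to 12 mismatches (24.5%), 18 mismatches (36.7%), and 17 mismatches (34.7%) out of 49.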


CNugteren commented on June 30, 2024

Thanks again for testing. I am now trying to reproduce it myself. I am also on Beignet, but with a Skylake GPU. I am testing with the tuning parameters for your Haswell GPU, so that's as close as I can get to your set-up. Below are my results for syr2k:

  • ./clblast_test_xsyr2k -platform 1 -clblas 1 -cblas 0: Same 18 & 20 failures as you
  • ./clblast_test_xsyr2k -platform 1 -clblas 0 -cblas 1: No failures. Conclusion: there is a bug in clBLAS and not in CLBlast.
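The reasoning behind that conclusion is a two-reference comparison: CLBlast is checked against clBLAS and against CBLAS separately, and the library that disagrees with the other two becomes the suspect. A minimal sketch of that decision, with hypothetical names (`Suspect`, `Triage`) that do not exist in the CLBlast test suite:

```cpp
#include <cassert>

// Which library looks suspicious after comparing CLBlast with two references.
enum class Suspect { kNone, kClBlas, kCblas, kClblastOrBothReferences };

// matches_clblas: CLBlast agrees with clBLAS  (run with -clblas 1 -cblas 0)
// matches_cblas:  CLBlast agrees with CBLAS   (run with -clblas 0 -cblas 1)
Suspect Triage(bool matches_clblas, bool matches_cblas) {
  if (matches_clblas && matches_cblas) { return Suspect::kNone; }
  if (matches_cblas) { return Suspect::kClBlas; }  // only clBLAS disagrees
  if (matches_clblas) { return Suspect::kCblas; }  // only CBLAS disagrees
  return Suspect::kClblastOrBothReferences;        // cannot be separated
}

// Small self-check using the two outcomes reported in this thread.
void SelfCheck() {
  assert(Triage(false, true) == Suspect::kClBlas);  // syr2k on Skylake
  assert(Triage(true, false) == Suspect::kCblas);   // her2k on Haswell
}
```

A mismatch or crash only points at a suspect; it does not prove where the bug is, since the driver or the test harness could also be involved.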

And for her2k:

  • ./clblast_test_xher2k -platform 1 -clblas 1 -cblas 0: No failures.
  • ./clblast_test_xher2k -platform 1 -clblas 0 -cblas 1: No failures.

I also tried to run under valgrind but I didn't observe anything interesting. So in conclusion I don't know if I can help you any further. Perhaps there is a genuine bug in the CBLAS library you're using? Or perhaps there is still an issue in Beignet or in the Intel drivers for your GPU?
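When two reference libraries disagree, a deliberately simple third implementation can serve as a tie-breaker for small sizes such as n=7, k=7. The sketch below follows the standard BLAS definition of SYR2K, C = alpha*A*B^T + alpha*B*A^T + beta*C. `NaiveSsyr2kUpper` is a hypothetical name, not part of CLBlast, clBLAS, or any CBLAS library, and it covers only single precision, row-major storage, the upper triangle, and the non-transposed case.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive row-major SYR2K for the upper triangle, no transpose:
//   C = alpha * A * B^T + alpha * B * A^T + beta * C
// A and B are n x k, C is n x n; only elements with j >= i are written.
void NaiveSsyr2kUpper(std::size_t n, std::size_t k, float alpha,
                      const std::vector<float>& a, std::size_t lda,
                      const std::vector<float>& b, std::size_t ldb,
                      float beta, std::vector<float>& c, std::size_t ldc) {
  assert(lda >= k && ldb >= k && ldc >= n);
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = i; j < n; ++j) {
      float acc = 0.0f;
      for (std::size_t l = 0; l < k; ++l) {
        acc += a[i * lda + l] * b[j * ldb + l] +
               b[i * ldb + l] * a[j * lda + l];
      }
      c[i * ldc + j] = alpha * acc + beta * c[i * ldc + j];
    }
  }
}
```

Comparing its output element by element with each library on one failing configuration would show which of the two references deviates, at least for the cases it covers.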


CNugteren commented on June 30, 2024

I just updated my system from Beignet 1.2 to 1.2.1 and I see a lot of improvements, especially related to half-precision (fp16). I also re-ran the above commands, and I no longer see any errors. Could you perhaps also re-run the tests with the latest Beignet?


CNugteren commented on June 30, 2024

I am closing this issue, since I think most of the bugs are now fixed. The latest version of the code contains SYRK/SYR2K/HERK/HER2K and TRMM fixes, so that should be good. And then Beignet 1.2.1 should fix any remaining issues. If this is not the case, please open a new issue with a report of which test(s) fail.
