doe300 / vc4cl
OpenCL implementation running on the VideoCore IV GPU of the Raspberry Pi models
License: MIT License
When I write code like
int i;
int j;
int k = i * j;
I don't get correct results,
but if I write
int i;
int k = i * 10;
the result is correct. What's wrong with multiplication between two variables?
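Some background that may be relevant here: the integer multiplier of the VideoCore IV QPU is only 24 bits wide, so a full 32-bit product of two variables has to be assembled by the compiler from narrower multiplications, while a multiplication by a constant may be handled along a different path. The plain C sketch below shows one standard way to build the low 32 bits of a product from 16-bit halves. It is an illustration only, not VC4C's actual lowering, and the function names are made up for this example. It can serve as a host-side reference to compare GPU results against.

```c
#include <assert.h>
#include <stdint.h>

/* Low 32 bits of a * b, built only from partial products whose operands
 * fit in 16 bits (and therefore also fit a 24-bit multiplier). */
uint32_t mul32_from_16bit_parts(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    uint32_t low   = a_lo * b_lo;               /* at most 0xFFFE0001, fits in 32 bits */
    uint32_t cross = a_hi * b_lo + a_lo * b_hi; /* may wrap; only its low 16 bits matter */

    /* a_hi * b_hi would be shifted by 32 bits and drops out of the result. */
    return low + (cross << 16);
}

/* Signed variant: two's-complement multiplication has the same low 32 bits.
 * The final conversion is implementation-defined in C; GCC and Clang wrap. */
int32_t mul32_signed(int32_t i, int32_t j)
{
    return (int32_t)mul32_from_16bit_parts((uint32_t)i, (uint32_t)j);
}

/* Small self-check comparing against the native multiplication. */
void mul32_self_check(void)
{
    assert(mul32_from_16bit_parts(123456u, 789u) == 123456u * 789u);
    assert(mul32_signed(-3, 7) == -21);
}
```

Running the same pairs of i and j through the kernel and through mul32_signed on the host should show for which operand ranges the GPU result starts to differ.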
Hello @doe300!
User Gavinmc42 on the Raspberry Pi forum (https://www.raspberrypi.org/forums/viewtopic.php?f=37&t=216940&p=1408611#p1408600) recommended that I contact you.
I'm a hardware developer, and our team built a stereoscopic camera based on the Raspberry Pi Compute Module 3 (http://stereopi.com).
For OpenCV and ROS applications, computing the depth map is critical, so I'm looking for ways to improve the Pi's performance on this task. I have some questions:
Regards,
Eugene
Hi, I am working on RPI support in my fork of DarkNet at https://github.com/sowson/darknet. In the Makefile I set RPI=1 and OPENCV=0. After installing VC4CL and building the project, I tried to run it, but without success: one of the OpenCL programs does not compile at runtime, while the others are fine.
root@raspberrypi:~/cifar# ../darknet/darknet classifier predict cfg/cifar.data cfg/cifar_small_test.cfg backup/cifar_small.backup data/cifar/test/4882_frog.png
Device ID: 0
Device name: VideoCore IV GPU
Device vendor: Broadcom
Device opencl availability: OpenCL 1.2 VC4CL 0.4
Device opencl used: 0.4
Device double precision: NO
Device max group size: 12
Device address bits: 32
opencl_load: could not compile. error: CL_UNKNOWN_ERROR
CL_PROGRAM_BUILD_LOG:
[W] Thu Aug 9 08:06:40 2018: Register conflict resolver has exceeded its maximum rounds, there might still be errors!
[E] Thu Aug 9 08:06:40 2018: Error assigning local to register: %call59.%b.1
[E] Thu Aug 9 08:06:40 2018: (1) /usr/local/lib/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xac [0x7695f918]
[E] Thu Aug 9 08:06:40 2018: (2) /usr/local/lib/libVC4CC.so.1.2 : vc4c::qpu_asm::GraphColoring::toRegisterMap() const+0x150 [0x76a4d3b8]
[E] Thu Aug 9 08:06:40 2018: (3) /usr/local/lib/libVC4CC.so.1.2 : vc4c::qpu_asm::CodeGenerator::generateInstructions(vc4c::Method&)+0x474 [0x76a3c900]
[E] Thu Aug 9 08:06:40 2018: (4) /usr/local/lib/libVC4CC.so.1.2 : +0x4983ac [0x769613ac]
[E] Thu Aug 9 08:06:40 2018: (5) /usr/local/lib/libVC4CC.so.1.2 : +0x498588 [0x76961588]
[E] Thu Aug 9 08:06:40 2018: (6) /usr/local/lib/libVC4CC.so.1.2 : +0x49942c [0x7696242c]
[E] Thu Aug 9 08:06:40 2018: (7) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Method* const&)>::operator()(vc4c::Method* const&) const+0x54 [0x76966214]
[E] Thu Aug 9 08:06:40 2018: (8) /usr/local/lib/libVC4CC.so.1.2 : void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}::operator()() const+0xc8 [0x76964a1c]
[E] Thu Aug 9 08:06:40 2018: (9) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (), void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}>::_M_invoke(std::_Any_data const&)+0x24 [0x769698b8]
[E] Thu Aug 9 08:06:40 2018: (10) /usr/local/lib/libVC4CC.so.1.2 : std::function<void ()>::operator()() const+0x40 [0x769508e4]
[E] Thu Aug 9 08:06:40 2018: (11) /usr/local/lib/libVC4CC.so.1.2 : +0x486a70 [0x7694fa70]
[E] Thu Aug 9 08:06:40 2018: (12) /usr/local/lib/libVC4CC.so.1.2 : +0x48735c [0x7695035c]
[E] Thu Aug 9 08:06:40 2018: (13) /usr/local/lib/libVC4CC.so.1.2 : +0x4872c8 [0x769502c8]
[E] Thu Aug 9 08:06:40 2018: (14) /usr/local/lib/libVC4CC.so.1.2 : +0x4872a0 [0x769502a0]
[E] Thu Aug 9 08:06:40 2018: (15) /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 : +0x9c9dc [0x763df9dc]
[E] Thu Aug 9 08:06:40 2018: Background worker threw error: Label/Register Mapping: There are erroneous register-associations!
[E] Thu Aug 9 08:06:40 2018: While running worker task: CodeGenerator
[E] Thu Aug 9 08:06:40 2018: Compiler threw exception: Label/Register Mapping: There are erroneous register-associations!
CODE:
__attribute__((noinline)) float get_pixel_kernel(__global float *image, int w, int h, int x, int y, int c);
__attribute__((noinline)) float4 rgb_to_hsv_kernel(float4 rgb);
__attribute__((noinline)) float4 hsv_to_rgb_kernel(float4 hsv);
__attribute__((noinline)) float bilinear_interpolate_kernel(__global float *image, int w, int h, float x, float y, int c);

__attribute__((noinline)) float get_pixel_kernel(__global float *image, int w, int h, int x, int y, int c)
{
    if(x < 0 || x >= w || y < 0 || y >= h) return 0;
    return image[x + w*(y + c*h)];
}

__attribute__((noinline)) float4 rgb_to_hsv_kernel(float4 rgb)
{
    float r = rgb.x;
    float g = rgb.y;
    float b = rgb.z;
    float h, s, v;
    float max = (r > g) ? ( (r > b) ? r : b) : ( (g > b) ? g : b);
    float min = (r < g) ? ( (r < b) ? r : b) : ( (g < b) ? g : b);
    float delta = max - min;
    v = max;
    if(max == 0){
        s = 0;
        h = -1;
    }else{
        s = delta/max;
        if(r == max){
            h = (g - b) / delta;
        } else if (g == max) {
            h = 2 + (b - r) / delta;
        } else {
            h = 4 + (r - g) / delta;
        }
        if (h < 0) h += 6;
    }
    return (float4) (h, s, v, 0.0);
}

__attribute__((noinline)) float4 hsv_to_rgb_kernel(float4 hsv)
{
    float h = hsv.x;
    float s = hsv.y;
    float v = hsv.z;
    float r, g, b;
    float f, p, q, t;
    if (s == 0) {
        r = g = b = v;
    } else {
        int index = (int) floor(h);
        f = h - index;
        p = v*(1-s);
        q = v*(1-s*f);
        t = v*(1-s*(1-f));
        if(index == 0){
            r = v; g = t; b = p;
        } else if(index == 1){
            r = q; g = v; b = p;
        } else if(index == 2){
            r = p; g = v; b = t;
        } else if(index == 3){
            r = p; g = q; b = v;
        } else if(index == 4){
            r = t; g = p; b = v;
        } else {
            r = v; g = p; b = q;
        }
    }
    r = (r < 0) ? 0 : ((r > 1) ? 1 : r);
    g = (g < 0) ? 0 : ((g > 1) ? 1 : g);
    b = (b < 0) ? 0 : ((b > 1) ? 1 : b);
    return (float4)(r, g, b, 0.0);
}

__attribute__((noinline)) float bilinear_interpolate_kernel(__global float *image, int w, int h, float x, float y, int c)
{
    int ix = (int) floor(x);
    int iy = (int) floor(y);
    float dx = x - ix;
    float dy = y - iy;
    float val1 = ((1-dy) * (1-dx) * get_pixel_kernel(image, w, h, ix, iy, c));
    float val2 = (dy * (1-dx) * get_pixel_kernel(image, w, h, ix, iy+1, c));
    float val3 = ((1-dy) * dx * get_pixel_kernel(image, w, h, ix+1, iy, c));
    float val4 = (dy * dx * get_pixel_kernel(image, w, h, ix+1, iy+1, c));
    float val = val1 + val2 + val3 + val4;
    return val;
}

__kernel void levels_image_kernel(__global float *image, __global float *rand, int batch, int w, int h, int train, float saturation, float exposure, float translate, float scale, float shift)
{
    int size = batch * w * h;
    int id = (get_group_id(0) + get_group_id(1)*get_num_groups(0)) * get_local_size(0) + get_local_id(0);
    if(id >= size) return;
    int x = id % w;
    id /= w;
    int y = id % h;
    id /= h;
    float rshift = rand[0];
    float gshift = rand[1];
    float bshift = rand[2];
    float r0 = rand[8*id + 0];
    float r1 = rand[8*id + 1];
    float r2 = rand[8*id + 2];
    float r3 = rand[8*id + 3];
    saturation = r0*(saturation - 1) + 1;
    saturation = (r1 > .5) ? 1./saturation : saturation;
    exposure = r2*(exposure - 1) + 1;
    exposure = (r3 > .5) ? 1./exposure : exposure;
    size_t offset = id * h * w * 3;
    image += offset;
    float r = image[x + w*(y + h*0)];
    float g = image[x + w*(y + h*1)];
    float b = image[x + w*(y + h*2)];
    float4 rgb = (float4)(r,g,b, 0.0);
    if(train){
        float4 hsv = rgb_to_hsv_kernel(rgb);
        hsv.y *= saturation;
        hsv.z *= exposure;
        rgb = hsv_to_rgb_kernel(hsv);
    } else {
        shift = 0;
    }
    image[x + w*(y + h*0)] = rgb.x*scale + translate + (rshift - .5)*shift;
    image[x + w*(y + h*1)] = rgb.y*scale + translate + (gshift - .5)*shift;
    image[x + w*(y + h*2)] = rgb.z*scale + translate + (bshift - .5)*shift;
}

__kernel void forward_crop_layer_kernel(__global float *input, __global float *rand, int size, int c, int h, int w, int crop_height, int crop_width, int train, int flip, float angle, __global float *output)
{
    int id = (get_group_id(0) + get_group_id(1)*get_num_groups(0)) * get_local_size(0) + get_local_id(0);
    if(id >= size) return;
    float cx = w/2.;
    float cy = h/2.;
    int count = id;
    int j = id % crop_width;
    id /= crop_width;
    int i = id % crop_height;
    id /= crop_height;
    int k = id % c;
    id /= c;
    int b = id;
    float r4 = rand[8*b + 4];
    float r5 = rand[8*b + 5];
    float r6 = rand[8*b + 6];
    float r7 = rand[8*b + 7];
    float dw = (w - crop_width)*r4;
    float dh = (h - crop_height)*r5;
    flip = (flip && (r6 > .5));
    angle = 2*angle*r7 - angle;
    if(!train){
        dw = (w - crop_width)/2.;
        dh = (h - crop_height)/2.;
        flip = 0;
        angle = 0;
    }
    input += w*h*c*b;
    float x = (flip!=0) ? w - dw - j - 1 : j + dw;
    float y = i + dh;
    float rx = cos(angle)*(x-cx) - sin(angle)*(y-cy) + cx;
    float ry = sin(angle)*(x-cx) + cos(angle)*(y-cy) + cy;
    output[count] = bilinear_interpolate_kernel(input, w, h, rx, ry, k);
}
terminate called without an active exception
Aborted
EDIT: attribute in the code above is __attribute__ (the underscores can get lost in formatting).
Can you help? The CODE section above contains the code I am trying to compile/build. Thanks!
The CircleCI build failed.
Probably the curl call in the "get vc4c url" step
produced an empty JSON file, so the following
"get vc4cl-stdlib package" step
failed because of that empty JSON file.
https://circleci.com/gh/doe300/VC4CL/28
Hello.
I was able to build VC4CLStdLib, VC4C and VC4CL.
I installed both VC4CLStdLib and VC4C by:
$ sudo dpkg -i /tmp/vc4cl-stdlib.deb
$ sudo dpkg -i /tmp/vc4c.deb
But, when I install VC4CL by
$ sudo dpkg -i /tmp/vc4cl.deb
I get:
dependency problems prevent configuration of vc4cl:
vc4cl depends on opencl-c-headers; however Package opencl-c-headers is not installed.
This is strange because I've installed it by:
$ sudo apt-get install opencl-headers
Any ideas?
Thanks in advance.
Hello.
I'm optimizing a RANSAC algorithm using Videocore IV and VC4CL.
However, the performance of the GPU for my kernel is worse than the CPU's.
So I need some help with a few performance issues.
Since VC4CL only allows a maximum work-group size of 12, I'm using a three-dimensional global size of 12x12x12 with a local size of 1x1x1. Is this a good choice, or would, for example, a global size of 6x6x6 with a local size of 2x2x2 be better?
Is there any data type that improves performance, such as uint8, uint16, float8, float16, etc.?
I notice that additions, multiplications and divisions take some time. Is it better to perform these operations using only integers, or should I use floats?
Thanks in advance.
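To make the two configurations in the question comparable, it can help to count work-groups and work-items explicitly. The C sketch below does that under two assumptions: the limit of 12 work-items per work-group quoted above, and the OpenCL 1.x rule that each global size must be a multiple of the corresponding local size. The function and macro names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORK_GROUP_SIZE 12 /* assumed limit: 12 work-items per work-group */

/* Work-items per work-group: the product of the three local sizes. */
size_t work_group_size(const size_t local[3])
{
    return local[0] * local[1] * local[2];
}

/* Number of work-groups for a 3D NDRange, or 0 if the configuration is
 * invalid (zero local size, global not divisible by local, or a
 * work-group larger than MAX_WORK_GROUP_SIZE). */
size_t work_group_count(const size_t global[3], const size_t local[3])
{
    size_t groups = 1;
    for (int d = 0; d < 3; ++d) {
        if (local[d] == 0 || global[d] % local[d] != 0)
            return 0;
        groups *= global[d] / local[d];
    }
    if (work_group_size(local) > MAX_WORK_GROUP_SIZE)
        return 0;
    return groups;
}

/* The two configurations from the question, side by side. */
void work_group_self_check(void)
{
    const size_t g12[3] = {12, 12, 12}, l1[3] = {1, 1, 1};
    const size_t g6[3]  = {6, 6, 6},    l2[3] = {2, 2, 2};
    assert(work_group_count(g12, l1) == 1728); /* 1728 groups of 1 item  */
    assert(work_group_count(g6, l2) == 27);    /* 27 groups of 8 items   */
}
```

Note that the two options do not describe the same amount of work: 12x12x12 with 1x1x1 launches 1728 work-items in groups of one, while 6x6x6 with 2x2x2 launches only 216 work-items in 27 groups of eight. Keeping the global size at 12x12x12 and using a local size of 2x2x2 would give 216 groups of eight for the same 1728 work-items.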
Currently, the problem is
nomaddo@nomaddo-AS:~/idein/VC4CL$ circleci build
====>> Spin up Environment
Build-agent version 0.0.4704-deba4df (2018-02-21T21:01:29+0000)
Starting container nomaddo/cross-rpi:0.1
Using build environment variables:
BASH_ENV=/tmp/.bash_env-localbuild-1519377596
CI=true
CIRCLECI=true
CIRCLE_BRANCH=master
CIRCLE_BUILD_NUM=
CIRCLE_JOB=build
CIRCLE_NODE_INDEX=0
CIRCLE_NODE_TOTAL=1
CIRCLE_REPOSITORY_URL=http://github.com/doe300/VC4CL.git
CIRCLE_SHA1=68be86b590b85eb37862b01f50753a9ff5af3b48
CIRCLE_SHELL_ENV=/tmp/.bash_env-localbuild-1519377596
CIRCLE_WORKING_DIRECTORY=~/project
====>> Checkout code
#!/bin/sh
mkdir -p /root/project && cp -r /tmp/_circleci_local_build_repo/. /root/project
====>> get vc4c url
#!/bin/bash -eo pipefail
curl "https://circleci.com/api/v1.1/project/github/doe300/VC4C/latest/artifacts?branch=master&filter=successful" --output /tmp/dump
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1089 100 1089 0 0 1289 0 --:--:-- --:--:-- --:--:-- 1288
====>> get vc4cl-stdlib package
#!/bin/bash -eo pipefail
wget -O /tmp/vc4cl-stdlib.deb $(python .circleci/get_url.py "vc4cl-stdlib-" "/tmp/dump")
--2018-02-23 09:19:59-- https://76-106166771-gh.circle-artifacts.com/0/root/project/vc4cl-stdlib-0.4-Linux.deb
Resolving 76-106166771-gh.circle-artifacts.com (76-106166771-gh.circle-artifacts.com)... 34.194.4.99, 34.198.74.173, 52.44.239.229, ...
Connecting to 76-106166771-gh.circle-artifacts.com (76-106166771-gh.circle-artifacts.com)|34.194.4.99|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/x-debian-package]
Saving to: '/tmp/vc4cl-stdlib.deb'
/tmp/vc4cl-stdlib.d [ <=> ] 103.90K 294KB/s in 0.4s
2018-02-23 09:20:01 (294 KB/s) - '/tmp/vc4cl-stdlib.deb' saved [106390]
====>> get vc4c package
#!/bin/bash -eo pipefail
wget -O /tmp/vc4c.deb $(python .circleci/get_url.py "vc4c-" "/tmp/dump")
--2018-02-23 09:20:01-- https://76-106166771-gh.circle-artifacts.com/0/root/project/vc4c-0.4-Linux.deb
Resolving 76-106166771-gh.circle-artifacts.com (76-106166771-gh.circle-artifacts.com)... 52.72.204.202, 52.44.239.229, 34.198.74.173, ...
Connecting to 76-106166771-gh.circle-artifacts.com (76-106166771-gh.circle-artifacts.com)|52.72.204.202|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/x-debian-package]
Saving to: '/tmp/vc4c.deb'
/tmp/vc4c.deb [ <=> ] 7.84M 4.21MB/s in 1.9s
2018-02-23 09:20:04 (4.21 MB/s) - '/tmp/vc4c.deb' saved [8223320]
====>> unpack dependencies
#!/bin/bash -eo pipefail
dpkg-deb -x /tmp/vc4cl-stdlib.deb / && dpkg-deb -x /tmp/vc4c.deb /
====>> configure
#!/bin/bash -eo pipefail
cmake . -DCROSS_COMPILE=ON -DBUILD_TESTING=ON -DREGISTER_POKE_KERNELS=ON
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29")
-- Cross compiling for Raspbian with compiler: /usr/bin/arm-linux-gnueabihf-g++
-- The C compiler identification is GNU 6.3.0
-- The CXX compiler identification is GNU 6.3.0
-- Check for working C compiler: /usr/bin/arm-linux-gnueabihf-gcc
-- Check for working C compiler: /usr/bin/arm-linux-gnueabihf-gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/arm-linux-gnueabihf-g++
-- Check for working CXX compiler: /usr/bin/arm-linux-gnueabihf-g++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for CL_VERSION_2_0
-- Looking for CL_VERSION_2_0 - found
CMake Error at /usr/share/cmake-3.7/Modules/FindPackageHandleStandardArgs.cmake:138 (message):
Could NOT find OpenCL (missing: OpenCL_LIBRARY) (found version "2.0")
Call Stack (most recent call first):
/usr/share/cmake-3.7/Modules/FindPackageHandleStandardArgs.cmake:378 (_FPHSA_FAILURE_MESSAGE)
/usr/share/cmake-3.7/Modules/FindOpenCL.cmake:127 (find_package_handle_standard_args)
CMakeLists.txt:66 (find_package)
-- Configuring incomplete, errors occurred!
See also "/root/project/CMakeFiles/CMakeOutput.log".
Error: Exited with code 1
Step failed
====>> Uploading artifacts
Uploading /root/project/build/libVC4CL.so.0.4 to root/project/build/libVC4CL.so.0.4
No artifact files found at /root/project/build/libVC4CL.so.0.4
====>> Uploading artifacts
Uploading /root/project/build/test/TestVC4CL to root/project/build/test/TestVC4CL
No artifact files found at /root/project/build/test/TestVC4CL
====>> Uploading artifacts
Uploading /root/project/vc4cl-0.4-Linux.deb to root/project/vc4cl-0.4-Linux.deb
No artifact files found at /root/project/vc4cl-0.4-Linux.deb
Task failed
I installed opencl-headers (see https://github.com/nomaddo/cross-rpi/blob/master/Dockerfile#L36).
I am not sure why it failed. @doe300, do you have any idea?
P.S.
In my Dockerfile, I have given up building SPIRV-LLVM and now download it from an archive.
Auto-building of the Docker images works. You can pull nomaddo/cross-rpi
and try the images.
I compiled and installed VC4CL on a RPI v3 with Debian stretch. Running clinfo under root works.
When I run my own OpenCL code (mcxcl: http://github.com/fangq/mcxcl) using vc4cl, I got a "No matching precompiler available" error, see log below:
==============================================================================
- code name: [Vanilla MCXCL] compiled with OpenCL version [1]
- compiled with: [RNG] Logistic-Lattice [Seed Length] 5
initializing streams ... terminate called without an active exception
init complete : 0 ms
Building kernel with option: -cl-mad-enable
Kernel build log:
[E] Sat Jan 27 00:08:04 2018: (1) /usr/local/lib/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xac [0x7668e23c]
[E] Sat Jan 27 00:08:04 2018: (2) /usr/local/lib/libVC4CC.so.1.2 : vc4c::Compiler::compile(std::istream&, std::ostream&, vc4c::Configuration, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)+0x1e0 [0x76690608]
[E] Sat Jan 27 00:08:04 2018: (3) /usr/local/lib/libVC4CL.so : +0x4a630 [0x76983630]
[E] Sat Jan 27 00:08:04 2018: (4) /usr/local/lib/libVC4CL.so : vc4cl::Program::compile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, vc4cl::object_wrapper<vc4cl::Program>, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, vc4cl::object_wrapper<vc4cl::Program> > > > const&, void (*)(_cl_program*, void*), void*)+0xd8 [0x76984bc4]
[E] Sat Jan 27 00:08:04 2018: (5) /usr/local/lib/libVC4CL.so : VC4CL_clCompileProgram+0xe88 [0x76987830]
[E] Sat Jan 27 00:08:04 2018: (6) /usr/local/lib/libVC4CL.so : VC4CL_clBuildProgram+0x1a8 [0x76988468]
[E] Sat Jan 27 00:08:04 2018: (7) ../../bin/mcxcl() [0x13834]
[E] Sat Jan 27 00:08:04 2018: Compiler threw exception: Pre-compilation: No matching precompiler available!
MCX ERROR(2):Error: Failed to build program executable! in unit mcx_host.cpp:450
Aborted
My OpenCL kernel was tested on NVIDIA/Intel/AMD OpenCL implementations. I am wondering whether the above error is related to my setup or to the compatibility of the kernel with VC4CL.
Please let me know if you have any suggestions on how to solve this issue. Thanks.
To reproduce this issue, you may follow the below commands as root
git clone https://github.com/fangq/mcxcl.git
cd mcxcl/src
make clean all
cd ../example/benchmark
LD_LIBRARY_PATH=/usr/local/lib ./run_benchmark1.sh -n 1e5
Hello!
I set everything up correctly (clinfo runs fine), but when I compile the tests in the VC4C/example folder I always get precompilation errors.
The compilation command I'm using is:
VC4C --hex -o /dev/null ./example/file.cl
I'm running this on a Raspberry Pi 3B.
Any ideas?
Thanks in advance.
I'm currently trying to assess what we can expect from this setup for DeepSpeech, which relies on TensorFlow with ComputeCpp. I have been able to cross-build the driver, and most of the TestVC4C tests do run (properly or not). That means I can see clang doing its job and compiling some CL code.
The GPU is also visible to computecpp_info.
Now I'm trying to run our code on top of that. So far it's not very successful, but in an unexpected way: as documented in codeplaysoftware/computecpp-sdk#117 (comment), ComputeCpp does see the GPU and makes use of it. But then, monitoring the system, it's sitting with the deepspeech process at 100%.
I don't see much of clang running, but I did spot a process llvm-spirv /tmp/vc4c-EgnXeW /dev/stdin being run. The file /tmp/vc4c-EgnXeW seems to have a non-zero size. But there is no error when running, so I don't know whether something is going on.
The OpenCL kernels might be big (too big for the current limitations? I'm not sure how to check that), and/or the project might still be too young?
As a comparison, we are able to run the same stack with the Intel Neo driver on my laptop (i7-8650U) using the GPU. The first run of the Intel driver compiles the OpenCL code and can cache it on disk; this takes ~220 secs to compile.
The resulting cl_cache for Intel is:
$ LC_ALL=C ll cl_cache/
total 16M
drwxr-xr-x 1 alex alex 700 May 23 16:31 .
drwxr-xr-x 1 alex alex 222 May 23 16:28 ..
-rw-r--r-- 1 alex alex 73K May 23 16:31 34dcc67199396f96.cl_cache
-rw-r--r-- 1 alex alex 734K May 23 16:30 3b459da76e3bb7c4.cl_cache
-rw-r--r-- 1 alex alex 6.7M May 23 16:30 3d76036abef34a33.cl_cache
-rw-r--r-- 1 alex alex 206K May 23 16:31 66637feca75f4aba.cl_cache
-rw-r--r-- 1 alex alex 443K May 23 16:30 6ad895b3f2af8633.cl_cache
-rw-r--r-- 1 alex alex 728K May 23 16:31 6d252b514c3b944f.cl_cache
-rw-r--r-- 1 alex alex 165K May 23 16:30 8a505eb1718a2ecd.cl_cache
-rw-r--r-- 1 alex alex 425K May 23 16:30 98e4482263ced159.cl_cache
-rw-r--r-- 1 alex alex 523K May 23 16:30 9e55731fa2c7901e.cl_cache
-rw-r--r-- 1 alex alex 124K May 23 16:30 a493ba4fcff0b7d1.cl_cache
-rw-r--r-- 1 alex alex 122K May 23 16:29 b1a69f78813be187.cl_cache
-rw-r--r-- 1 alex alex 108K May 23 16:31 d2e0ea42ff6b24ac.cl_cache
-rw-r--r-- 1 alex alex 344K May 23 16:31 ec50bb7158952ee2.cl_cache
-rw-r--r-- 1 alex alex 5.2M May 23 16:31 f4146d630607b157.cl_cache
I've let deepspeech run with the VC4 driver for ~120 min without any visible output or error. Is it possible that our code is too compute-intensive for now, and that it's expected to take that much time? Or could some silent error be happening and breaking something?
As far as I could tell from the available docs and wiki, there is nothing (e.g., an environment variable) that could be used to get more information at runtime. I'm a bit reluctant to try a debug build, considering how slow things already are with a release build, but if that can provide useful feedback, I'd be glad to give it a try.
When I use the device-side sin() function, no matter how I try to isolate it or work around it, I'm getting results that don't make sense. Here's my original kernel:
void kernel decomposeamp(global real1* stateProb, global real1* stateAngle, constant bitCapInt* bitCapIntPtr, global cmplx* nStateVec)
{
bitCapInt ID, Nthreads, lcv;
ID = get_global_id(0);
Nthreads = get_global_size(0);
bitCapInt maxQPower = bitCapIntPtr[0];
real1 angle;
for (lcv = ID; lcv < maxQPower; lcv += Nthreads) {
angle = stateAngle[lcv];
nStateVec[lcv] = sqrt(stateProb[lcv]) * sin((float2)(angle + M_PI_2_F, angle));
}
}
In the cases I'm testing, stateProb should always be an array of values that are (at least extremely close to) exactly 1 or 0. If I comment out the multiplication by the sin of the vector, my components all do come out 1 or 0, via the sqrt. However, the values returned by sin seem to be not even close, like cos^2(x) + sin^2(x) > 2.7. In fact, the vectorized sin() above should be equivalent to components of (float2)(cos(angle), sin(angle)) by trigonometric identities, but I'm finding that the resulting norm (sum of the squared components) is around 2.7 or larger for a stateProb[lcv] of 1.
This doesn't seem to be limited to the vectorized version of sin(), either: it also happens if I replace it with two separate calls that construct the two components separately, and if I swap M_PI_2_F out for a known good value for pi/2.
The kernel is not particularly complicated, and if I implement it host-side at the same point in the code and use the host sin() instead, it works correctly.
(Thank you for the awesome compiler, by the way.) Any idea what's going on? Am I doing something wrong?
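The expectation described above can be written out explicitly. This is standard trigonometry only; the symbols p (for stateProb[lcv]) and θ (for angle) are chosen here for illustration.

```latex
\begin{aligned}
\sin\!\left(\theta + \tfrac{\pi}{2}\right) &= \cos\theta, \\
\left\lVert \sqrt{p}\,\bigl(\sin(\theta + \tfrac{\pi}{2}),\ \sin\theta\bigr) \right\rVert^{2}
  &= p\,\bigl(\cos^{2}\theta + \sin^{2}\theta\bigr) = p, \\
\sin^{2} a + \sin^{2} b &\le 2 \quad \text{for all real } a, b .
\end{aligned}
```

So for p = 1 the norm should be exactly 1, and even with arbitrary wrong arguments a correct sine could never push it above 2. A value of 2.7 therefore means that at least one returned component has a magnitude greater than 1, which no valid sin() result can have.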
In my OpenCL program (which just calls test_mul24 in test_int.cl), the following exception is raised:
terminate called without an active exception
Aborted
I am not sure what happened, but I see that ~thread is called in __run_exit_handlers in _start.
Some threads are unfinished (related to #3?).
My program is as follows. It is a straightforward OpenCL program.
Other programs probably raise the same exception.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <assert.h>
#include <math.h>
#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif
#define MAX_SOURCE_SIZE (0x100000)
#define N 1024
cl_int a[N]; /* input data written to the device buffer */
cl_int b[N]; /* results read back from the device buffer */
int main(void) {
    /* Load the kernel source from disk. */
    char fileName[] = "./test_int.cl";
    FILE *fp = fopen(fileName, "r");
    if (!fp) {
        fprintf(stderr, "Failed to load kernel\n");
        exit(1);
    }
    char *source_str = (char*)malloc(MAX_SOURCE_SIZE);
    size_t source_size = fread(source_str, 1, MAX_SOURCE_SIZE, fp);
    fclose(fp);
    /* Pick the first platform and its default device. */
    cl_platform_id platform_id = NULL;
    cl_uint ret_num_platforms;
    cl_int ret = clGetPlatformIDs(1, &platform_id, &ret_num_platforms);
    cl_device_id device_id = NULL;
    cl_uint ret_num_devices;
    ret = clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_DEFAULT, 1, &device_id, &ret_num_devices);
    cl_context context = clCreateContext(NULL, 1, &device_id, NULL, NULL, &ret);
    cl_command_queue command_queue = clCreateCommandQueue(context, device_id, 0, &ret);
    for (int i = 0; i < N; i++)
        a[i] = N;
    /* The buffer holds cl_int elements, matching the types of a and b. */
    cl_mem memobj = clCreateBuffer(context, CL_MEM_READ_WRITE, sizeof(cl_int) * N, NULL, &ret);
    clEnqueueWriteBuffer(command_queue, memobj, CL_TRUE, 0, sizeof(cl_int) * N, a, 0, NULL, NULL);
    /* Build the program and look up the kernel. */
    cl_program program = clCreateProgramWithSource(context, 1, (const char **)&source_str, (const size_t *)&source_size, &ret);
    ret = clBuildProgram(program, 1, &device_id, NULL, NULL, NULL);
    if (ret != CL_SUCCESS)
        fprintf(stderr, "clBuildProgram failed: %d\n", ret);
    cl_kernel kernel = clCreateKernel(program, "test_mul24", &ret);
    cl_int arg0 = 2;
    ret = clSetKernelArg(kernel, 0, sizeof(cl_int), (void*)&arg0);
    ret = clSetKernelArg(kernel, 1, sizeof(cl_mem), (void*)&memobj);
    /* Run N work-items and let the implementation choose the local size. */
    size_t global_item_size[] = { N };
    ret = clEnqueueNDRangeKernel(command_queue, kernel, 1, NULL,
                                 global_item_size, NULL, 0, NULL, NULL);
    ret = clFinish(command_queue);
    ret = clFlush(command_queue);
    ret = clEnqueueReadBuffer(command_queue, memobj, CL_TRUE, 0, sizeof(cl_int) * N, b, 0, NULL, NULL);
    /* Release all OpenCL objects. */
    ret = clReleaseKernel(kernel);
    ret = clReleaseProgram(program);
    ret = clReleaseMemObject(memobj);
    ret = clReleaseCommandQueue(command_queue);
    ret = clReleaseContext(context);
    free(source_str);
    return 0;
}
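Since test_int.cl itself is not shown, what test_mul24 computes is unknown here. As a hedged aid for checking the buffer b on the host, the sketch below models only the documented behaviour of the OpenCL mul24 built-in for signed operands inside the 24-bit range; outside that range the OpenCL specification leaves the result implementation-defined. The function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* mul24 is only specified for signed operands in [-2^23, 2^23 - 1]. */
int mul24_operand_in_range(int32_t v)
{
    return v >= -(1 << 23) && v <= (1 << 23) - 1;
}

/* Host-side reference for mul24(x, y) with in-range operands:
 * the low 32 bits of the exact product. */
int32_t mul24_reference(int32_t x, int32_t y)
{
    assert(mul24_operand_in_range(x) && mul24_operand_in_range(y));
    int64_t wide = (int64_t)x * (int64_t)y; /* exact, cannot overflow 64 bits */
    /* Truncate to 32 bits; the conversion back to a signed type is
     * implementation-defined in C, and GCC and Clang wrap around. */
    return (int32_t)(uint32_t)wide;
}
```

With the values used in the program above (arg0 = 2 and every input element equal to 1024), mul24_reference(2, 1024) gives 2048, assuming the kernel multiplies the two.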
Hello! I followed the README file to build VC4CLStdLib, VC4C and VC4CL, and I got some really confusing results.
I didn't change the top-level CMakeLists.txt, except for turning off the multi-threading option in VC4C.
Then I ran
cmake .
make
make install
This is the output of TestVC4CL:
Test 'TestSystem::testGetSystemInfo()' failed!
Suite: TestSystem
File: TestSystem.cpp
Line: 28
Failure: Got 11, expected 2
Test 'TestSystem::testGetSystemInfo()' failed!
Suite: TestSystem
File: TestSystem.cpp
Line: 30
Failure: Got 14, expected 4
Test 'TestSystem::testGetSystemInfo()' failed!
Suite: TestSystem
File: TestSystem.cpp
Line: 32
Failure: Got 14, expected 3
Suite 'TestSystem' finished, 0/1 successful (0%) in 236 microseconds (0.236 ms).
./build/test/TestVC4CL: symbol lookup error: ./build/test/TestVC4CL: undefined symbol: clGetPlatformIDs
the VC4C CMake output:
-- VC4CL standard library headers found: /usr/local/include/vc4cl-stdlib/VC4CLStdLib.h
-- The C compiler identification is GNU 6.3.0
-- The CXX compiler identification is GNU 6.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
PING github.com (13.250.177.223) 56(84) bytes of data.
--- github.com ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
CMake Warning at CMakeLists.txt:96 (message):
Building in off-line mode, some dependencies might not be up-to-date!
-- CLang compiler found: /usr/bin/clang-3.9
-- Using enhanced OpenCL features of Clang 3.9+!
-- LLVM-link found: /usr/bin/llvm-link
-- Compiling LLVM library front-end with LLVM in version 3.9.1 located in '/usr/lib/llvm-3.9/lib'
-- found clang-format: /usr/bin/clang-format
-- build deb package...
-- Debian package expects VC4CL standard library PCH to be located in: /usr/local/include/vc4cl-stdlib/VC4CLStdLib.h.pch
-- Configuring done
-- Generating done
-- Build files have been written to: /home/pi/TestOpenCL/VC4C
the VC4CL CMake output:
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29")
-- The C compiler identification is GNU 6.3.0
-- The CXX compiler identification is GNU 6.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
PING github.com (13.250.177.223) 56(84) bytes of data.
--- github.com ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
CMake Warning at CMakeLists.txt:77 (message):
Building in off-line mode, some dependencies might not be up-to-date!
-- Looking for CL_VERSION_2_0
-- Looking for CL_VERSION_2_0 - found
-- Found OpenCL: /usr/lib/arm-linux-gnueabihf/libOpenCL.so (found version "2.0")
-- Found headers for OpenCL up to version 2.0 located in /usr/include
-- VC4C library found: /home/pi/TestOpenCL/VC4C/build/libVC4CC.so
-- VC4C compiler header found at: /home/pi/TestOpenCL/VC4C/include/VC4C.h
-- Building with ICD support
-- Checking for one of the modules 'ocl-icd>=1.3'
-- Found Khronos ICD Loader in version 2.2.11 in /usr/lib/arm-linux-gnueabihf
-- Enabling register-poking to run kernels
-- found clang-format: /usr/bin/clang-format
-- build deb package...
-- Configuring done
-- Generating done
-- Build files have been written to: /home/pi/TestOpenCL/VC4CL
And the clinfo (built from git) output:
Number of platforms 0
Then I ran
sudo ./build/test/TestVC4C
and got the output below:
./example/fft2_2.cl
[E] Thu Jun 7 11:01:32 2018: Errors in precompilation:
[E] Thu Jun 7 11:01:32 2018: Expected no forward declarations!
!3 = !{}
incorrect number of operands in llvm.ident metadata
!0 = !{i32 0, i32 1}
/usr/bin/llvm-link: /tmp/vc4c-KQBI2v: error: input module is broken!
[E] Thu Jun 7 11:01:32 2018: (1) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xf4 [0x76b776a8]
[E] Thu Jun 7 11:01:32 2018: (2) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : +0x6a27d0 [0x76e017d0]
[E] Thu Jun 7 11:01:32 2018: (3) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : vc4c::precompilation::linkLLVMModules(std::vector, std::allocator > >&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)+0x364 [0x76e02770]
[E] Thu Jun 7 11:01:32 2018: (4) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : vc4c::precompilation::linkInStdlibModule(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)3>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)+0x78 [0x76e01f9c]
[E] Thu Jun 7 11:01:32 2018: (5) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : std::_Function_handler&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&), void (*)(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)3>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)>::_M_invoke(std::_Any_data const&, vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)3>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)+0x60 [0x76e04ef0]
[E] Thu Jun 7 11:01:32 2018: (6) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)>::operator()(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)3>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&) const+0x78 [0x76e03fb0]
[E] Thu Jun 7 11:01:32 2018: (7) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> vc4c::precompilation::chainSteps<(vc4c::SourceType)3, (vc4c::SourceType)1, (vc4c::SourceType)3>(std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&, std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&)::{lambda(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)#1}::operator()(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&) const+0xd0 [0x76e02e9c]
[E] Thu Jun 7 11:01:32 2018: (8) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : std::_Function_handler&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&), std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> vc4c::precompilation::chainSteps<(vc4c::SourceType)3, (vc4c::SourceType)1, (vc4c::SourceType)3>(std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&, std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&)::{lambda(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)#1}>::_M_invoke(std::_Any_data const&, vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)+0x5c [0x76e05130]
[E] Thu Jun 7 11:01:32 2018: (9) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)>::operator()(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&) const+0x78 [0x76e03d00]
[E] Thu Jun 7 11:01:32 2018: (10) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : vc4c::Precompiler::run(std::unique_ptr >&, vc4c::SourceType, std::__cxx11::basic_string, std::allocator > const&, vc4c::Optional, std::allocator > >)+0x4ac [0x76e09af4]
[E] Thu Jun 7 11:01:32 2018: (11) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : vc4c::Precompiler::precompile(std::istream&, std::unique_ptr >&, vc4c::Configuration, std::__cxx11::basic_string, std::allocator > const&, vc4c::Optional, std::allocator > > const&, vc4c::Optional, std::allocator > >)+0x158 [0x76e07a80]
[E] Thu Jun 7 11:01:32 2018: (12) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : vc4c::Compiler::compile(std::istream&, std::ostream&, vc4c::Configuration, std::__cxx11::basic_string, std::allocator > const&, vc4c::Optional, std::allocator > > const&)+0xc4 [0x76b79898]
[E] Thu Jun 7 11:01:32 2018: (13) ./build/test/TestVC4C : RegressionTest::testRegression(std::__cxx11::basic_string, std::allocator >, std::__cxx11::basic_string, std::allocator >, vc4c::Frontend)+0xb0 [0xbea78]
[E] Thu Jun 7 11:01:32 2018: (14) ./build/test/TestVC4C : Test::Suite::TestMethod::TestMethod, std::allocator >, std::__cxx11::basic_string, std::allocator >, vc4c::Frontend>(std::__cxx11::basic_string, std::allocator > const&, void (Test::Suite::*)(std::__cxx11::basic_string, std::allocator >, std::__cxx11::basic_string, std::allocator >, vc4c::Frontend), std::__cxx11::basic_string, std::allocator > const, std::__cxx11::basic_string, std::allocator > const, vc4c::Frontend const)::{lambda(Test::Suite*)#1}::operator()(Test::Suite*) const+0xbc [0xd8afc]
[E] Thu Jun 7 11:01:32 2018: (15) ./build/test/TestVC4C : std::_Function_handler, std::allocator >, std::__cxx11::basic_string, std::allocator >, vc4c::Frontend>(std::__cxx11::basic_string, std::allocator > const&, void (Test::Suite::*)(std::__cxx11::basic_string, std::allocator >, std::__cxx11::basic_string, std::allocator >, vc4c::Frontend), std::__cxx11::basic_string, std::allocator > const, std::__cxx11::basic_string, std::allocator > const, vc4c::Frontend const)::{lambda(Test::Suite*)#1}>::_M_invoke(std::_Any_data const&, Test::Suite*&&)+0x3c [0xda9b4]
[E] Thu Jun 7 11:01:32 2018: (16) /home/pi/TestOpenCL/VC4C/build/cpptest-lite/src/cpptest-lite-project-build/libcpptest-lite.so.1.1.2 : Test::Suite::runTestMethod(Test::Suite::TestMethod const&)+0xc0 [0x767486ac]
[E] Thu Jun 7 11:01:32 2018: (17) /home/pi/TestOpenCL/VC4C/build/cpptest-lite/src/cpptest-lite-project-build/libcpptest-lite.so.1.1.2 : Test::Suite::run(Test::Output&, bool)+0x98 [0x76748958]
[E] Thu Jun 7 11:01:32 2018: (18) ./build/test/TestVC4C : Test::runSuites(int, char**)+0xae0 [0xdd0e4]
[E] Thu Jun 7 11:01:32 2018: (19) ./build/test/TestVC4C : main+0xc54 [0xde024]
[E] Thu Jun 7 11:01:32 2018: (20) /lib/arm-linux-gnueabihf/libc.so.6 : __libc_start_main+0x114 [0x73db4678]
[E] Thu Jun 7 11:01:32 2018: Compiler threw exception: Pre-compilation: Error in precompilation: Expected no forward declarations!
!3 = !{}
incorrect number of operands in llvm.ident metadata
!0 = !{i32 0, i32 1}
/usr/bin/llvm-link: /tmp/vc4c-KQBI2v: error: input module is broken!
Test-method 'RegressionTest::testRegression("./example/fft2_2.cl", "", 0)' failed with exception!
Exception: Pre-compilation: Error in precompilation: Expected no forward declarations!
!3 = !{}
incorrect number of operands in llvm.ident metadata
!0 = !{i32 0, i32 1}
/usr/bin/llvm-link: /tmp/vc4c-KQBI2v: error: input module is broken!
Errno: 0
Error: Success
./example/fibonacci.cl
[E] Thu Jun 7 11:01:33 2018: Errors in precompilation:
[E] Thu Jun 7 11:01:33 2018: Expected no forward declarations!
!3 = !{}
incorrect number of operands in llvm.ident metadata
!0 = !{i32 0, i32 1}
/usr/bin/llvm-link: /tmp/vc4c-oqcFrK: error: input module is broken!
[E] Thu Jun 7 11:01:33 2018: (1) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string, std::allocator > const&, std::__cxx11::basic_string, std::allocator > const&)+0xf4 [0x76b776a8]
[E] Thu Jun 7 11:01:33 2018: (2) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : +0x6a27d0 [0x76e017d0]
[E] Thu Jun 7 11:01:33 2018: (3) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : vc4c::precompilation::linkLLVMModules(std::vector, std::allocator > >&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)+0x364 [0x76e02770]
[E] Thu Jun 7 11:01:33 2018: (4) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : vc4c::precompilation::linkInStdlibModule(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)3>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)+0x78 [0x76e01f9c]
[E] Thu Jun 7 11:01:33 2018: (5) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : std::_Function_handler&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&), void (*)(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)3>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)>::_M_invoke(std::_Any_data const&, vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)3>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)+0x60 [0x76e04ef0]
[E] Thu Jun 7 11:01:33 2018: (6) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)>::operator()(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)3>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&) const+0x78 [0x76e03fb0]
[E] Thu Jun 7 11:01:33 2018: (7) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> vc4c::precompilation::chainSteps<(vc4c::SourceType)3, (vc4c::SourceType)1, (vc4c::SourceType)3>(std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&, std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&)::{lambda(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)#1}::operator()(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&) const+0xd0 [0x76e02e9c]
[E] Thu Jun 7 11:01:33 2018: (8) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : std::_Function_handler&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&), std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> vc4c::precompilation::chainSteps<(vc4c::SourceType)3, (vc4c::SourceType)1, (vc4c::SourceType)3>(std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&, std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&)::{lambda(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)#1}>::_M_invoke(std::_Any_data const&, vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)+0x5c [0x76e05130]
[E] Thu Jun 7 11:01:33 2018: (9) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)>::operator()(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&) const+0x78 [0x76e03d00]
[E] Thu Jun 7 11:01:33 2018: (10) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : vc4c::Precompiler::run(std::unique_ptr >&, vc4c::SourceType, std::__cxx11::basic_string, std::allocator > const&, vc4c::Optional, std::allocator > >)+0x4ac [0x76e09af4]
[E] Thu Jun 7 11:01:33 2018: (11) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : vc4c::Precompiler::precompile(std::istream&, std::unique_ptr >&, vc4c::Configuration, std::__cxx11::basic_string, std::allocator > const&, vc4c::Optional, std::allocator > > const&, vc4c::Optional, std::allocator > >)+0x158 [0x76e07a80]
[E] Thu Jun 7 11:01:33 2018: (12) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : vc4c::Compiler::compile(std::istream&, std::ostream&, vc4c::Configuration, std::__cxx11::basic_string, std::allocator > const&, vc4c::Optional, std::allocator > > const&)+0xc4 [0x76b79898]
[E] Thu Jun 7 11:01:33 2018: (13) ./build/test/TestVC4C : RegressionTest::testRegression(std::__cxx11::basic_string, std::allocator >, std::__cxx11::basic_string, std::allocator >, vc4c::Frontend)+0xb0 [0xbea78]
[E] Thu Jun 7 11:01:33 2018: (14) ./build/test/TestVC4C : Test::Suite::TestMethod::TestMethod, std::allocator >, std::__cxx11::basic_string, std::allocator >, vc4c::Frontend>(std::__cxx11::basic_string, std::allocator > const&, void (Test::Suite::*)(std::__cxx11::basic_string, std::allocator >, std::__cxx11::basic_string, std::allocator >, vc4c::Frontend), std::__cxx11::basic_string, std::allocator > const, std::__cxx11::basic_string, std::allocator > const, vc4c::Frontend const)::{lambda(Test::Suite*)#1}::operator()(Test::Suite*) const+0xbc [0xd8afc]
[E] Thu Jun 7 11:01:33 2018: (15) ./build/test/TestVC4C : std::_Function_handler, std::allocator >, std::__cxx11::basic_string, std::allocator >, vc4c::Frontend>(std::__cxx11::basic_string, std::allocator > const&, void (Test::Suite::*)(std::__cxx11::basic_string, std::allocator >, std::__cxx11::basic_string, std::allocator >, vc4c::Frontend), std::__cxx11::basic_string, std::allocator > const, std::__cxx11::basic_string, std::allocator > const, vc4c::Frontend const)::{lambda(Test::Suite*)#1}>::_M_invoke(std::_Any_data const&, Test::Suite*&&)+0x3c [0xda9b4]
[E] Thu Jun 7 11:01:33 2018: (16) /home/pi/TestOpenCL/VC4C/build/cpptest-lite/src/cpptest-lite-project-build/libcpptest-lite.so.1.1.2 : Test::Suite::runTestMethod(Test::Suite::TestMethod const&)+0xc0 [0x767486ac]
[E] Thu Jun 7 11:01:33 2018: (17) /home/pi/TestOpenCL/VC4C/build/cpptest-lite/src/cpptest-lite-project-build/libcpptest-lite.so.1.1.2 : Test::Suite::run(Test::Output&, bool)+0x98 [0x76748958]
[E] Thu Jun 7 11:01:33 2018: (18) ./build/test/TestVC4C : Test::runSuites(int, char**)+0xae0 [0xdd0e4]
[E] Thu Jun 7 11:01:33 2018: (19) ./build/test/TestVC4C : main+0xc54 [0xde024]
[E] Thu Jun 7 11:01:33 2018: (20) /lib/arm-linux-gnueabihf/libc.so.6 : __libc_start_main+0x114 [0x73db4678]
[E] Thu Jun 7 11:01:36 2018: Compiler threw exception: Pre-compilation: Error in precompilation: Expected no forward declarations!
!3 = !{}
incorrect number of operands in llvm.ident metadata
!0 = !{i32 0, i32 1}
/usr/bin/llvm-link: /tmp/vc4c-oqcFrK: error: input module is broken!
Test-method 'RegressionTest::testRegression("./example/fibonacci.cl", "", 0)' failed with exception!
Exception: Pre-compilation: Error in precompilation: Expected no forward declarations!
!3 = !{}
incorrect number of operands in llvm.ident metadata
!0 = !{i32 0, i32 1}
/usr/bin/llvm-link: /tmp/vc4c-oqcFrK: error: input module is broken!
Errno: 0
Error: Success
./example/hello_world.cl
[E] Thu Jun 7 11:01:37 2018: Errors in precompilation:
[E] Thu Jun 7 11:01:37 2018: Expected no forward declarations!
!3 = !{}
incorrect number of operands in llvm.ident metadata
!0 = !{i32 0, i32 1}
/usr/bin/llvm-link: /tmp/vc4c-Ky3hHy: error: input module is broken!
[E] Thu Jun 7 11:01:37 2018: (1) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string, std::allocator > const&, std::__cxx11::basic_string, std::allocator > const&)+0xf4 [0x76b776a8]
[E] Thu Jun 7 11:01:37 2018: (2) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : +0x6a27d0 [0x76e017d0]
[E] Thu Jun 7 11:01:37 2018: (3) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : vc4c::precompilation::linkLLVMModules(std::vector, std::allocator > >&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)+0x364 [0x76e02770]
[E] Thu Jun 7 11:01:37 2018: (4) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : vc4c::precompilation::linkInStdlibModule(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)3>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)+0x78 [0x76e01f9c]
[E] Thu Jun 7 11:01:37 2018: (5) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : std::_Function_handler&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&), void (*)(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)3>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)>::_M_invoke(std::_Any_data const&, vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)3>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)+0x60 [0x76e04ef0]
[E] Thu Jun 7 11:01:37 2018: (6) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)>::operator()(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)3>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&) const+0x78 [0x76e03fb0]
[E] Thu Jun 7 11:01:37 2018: (7) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> vc4c::precompilation::chainSteps<(vc4c::SourceType)3, (vc4c::SourceType)1, (vc4c::SourceType)3>(std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&, std::function&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&)::{lambda(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)#1}::operator()(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&) const+0xd0 [0x76e02e9c]
[E] Thu Jun 7 11:01:37 2018: (8) /home/pi/TestOpenCL/VC4C/build/libVC4CC.so.1.2 : std::_Function_handler&&, std::__cxx11::basic_string, std::allocator > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&), std::function
Hello everyone! Has anyone tried to use BOINC with GPU compute on the Raspberry Pi?
Having this would greatly increase the available computing power and make cheap, energy-efficient hardware usable for distributed computing projects, which could be a big help to scientific research.
Some applications leak device memory, e.g. the OpenCL CTS test buffers/test_buffers buffer_copy.
Possible reasons:
We need to make sure that all resources (esp. device memory) are freed when the application exits!
NOTE: this could be related to the background thread sometimes being aborted (instead of exiting normally).
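One possible way to approach the cleanup requirement above is to tie every device allocation to an owning object and keep a last-resort "free everything" step for shutdown. The following is only a minimal sketch of that idea, not VC4CL's actual code: `FakeDevice`, `DeviceBuffer`, `liveBytes` and `releaseAll` are hypothetical names, and the allocator merely counts bytes so that a leak becomes observable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-in for a device memory allocator (NOT the VC4CL API).
// It only tracks handles and sizes so leaked allocations can be detected.
class FakeDevice {
public:
    std::uint32_t allocate(std::size_t bytes) {
        std::lock_guard<std::mutex> guard(lock);
        const std::uint32_t handle = nextHandle++;
        live.emplace(handle, bytes);
        return handle;
    }

    void release(std::uint32_t handle) {
        std::lock_guard<std::mutex> guard(lock);
        live.erase(handle);
    }

    // Total number of bytes that are still allocated.
    std::size_t liveBytes() const {
        std::lock_guard<std::mutex> guard(lock);
        std::size_t total = 0;
        for (const auto& entry : live)
            total += entry.second;
        return total;
    }

    // Last-resort cleanup for shutdown: frees whatever is still allocated
    // and returns how many handles had been leaked up to this point.
    std::size_t releaseAll() {
        std::lock_guard<std::mutex> guard(lock);
        const std::size_t leaked = live.size();
        live.clear();
        return leaked;
    }

private:
    mutable std::mutex lock;
    std::unordered_map<std::uint32_t, std::size_t> live;
    std::uint32_t nextHandle = 1;
};

// RAII owner: the allocation is released when the object goes out of scope,
// including on early returns and exceptions.
class DeviceBuffer {
public:
    DeviceBuffer(FakeDevice& dev, std::size_t bytes)
        : device(&dev), handle(dev.allocate(bytes)) {}

    ~DeviceBuffer() {
        if (device != nullptr)
            device->release(handle);
    }

    DeviceBuffer(const DeviceBuffer&) = delete;
    DeviceBuffer& operator=(const DeviceBuffer&) = delete;

    // Moving transfers ownership; the moved-from object releases nothing.
    DeviceBuffer(DeviceBuffer&& other) noexcept
        : device(other.device), handle(other.handle) {
        other.device = nullptr;
    }

private:
    FakeDevice* device;
    std::uint32_t handle;
};
```

In such a design, a scope-bound `DeviceBuffer` covers the normal exit path, while something like `releaseAll()` could be called from a shutdown hook to catch allocations whose owner never ran its destructor, for example because a background thread was aborted. Whether this matches the real cause of the leak is an open question.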
[W] Tue Jun 19 18:10:46 2018: 64-bit operations are not supported by the VideoCore IV architecture, further compilation may fail!
[W] Tue Jun 19 18:10:46 2018: 64-bit operations are not supported by the VideoCore IV architecture, further compilation may fail!
[W] Tue Jun 19 18:10:46 2018: 64-bit operations are not supported by the VideoCore IV architecture, further compilation may fail!
[E] Tue Jun 19 18:11:23 2018: Failed to in-line or intrinsify function-call: <16 x f16> %call31 = <16 x f16> madDhS_S_(<16 x f16> %splat.splat, <16 x f16> %vecinit29, <16 x f16> %splat.splat)
[W] Tue Jun 19 18:11:23 2018: Candidates:
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v1_local_offset(f32* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: i32 _Z12get_group_idj(i32 %dim
[W] Tue Jun 19 18:11:23 2018: i32 _Z14get_local_sizej(i32 %dim
[W] Tue Jun 19 18:11:23 2018: i32 _Z12get_local_idj(i32 %dim
[W] Tue Jun 19 18:11:23 2018: i32 _Z13get_global_idj(i32 %dim
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v2_local_offset(<2 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v4_local_offset(<4 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v8_local_offset(<8 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v16_local_offset(<16 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v1_global_offset(f32* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: i32 _Z15get_global_sizej(i32 %dim
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v2_global_offset(<2 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v4_global_offset(<4 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v8_global_offset(<8 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v16_global_offset(<16 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v1(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: f32 _Z3madfff(f32 %a, f32 %b, f32 %c
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v2(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: <2 x f32> _Z3madfS_S_(<2 x f32> %a, <2 x f32> %b, <2 x f32> %c
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v4(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: <4 x f32> _Z3madfS_S_(<4 x f32> %a, <4 x f32> %b, <4 x f32> %c
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v8(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: <8 x f32> _Z3madfS_S_(<8 x f32> %a, <8 x f32> %b, <8 x f32> %c
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v16(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: <16 x f32> _Z3madfS_S_(<16 x f32> %a, <16 x f32> %b, <16 x f32> %c
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v1(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v2(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v4(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v8(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v16(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v1(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v2(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v4(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v8(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v16(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v1(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v2(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v4(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v8(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v16(i32* %ptr, i32 %_A
[E] Tue Jun 19 18:11:23 2018: Failed to in-line or intrinsify function-call: <8 x f16> %call15 = <8 x f16> madDhS_S_(<8 x f16> %splat.splat, <8 x f16> %vecinit13, <8 x f16> %splat.splat)
[W] Tue Jun 19 18:11:23 2018: Candidates:
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v1_local_offset(f32* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: i32 _Z12get_group_idj(i32 %dim
[W] Tue Jun 19 18:11:23 2018: i32 _Z14get_local_sizej(i32 %dim
[W] Tue Jun 19 18:11:23 2018: i32 _Z12get_local_idj(i32 %dim
[W] Tue Jun 19 18:11:23 2018: i32 _Z13get_global_idj(i32 %dim
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v2_local_offset(<2 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v4_local_offset(<4 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v8_local_offset(<8 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v16_local_offset(<16 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v1_global_offset(f32* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: i32 _Z15get_global_sizej(i32 %dim
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v2_global_offset(<2 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v4_global_offset(<4 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v8_global_offset(<8 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v16_global_offset(<16 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v1(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: f32 _Z3madfff(f32 %a, f32 %b, f32 %c
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v2(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: <2 x f32> _Z3madfS_S_(<2 x f32> %a, <2 x f32> %b, <2 x f32> %c
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v4(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: <4 x f32> _Z3madfS_S_(<4 x f32> %a, <4 x f32> %b, <4 x f32> %c
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v8(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: <8 x f32> _Z3madfS_S_(<8 x f32> %a, <8 x f32> %b, <8 x f32> %c
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v16(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: <16 x f32> _Z3madfS_S_(<16 x f32> %a, <16 x f32> %b, <16 x f32> %c
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v1(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v2(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v4(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v8(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v16(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v1(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v2(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v4(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v8(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v16(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v1(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v2(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v4(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v8(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v16(i32* %ptr, i32 %_A
[E] Tue Jun 19 18:11:23 2018: Failed to in-line or intrinsify function-call: <4 x f16> %call7 = <4 x f16> madDhS_S_(<4 x f16> %splat.splat, <4 x f16> %vecinit5, <4 x f16> %splat.splat)
[W] Tue Jun 19 18:11:23 2018: Candidates:
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v1_local_offset(f32* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: i32 _Z12get_group_idj(i32 %dim
[W] Tue Jun 19 18:11:23 2018: i32 _Z14get_local_sizej(i32 %dim
[W] Tue Jun 19 18:11:23 2018: i32 _Z12get_local_idj(i32 %dim
[W] Tue Jun 19 18:11:23 2018: i32 _Z13get_global_idj(i32 %dim
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v2_local_offset(<2 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v4_local_offset(<4 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v8_local_offset(<8 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v16_local_offset(<16 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v1_global_offset(f32* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: i32 _Z15get_global_sizej(i32 %dim
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v2_global_offset(<2 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v4_global_offset(<4 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v8_global_offset(<8 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v16_global_offset(<16 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v1(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: f32 _Z3madfff(f32 %a, f32 %b, f32 %c
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v2(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: <2 x f32> _Z3madfS_S_(<2 x f32> %a, <2 x f32> %b, <2 x f32> %c
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v4(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: <4 x f32> _Z3madfS_S_(<4 x f32> %a, <4 x f32> %b, <4 x f32> %c
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v8(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: <8 x f32> _Z3madfS_S_(<8 x f32> %a, <8 x f32> %b, <8 x f32> %c
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v16(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: <16 x f32> _Z3madfS_S_(<16 x f32> %a, <16 x f32> %b, <16 x f32> %c
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v1(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v2(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v4(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v8(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v16(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v1(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v2(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v4(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v8(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v16(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v1(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v2(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v4(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v8(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v16(i32* %ptr, i32 %_A
[E] Tue Jun 19 18:11:23 2018: Failed to in-line or intrinsify function-call: <2 x f16> %call3 = <2 x f16> madDhS_S_(<2 x f16> %splat.splat, <2 x f16> %vecinit1, <2 x f16> %splat.splat)
[W] Tue Jun 19 18:11:23 2018: Candidates:
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v1_local_offset(f32* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: i32 _Z12get_group_idj(i32 %dim
[W] Tue Jun 19 18:11:23 2018: i32 _Z14get_local_sizej(i32 %dim
[W] Tue Jun 19 18:11:23 2018: i32 _Z12get_local_idj(i32 %dim
[W] Tue Jun 19 18:11:23 2018: i32 _Z13get_global_idj(i32 %dim
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v2_local_offset(<2 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v4_local_offset(<4 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v8_local_offset(<8 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v16_local_offset(<16 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v1_global_offset(f32* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: i32 _Z15get_global_sizej(i32 %dim
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v2_global_offset(<2 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v4_global_offset(<4 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v8_global_offset(<8 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void global_bandwidth_v16_global_offset(<16 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v1(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: f32 _Z3madfff(f32 %a, f32 %b, f32 %c
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v2(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: <2 x f32> _Z3madfS_S_(<2 x f32> %a, <2 x f32> %b, <2 x f32> %c
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v4(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: <4 x f32> _Z3madfS_S_(<4 x f32> %a, <4 x f32> %b, <4 x f32> %c
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v8(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: <8 x f32> _Z3madfS_S_(<8 x f32> %a, <8 x f32> %b, <8 x f32> %c
[W] Tue Jun 19 18:11:23 2018: void compute_sp_v16(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:23 2018: <16 x f32> _Z3madfS_S_(<16 x f32> %a, <16 x f32> %b, <16 x f32> %c
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v1(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v2(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v4(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v8(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_hp_v16(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v1(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v2(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v4(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v8(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_dp_v16(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v1(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v2(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v4(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v8(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:23 2018: void compute_integer_v16(i32* %ptr, i32 %_A
[E] Tue Jun 19 18:11:23 2018: (1) /usr/local/lib/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xf4 [0x7682e5d0]
[E] Tue Jun 19 18:11:23 2018: (2) /usr/local/lib/libVC4CC.so.1.2 : +0x6cd3a0 [0x76aad3a0]
[E] Tue Jun 19 18:11:23 2018: (3) /usr/local/lib/libVC4CC.so.1.2 : vc4c::periphery::VPMArea::toWriteSetup(vc4c::DataType const&) const+0x154 [0x76ab1af4]
[E] Tue Jun 19 18:11:23 2018: (4) /usr/local/lib/libVC4CC.so.1.2 : vc4c::periphery::VPM::insertWriteVPM(vc4c::Method&, vc4c::InstructionWalker, vc4c::Value const&, vc4c::periphery::VPMArea const*, bool, vc4c::Value const&)+0xd4 [0x76aafeac]
[E] Tue Jun 19 18:11:23 2018: (5) /usr/local/lib/libVC4CC.so.1.2 : vc4c::periphery::insertWriteDMA(vc4c::Method&, vc4c::InstructionWalker, vc4c::Value const&, vc4c::Value const&, bool)+0xb0 [0x76aaf0b4]
[E] Tue Jun 19 18:11:23 2018: (6) /usr/local/lib/libVC4CC.so.1.2 : +0x669020 [0x76a49020]
[E] Tue Jun 19 18:11:23 2018: (7) /usr/local/lib/libVC4CC.so.1.2 : +0x6696a0 [0x76a496a0]
[E] Tue Jun 19 18:11:23 2018: (8) /usr/local/lib/libVC4CC.so.1.2 : vc4c::normalization::mapMemoryAccess(vc4c::Module const&, vc4c::Method&, vc4c::Configuration const&)+0x3b4 [0x76a4acb0]
[E] Tue Jun 19 18:11:23 2018: (9) /usr/local/lib/libVC4CC.so.1.2 : vc4c::normalization::Normalizer::normalizeMethod(vc4c::Module&, vc4c::Method&) const+0x22c [0x76a580dc]
[E] Tue Jun 19 18:11:23 2018: (10) /usr/local/lib/libVC4CC.so.1.2 : +0x6775f8 [0x76a575f8]
[E] Tue Jun 19 18:11:23 2018: (11) /usr/local/lib/libVC4CC.so.1.2 : +0x678f74 [0x76a58f74]
[E] Tue Jun 19 18:11:23 2018: (12) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Method* const&)>::operator()(vc4c::Method* const&) const+0x54 [0x76834dec]
[E] Tue Jun 19 18:11:23 2018: (13) /usr/local/lib/libVC4CC.so.1.2 : void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}::operator()() const+0xc8 [0x768334ec]
[E] Tue Jun 19 18:11:23 2018: (14) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (), void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}>::_M_invoke(std::_Any_data const&)+0x24 [0x76838380]
[E] Tue Jun 19 18:11:23 2018: (15) /usr/local/lib/libVC4CC.so.1.2 : std::function<void ()>::operator()() const+0x40 [0x7681f2a4]
[E] Tue Jun 19 18:11:23 2018: (16) /usr/local/lib/libVC4CC.so.1.2 : +0x43e430 [0x7681e430]
[E] Tue Jun 19 18:11:23 2018: (17) /usr/local/lib/libVC4CC.so.1.2 : +0x43ed1c [0x7681ed1c]
[E] Tue Jun 19 18:11:23 2018: (18) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec88 [0x7681ec88]
[E] Tue Jun 19 18:11:23 2018: (19) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec60 [0x7681ec60]
[E] Tue Jun 19 18:11:23 2018: (20) /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 : +0x9c9dc [0x76ec59dc]
[E] Tue Jun 19 18:11:23 2018: (1) /usr/local/lib/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xf4 [0x7682e5d0]
[E] Tue Jun 19 18:11:23 2018: (2) /usr/local/lib/libVC4CC.so.1.2 : +0x677344 [0x76a57344]
[E] Tue Jun 19 18:11:23 2018: (3) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&), void (*)(vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&)>::_M_invoke(std::_Any_data const&, vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker&&, vc4c::Configuration const&)+0x70 [0x76a5c390]
[E] Tue Jun 19 18:11:23 2018: (4) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&)>::operator()(vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&) const+0x94 [0x76a5bd00]
[E] Tue Jun 19 18:11:23 2018: (5) /usr/local/lib/libVC4CC.so.1.2 : +0x677558 [0x76a57558]
[E] Tue Jun 19 18:11:23 2018: (6) /usr/local/lib/libVC4CC.so.1.2 : vc4c::normalization::Normalizer::normalizeMethod(vc4c::Module&, vc4c::Method&) const+0x38c [0x76a5823c]
[E] Tue Jun 19 18:11:23 2018: (7) /usr/local/lib/libVC4CC.so.1.2 : +0x6775f8 [0x76a575f8]
[E] Tue Jun 19 18:11:23 2018: (8) /usr/local/lib/libVC4CC.so.1.2 : +0x678f74 [0x76a58f74]
[E] Tue Jun 19 18:11:23 2018: (9) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Method* const&)>::operator()(vc4c::Method* const&) const+0x54 [0x76834dec]
[E] Tue Jun 19 18:11:23 2018: (10) /usr/local/lib/libVC4CC.so.1.2 : void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}::operator()() const+0xc8 [0x768334ec]
[E] Tue Jun 19 18:11:23 2018: (11) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (), void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}>::_M_invoke(std::_Any_data const&)+0x24 [0x76838380]
[E] Tue Jun 19 18:11:23 2018: (12) /usr/local/lib/libVC4CC.so.1.2 : std::function<void ()>::operator()() const+0x40 [0x7681f2a4]
[E] Tue Jun 19 18:11:23 2018: (13) /usr/local/lib/libVC4CC.so.1.2 : +0x43e430 [0x7681e430]
[E] Tue Jun 19 18:11:23 2018: (14) /usr/local/lib/libVC4CC.so.1.2 : +0x43ed1c [0x7681ed1c]
[E] Tue Jun 19 18:11:23 2018: (15) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec88 [0x7681ec88]
[E] Tue Jun 19 18:11:23 2018: (16) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec60 [0x7681ec60]
[E] Tue Jun 19 18:11:23 2018: (17) /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 : +0x9c9dc [0x76ec59dc]
[E] Tue Jun 19 18:11:24 2018: Failed to in-line or intrinsify function-call: f16 %call2 = f16 madDhDhDh(f16 %conv1, f16 %conv, f16 %conv1)
[W] Tue Jun 19 18:11:24 2018: Candidates:
[W] Tue Jun 19 18:11:24 2018: void global_bandwidth_v1_local_offset(f32* %A, f32* %B
[W] Tue Jun 19 18:11:24 2018: i32 _Z12get_group_idj(i32 %dim
[W] Tue Jun 19 18:11:24 2018: i32 _Z14get_local_sizej(i32 %dim
[W] Tue Jun 19 18:11:24 2018: i32 _Z12get_local_idj(i32 %dim
[W] Tue Jun 19 18:11:24 2018: i32 _Z13get_global_idj(i32 %dim
[W] Tue Jun 19 18:11:24 2018: void global_bandwidth_v2_local_offset(<2 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:24 2018: void global_bandwidth_v4_local_offset(<4 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:24 2018: void global_bandwidth_v8_local_offset(<8 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:24 2018: void global_bandwidth_v16_local_offset(<16 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:24 2018: void global_bandwidth_v1_global_offset(f32* %A, f32* %B
[W] Tue Jun 19 18:11:24 2018: i32 _Z15get_global_sizej(i32 %dim
[W] Tue Jun 19 18:11:24 2018: void global_bandwidth_v2_global_offset(<2 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:24 2018: void global_bandwidth_v4_global_offset(<4 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:24 2018: void global_bandwidth_v8_global_offset(<8 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:24 2018: void global_bandwidth_v16_global_offset(<16 x f32>* %A, f32* %B
[W] Tue Jun 19 18:11:24 2018: void compute_sp_v1(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:24 2018: f32 _Z3madfff(f32 %a, f32 %b, f32 %c
[W] Tue Jun 19 18:11:24 2018: void compute_sp_v2(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:24 2018: <2 x f32> _Z3madfS_S_(<2 x f32> %a, <2 x f32> %b, <2 x f32> %c
[W] Tue Jun 19 18:11:24 2018: void compute_sp_v4(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:24 2018: <4 x f32> _Z3madfS_S_(<4 x f32> %a, <4 x f32> %b, <4 x f32> %c
[W] Tue Jun 19 18:11:24 2018: void compute_sp_v8(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:24 2018: <8 x f32> _Z3madfS_S_(<8 x f32> %a, <8 x f32> %b, <8 x f32> %c
[W] Tue Jun 19 18:11:24 2018: void compute_sp_v16(f32* %ptr, f32 %_A
[W] Tue Jun 19 18:11:24 2018: <16 x f32> _Z3madfS_S_(<16 x f32> %a, <16 x f32> %b, <16 x f32> %c
[W] Tue Jun 19 18:11:24 2018: void compute_hp_v1(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:24 2018: void compute_hp_v2(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:24 2018: void compute_hp_v4(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:24 2018: void compute_hp_v8(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:24 2018: void compute_hp_v16(f16* %ptr, f32 %_B
[W] Tue Jun 19 18:11:24 2018: void compute_dp_v1(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:24 2018: void compute_dp_v2(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:24 2018: void compute_dp_v4(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:24 2018: void compute_dp_v8(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:24 2018: void compute_dp_v16(f64* %ptr, f64 %_A
[W] Tue Jun 19 18:11:24 2018: void compute_integer_v1(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:24 2018: void compute_integer_v2(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:24 2018: void compute_integer_v4(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:24 2018: void compute_integer_v8(i32* %ptr, i32 %_A
[W] Tue Jun 19 18:11:24 2018: void compute_integer_v16(i32* %ptr, i32 %_A
[E] Tue Jun 19 18:11:24 2018: (1) /usr/local/lib/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xf4 [0x7682e5d0]
[E] Tue Jun 19 18:11:24 2018: (2) /usr/local/lib/libVC4CC.so.1.2 : +0x677344 [0x76a57344]
[E] Tue Jun 19 18:11:24 2018: (3) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&), void (*)(vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&)>::_M_invoke(std::_Any_data const&, vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker&&, vc4c::Configuration const&)+0x70 [0x76a5c390]
[E] Tue Jun 19 18:11:24 2018: (4) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&)>::operator()(vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&) const+0x94 [0x76a5bd00]
[E] Tue Jun 19 18:11:24 2018: (5) /usr/local/lib/libVC4CC.so.1.2 : +0x677558 [0x76a57558]
[E] Tue Jun 19 18:11:24 2018: (6) /usr/local/lib/libVC4CC.so.1.2 : vc4c::normalization::Normalizer::normalizeMethod(vc4c::Module&, vc4c::Method&) const+0x38c [0x76a5823c]
[E] Tue Jun 19 18:11:24 2018: (7) /usr/local/lib/libVC4CC.so.1.2 : +0x6775f8 [0x76a575f8]
[E] Tue Jun 19 18:11:24 2018: (8) /usr/local/lib/libVC4CC.so.1.2 : +0x678f74 [0x76a58f74]
[E] Tue Jun 19 18:11:24 2018: (9) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Method* const&)>::operator()(vc4c::Method* const&) const+0x54 [0x76834dec]
[E] Tue Jun 19 18:11:24 2018: (10) /usr/local/lib/libVC4CC.so.1.2 : void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}::operator()() const+0xc8 [0x768334ec]
[E] Tue Jun 19 18:11:24 2018: (11) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (), void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}>::_M_invoke(std::_Any_data const&)+0x24 [0x76838380]
[E] Tue Jun 19 18:11:24 2018: (12) /usr/local/lib/libVC4CC.so.1.2 : std::function<void ()>::operator()() const+0x40 [0x7681f2a4]
[E] Tue Jun 19 18:11:24 2018: (13) /usr/local/lib/libVC4CC.so.1.2 : +0x43e430 [0x7681e430]
[E] Tue Jun 19 18:11:24 2018: (14) /usr/local/lib/libVC4CC.so.1.2 : +0x43ed1c [0x7681ed1c]
[E] Tue Jun 19 18:11:24 2018: (15) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec88 [0x7681ec88]
[E] Tue Jun 19 18:11:24 2018: (16) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec60 [0x7681ec60]
[E] Tue Jun 19 18:11:24 2018: (17) /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 : +0x9c9dc [0x76ec59dc]
[E] Tue Jun 19 18:11:24 2018: Background worker threw error: Normalizer: Not normalized instruction found: <2 x f16> %call3 = <2 x f16> madDhS_S_(<2 x f16> %splat.splat, <2 x f16> %vecinit1, <2 x f16> %splat.splat)
[E] Tue Jun 19 18:11:24 2018: While running worker task: Normalization
[E] Tue Jun 19 18:11:24 2018: (1) /usr/local/lib/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xf4 [0x7682e5d0]
[E] Tue Jun 19 18:11:24 2018: (2) /usr/local/lib/libVC4CC.so.1.2 : +0x677344 [0x76a57344]
[E] Tue Jun 19 18:11:24 2018: (3) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&), void (*)(vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&)>::_M_invoke(std::_Any_data const&, vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker&&, vc4c::Configuration const&)+0x70 [0x76a5c390]
[E] Tue Jun 19 18:11:24 2018: (4) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&)>::operator()(vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&) const+0x94 [0x76a5bd00]
[E] Tue Jun 19 18:11:24 2018: (5) /usr/local/lib/libVC4CC.so.1.2 : +0x677558 [0x76a57558]
[E] Tue Jun 19 18:11:24 2018: (6) /usr/local/lib/libVC4CC.so.1.2 : vc4c::normalization::Normalizer::normalizeMethod(vc4c::Module&, vc4c::Method&) const+0x38c [0x76a5823c]
[E] Tue Jun 19 18:11:24 2018: (7) /usr/local/lib/libVC4CC.so.1.2 : +0x6775f8 [0x76a575f8]
[E] Tue Jun 19 18:11:24 2018: (8) /usr/local/lib/libVC4CC.so.1.2 : +0x678f74 [0x76a58f74]
[E] Tue Jun 19 18:11:24 2018: (9) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Method* const&)>::operator()(vc4c::Method* const&) const+0x54 [0x76834dec]
[E] Tue Jun 19 18:11:24 2018: (10) /usr/local/lib/libVC4CC.so.1.2 : void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}::operator()() const+0xc8 [0x768334ec]
[E] Tue Jun 19 18:11:24 2018: (11) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (), void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}>::_M_invoke(std::_Any_data const&)+0x24 [0x76838380]
[E] Tue Jun 19 18:11:24 2018: (12) /usr/local/lib/libVC4CC.so.1.2 : std::function<void ()>::operator()() const+0x40 [0x7681f2a4]
[E] Tue Jun 19 18:11:24 2018: (13) /usr/local/lib/libVC4CC.so.1.2 : +0x43e430 [0x7681e430]
[E] Tue Jun 19 18:11:24 2018: (14) /usr/local/lib/libVC4CC.so.1.2 : +0x43ed1c [0x7681ed1c]
[E] Tue Jun 19 18:11:24 2018: (15) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec88 [0x7681ec88]
[E] Tue Jun 19 18:11:24 2018: (16) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec60 [0x7681ec60]
[E] Tue Jun 19 18:11:24 2018: (17) /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 : +0x9c9dc [0x76ec59dc]
[E] Tue Jun 19 18:11:24 2018: Background worker threw error: Normalizer: Not normalized instruction found: <4 x f16> %call7 = <4 x f16> madDhS_S_(<4 x f16> %splat.splat, <4 x f16> %vecinit5, <4 x f16> %splat.splat)
[E] Tue Jun 19 18:11:24 2018: While running worker task: Normalization
[E] Tue Jun 19 18:11:24 2018: (1) /usr/local/lib/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xf4 [0x7682e5d0]
[E] Tue Jun 19 18:11:24 2018: (2) /usr/local/lib/libVC4CC.so.1.2 : +0x6cd3a0 [0x76aad3a0]
[E] Tue Jun 19 18:11:24 2018: (3) /usr/local/lib/libVC4CC.so.1.2 : vc4c::periphery::VPMArea::toWriteSetup(vc4c::DataType const&) const+0x154 [0x76ab1af4]
[E] Tue Jun 19 18:11:24 2018: (4) /usr/local/lib/libVC4CC.so.1.2 : vc4c::periphery::VPM::insertWriteVPM(vc4c::Method&, vc4c::InstructionWalker, vc4c::Value const&, vc4c::periphery::VPMArea const*, bool, vc4c::Value const&)+0xd4 [0x76aafeac]
[E] Tue Jun 19 18:11:24 2018: (5) /usr/local/lib/libVC4CC.so.1.2 : vc4c::periphery::insertWriteDMA(vc4c::Method&, vc4c::InstructionWalker, vc4c::Value const&, vc4c::Value const&, bool)+0xb0 [0x76aaf0b4]
[E] Tue Jun 19 18:11:24 2018: (6) /usr/local/lib/libVC4CC.so.1.2 : +0x669020 [0x76a49020]
[E] Tue Jun 19 18:11:24 2018: (7) /usr/local/lib/libVC4CC.so.1.2 : +0x6696a0 [0x76a496a0]
[E] Tue Jun 19 18:11:24 2018: (8) /usr/local/lib/libVC4CC.so.1.2 : vc4c::normalization::mapMemoryAccess(vc4c::Module const&, vc4c::Method&, vc4c::Configuration const&)+0x3b4 [0x76a4acb0]
[E] Tue Jun 19 18:11:24 2018: (9) /usr/local/lib/libVC4CC.so.1.2 : vc4c::normalization::Normalizer::normalizeMethod(vc4c::Module&, vc4c::Method&) const+0x22c [0x76a580dc]
[E] Tue Jun 19 18:11:24 2018: (10) /usr/local/lib/libVC4CC.so.1.2 : +0x6775f8 [0x76a575f8]
[E] Tue Jun 19 18:11:24 2018: (11) /usr/local/lib/libVC4CC.so.1.2 : +0x678f74 [0x76a58f74]
[E] Tue Jun 19 18:11:24 2018: (12) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Method* const&)>::operator()(vc4c::Method* const&) const+0x54 [0x76834dec]
[E] Tue Jun 19 18:11:24 2018: (13) /usr/local/lib/libVC4CC.so.1.2 : void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}::operator()() const+0xc8 [0x768334ec]
[E] Tue Jun 19 18:11:24 2018: (14) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (), void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}>::_M_invoke(std::_Any_data const&)+0x24 [0x76838380]
[E] Tue Jun 19 18:11:24 2018: (15) /usr/local/lib/libVC4CC.so.1.2 : std::function<void ()>::operator()() const+0x40 [0x7681f2a4]
[E] Tue Jun 19 18:11:24 2018: (16) /usr/local/lib/libVC4CC.so.1.2 : +0x43e430 [0x7681e430]
[E] Tue Jun 19 18:11:24 2018: (17) /usr/local/lib/libVC4CC.so.1.2 : +0x43ed1c [0x7681ed1c]
[E] Tue Jun 19 18:11:24 2018: (18) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec88 [0x7681ec88]
[E] Tue Jun 19 18:11:24 2018: (19) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec60 [0x7681ec60]
[E] Tue Jun 19 18:11:24 2018: (20) /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 : +0x9c9dc [0x76ec59dc]
[E] Tue Jun 19 18:11:24 2018: (1) /usr/local/lib/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xf4 [0x7682e5d0]
[E] Tue Jun 19 18:11:24 2018: (2) /usr/local/lib/libVC4CC.so.1.2 : +0x6cd3a0 [0x76aad3a0]
[E] Tue Jun 19 18:11:24 2018: (3) /usr/local/lib/libVC4CC.so.1.2 : vc4c::periphery::VPMArea::toWriteSetup(vc4c::DataType const&) const+0x154 [0x76ab1af4]
[E] Tue Jun 19 18:11:24 2018: (4) /usr/local/lib/libVC4CC.so.1.2 : vc4c::periphery::VPM::insertWriteVPM(vc4c::Method&, vc4c::InstructionWalker, vc4c::Value const&, vc4c::periphery::VPMArea const*, bool, vc4c::Value const&)+0xd4 [0x76aafeac]
[E] Tue Jun 19 18:11:24 2018: (5) /usr/local/lib/libVC4CC.so.1.2 : vc4c::periphery::insertWriteDMA(vc4c::Method&, vc4c::InstructionWalker, vc4c::Value const&, vc4c::Value const&, bool)+0xb0 [0x76aaf0b4]
[E] Tue Jun 19 18:11:24 2018: (6) /usr/local/lib/libVC4CC.so.1.2 : +0x669020 [0x76a49020]
[E] Tue Jun 19 18:11:24 2018: (7) /usr/local/lib/libVC4CC.so.1.2 : +0x6696a0 [0x76a496a0]
[E] Tue Jun 19 18:11:24 2018: (8) /usr/local/lib/libVC4CC.so.1.2 : vc4c::normalization::mapMemoryAccess(vc4c::Module const&, vc4c::Method&, vc4c::Configuration const&)+0x3b4 [0x76a4acb0]
[E] Tue Jun 19 18:11:24 2018: (9) /usr/local/lib/libVC4CC.so.1.2 : vc4c::normalization::Normalizer::normalizeMethod(vc4c::Module&, vc4c::Method&) const+0x22c [0x76a580dc]
[E] Tue Jun 19 18:11:24 2018: (10) /usr/local/lib/libVC4CC.so.1.2 : +0x6775f8 [0x76a575f8]
[E] Tue Jun 19 18:11:24 2018: (11) /usr/local/lib/libVC4CC.so.1.2 : +0x678f74 [0x76a58f74]
[E] Tue Jun 19 18:11:24 2018: (12) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Method* const&)>::operator()(vc4c::Method* const&) const+0x54 [0x76834dec]
[E] Tue Jun 19 18:11:24 2018: (13) /usr/local/lib/libVC4CC.so.1.2 : void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}::operator()() const+0xc8 [0x768334ec]
[E] Tue Jun 19 18:11:24 2018: (14) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (), void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}>::_M_invoke(std::_Any_data const&)+0x24 [0x76838380]
[E] Tue Jun 19 18:11:24 2018: (15) /usr/local/lib/libVC4CC.so.1.2 : std::function<void ()>::operator()() const+0x40 [0x7681f2a4]
[E] Tue Jun 19 18:11:24 2018: (16) /usr/local/lib/libVC4CC.so.1.2 : +0x43e430 [0x7681e430]
[E] Tue Jun 19 18:11:24 2018: (17) /usr/local/lib/libVC4CC.so.1.2 : +0x43ed1c [0x7681ed1c]
[E] Tue Jun 19 18:11:24 2018: (18) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec88 [0x7681ec88]
[E] Tue Jun 19 18:11:24 2018: (19) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec60 [0x7681ec60]
[E] Tue Jun 19 18:11:24 2018: (20) /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 : +0x9c9dc [0x76ec59dc]
[E] Tue Jun 19 18:11:24 2018: (1) /usr/local/lib/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xf4 [0x7682e5d0]
[E] Tue Jun 19 18:11:24 2018: (2) /usr/local/lib/libVC4CC.so.1.2 : +0x6cd3a0 [0x76aad3a0]
[E] Tue Jun 19 18:11:24 2018: (3) /usr/local/lib/libVC4CC.so.1.2 : vc4c::periphery::VPMArea::toWriteSetup(vc4c::DataType const&) const+0x154 [0x76ab1af4]
[E] Tue Jun 19 18:11:24 2018: (4) /usr/local/lib/libVC4CC.so.1.2 : vc4c::periphery::VPM::insertWriteVPM(vc4c::Method&, vc4c::InstructionWalker, vc4c::Value const&, vc4c::periphery::VPMArea const*, bool, vc4c::Value const&)+0xd4 [0x76aafeac]
[E] Tue Jun 19 18:11:24 2018: (5) /usr/local/lib/libVC4CC.so.1.2 : vc4c::periphery::insertWriteDMA(vc4c::Method&, vc4c::InstructionWalker, vc4c::Value const&, vc4c::Value const&, bool)+0xb0 [0x76aaf0b4]
[E] Tue Jun 19 18:11:24 2018: (6) /usr/local/lib/libVC4CC.so.1.2 : +0x669020 [0x76a49020]
[E] Tue Jun 19 18:11:24 2018: (7) /usr/local/lib/libVC4CC.so.1.2 : +0x6696a0 [0x76a496a0]
[E] Tue Jun 19 18:11:24 2018: (8) /usr/local/lib/libVC4CC.so.1.2 : vc4c::normalization::mapMemoryAccess(vc4c::Module const&, vc4c::Method&, vc4c::Configuration const&)+0x3b4 [0x76a4acb0]
[E] Tue Jun 19 18:11:24 2018: (9) /usr/local/lib/libVC4CC.so.1.2 : vc4c::normalization::Normalizer::normalizeMethod(vc4c::Module&, vc4c::Method&) const+0x22c [0x76a580dc]
[E] Tue Jun 19 18:11:24 2018: (10) /usr/local/lib/libVC4CC.so.1.2 : +0x6775f8 [0x76a575f8]
[E] Tue Jun 19 18:11:24 2018: (11) /usr/local/lib/libVC4CC.so.1.2 : +0x678f74 [0x76a58f74]
[E] Tue Jun 19 18:11:24 2018: (12) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Method* const&)>::operator()(vc4c::Method* const&) const+0x54 [0x76834dec]
[E] Tue Jun 19 18:11:24 2018: (13) /usr/local/lib/libVC4CC.so.1.2 : void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}::operator()() const+0xc8 [0x768334ec]
[E] Tue Jun 19 18:11:24 2018: (14) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (), void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}>::_M_invoke(std::_Any_data const&)+0x24 [0x76838380]
[E] Tue Jun 19 18:11:24 2018: (15) /usr/local/lib/libVC4CC.so.1.2 : std::function<void ()>::operator()() const+0x40 [0x7681f2a4]
[E] Tue Jun 19 18:11:24 2018: (16) /usr/local/lib/libVC4CC.so.1.2 : +0x43e430 [0x7681e430]
[E] Tue Jun 19 18:11:24 2018: (17) /usr/local/lib/libVC4CC.so.1.2 : +0x43ed1c [0x7681ed1c]
[E] Tue Jun 19 18:11:24 2018: (18) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec88 [0x7681ec88]
[E] Tue Jun 19 18:11:24 2018: (19) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec60 [0x7681ec60]
[E] Tue Jun 19 18:11:24 2018: (20) /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 : +0x9c9dc [0x76ec59dc]
[E] Tue Jun 19 18:11:24 2018: Background worker threw error: Normalizer: Not normalized instruction found: <8 x f16> %call15 = <8 x f16> madDhS_S_(<8 x f16> %splat.splat, <8 x f16> %vecinit13, <8 x f16> %splat.splat)
[E] Tue Jun 19 18:11:24 2018: While running worker task: Normalization
[E] Tue Jun 19 18:11:24 2018: (1) /usr/local/lib/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xf4 [0x7682e5d0]
[E] Tue Jun 19 18:11:24 2018: (2) /usr/local/lib/libVC4CC.so.1.2 : +0x677344 [0x76a57344]
[E] Tue Jun 19 18:11:24 2018: (3) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&), void (*)(vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&)>::_M_invoke(std::_Any_data const&, vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker&&, vc4c::Configuration const&)+0x70 [0x76a5c390]
[E] Tue Jun 19 18:11:24 2018: (4) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&)>::operator()(vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&) const+0x94 [0x76a5bd00]
[E] Tue Jun 19 18:11:24 2018: Background worker threw error: General: Invalid parameter type-size: 64
[E] Tue Jun 19 18:11:24 2018: Background worker threw error: General: Invalid parameter type-size: 64
[E] Tue Jun 19 18:11:24 2018: While running worker task: Normalization
[E] Tue Jun 19 18:11:24 2018: Background worker threw error: General: Invalid parameter type-size: 64
[E] Tue Jun 19 18:11:24 2018: While running worker task: Normalization
[E] Tue Jun 19 18:11:24 2018: (5) /usr/local/lib/libVC4CC.so.1.2 : +0x677558 [0x76a57558]
[E] Tue Jun 19 18:11:24 2018: While running worker task: Normalization
[E] Tue Jun 19 18:11:24 2018: Background worker threw error: General: Invalid parameter type-size: 64
[E] Tue Jun 19 18:11:24 2018: While running worker task: Normalization
[E] Tue Jun 19 18:11:24 2018: (6) /usr/local/lib/libVC4CC.so.1.2 : vc4c::normalization::Normalizer::normalizeMethod(vc4c::Module&, vc4c::Method&) const+0x38c [0x76a5823c]
[E] Tue Jun 19 18:11:24 2018: (7) /usr/local/lib/libVC4CC.so.1.2 : +0x6775f8 [0x76a575f8]
[E] Tue Jun 19 18:11:24 2018: (8) /usr/local/lib/libVC4CC.so.1.2 : +0x678f74 [0x76a58f74]
[E] Tue Jun 19 18:11:24 2018: (9) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Method* const&)>::operator()(vc4c::Method* const&) const+0x54 [0x76834dec]
[E] Tue Jun 19 18:11:24 2018: (10) /usr/local/lib/libVC4CC.so.1.2 : void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}::operator()() const+0xc8 [0x768334ec]
[E] Tue Jun 19 18:11:24 2018: (11) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (), void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}>::_M_invoke(std::_Any_data const&)+0x24 [0x76838380]
[E] Tue Jun 19 18:11:24 2018: (12) /usr/local/lib/libVC4CC.so.1.2 : std::function<void ()>::operator()() const+0x40 [0x7681f2a4]
[E] Tue Jun 19 18:11:24 2018: (13) /usr/local/lib/libVC4CC.so.1.2 : +0x43e430 [0x7681e430]
[E] Tue Jun 19 18:11:24 2018: (14) /usr/local/lib/libVC4CC.so.1.2 : +0x43ed1c [0x7681ed1c]
[E] Tue Jun 19 18:11:24 2018: (15) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec88 [0x7681ec88]
[E] Tue Jun 19 18:11:24 2018: (16) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec60 [0x7681ec60]
[E] Tue Jun 19 18:11:24 2018: (17) /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 : +0x9c9dc [0x76ec59dc]
[E] Tue Jun 19 18:11:24 2018: (1) /usr/local/lib/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xf4 [0x7682e5d0]
[E] Tue Jun 19 18:11:24 2018: (2) /usr/local/lib/libVC4CC.so.1.2 : +0x6cd3a0 [0x76aad3a0]
[E] Tue Jun 19 18:11:24 2018: (3) /usr/local/lib/libVC4CC.so.1.2 : vc4c::periphery::VPMArea::toWriteSetup(vc4c::DataType const&) const+0x154 [0x76ab1af4]
[E] Tue Jun 19 18:11:24 2018: (4) /usr/local/lib/libVC4CC.so.1.2 : vc4c::periphery::VPM::insertWriteVPM(vc4c::Method&, vc4c::InstructionWalker, vc4c::Value const&, vc4c::periphery::VPMArea const*, bool, vc4c::Value const&)+0xd4 [0x76aafeac]
[E] Tue Jun 19 18:11:24 2018: (5) /usr/local/lib/libVC4CC.so.1.2 : vc4c::periphery::insertWriteDMA(vc4c::Method&, vc4c::InstructionWalker, vc4c::Value const&, vc4c::Value const&, bool)+0xb0 [0x76aaf0b4]
[E] Tue Jun 19 18:11:24 2018: (6) /usr/local/lib/libVC4CC.so.1.2 : +0x669020 [0x76a49020]
[E] Tue Jun 19 18:11:24 2018: (7) /usr/local/lib/libVC4CC.so.1.2 : +0x6696a0 [0x76a496a0]
[E] Tue Jun 19 18:11:24 2018: (8) /usr/local/lib/libVC4CC.so.1.2 : vc4c::normalization::mapMemoryAccess(vc4c::Module const&, vc4c::Method&, vc4c::Configuration const&)+0x3b4 [0x76a4acb0]
[E] Tue Jun 19 18:11:24 2018: (9) /usr/local/lib/libVC4CC.so.1.2 : vc4c::normalization::Normalizer::normalizeMethod(vc4c::Module&, vc4c::Method&) const+0x22c [0x76a580dc]
[E] Tue Jun 19 18:11:24 2018: (10) /usr/local/lib/libVC4CC.so.1.2 : +0x6775f8 [0x76a575f8]
[E] Tue Jun 19 18:11:24 2018: (11) /usr/local/lib/libVC4CC.so.1.2 : +0x678f74 [0x76a58f74]
[E] Tue Jun 19 18:11:24 2018: (12) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Method* const&)>::operator()(vc4c::Method* const&) const+0x54 [0x76834dec]
[E] Tue Jun 19 18:11:24 2018: (13) /usr/local/lib/libVC4CC.so.1.2 : void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}::operator()() const+0xc8 [0x768334ec]
[E] Tue Jun 19 18:11:24 2018: (14) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (), void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}>::_M_invoke(std::_Any_data const&)+0x24 [0x76838380]
[E] Tue Jun 19 18:11:24 2018: (15) /usr/local/lib/libVC4CC.so.1.2 : std::function<void ()>::operator()() const+0x40 [0x7681f2a4]
[E] Tue Jun 19 18:11:24 2018: (16) /usr/local/lib/libVC4CC.so.1.2 : +0x43e430 [0x7681e430]
[E] Tue Jun 19 18:11:24 2018: (17) /usr/local/lib/libVC4CC.so.1.2 : +0x43ed1c [0x7681ed1c]
[E] Tue Jun 19 18:11:24 2018: (18) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec88 [0x7681ec88]
[E] Tue Jun 19 18:11:24 2018: (19) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec60 [0x7681ec60]
[E] Tue Jun 19 18:11:24 2018: (20) /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 : +0x9c9dc [0x76ec59dc]
[E] Tue Jun 19 18:11:24 2018: Background worker threw error: General: Invalid parameter type-size: 64
[E] Tue Jun 19 18:11:24 2018: While running worker task: Normalization
[E] Tue Jun 19 18:11:24 2018: Background worker threw error: Normalizer: Not normalized instruction found: f16 %call2 = f16 madDhDhDh(f16 %conv1, f16 %conv, f16 %conv1)
[E] Tue Jun 19 18:11:24 2018: While running worker task: Normalization
[E] Tue Jun 19 18:11:24 2018: (1) /usr/local/lib/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xf4 [0x7682e5d0]
[E] Tue Jun 19 18:11:24 2018: (2) /usr/local/lib/libVC4CC.so.1.2 : +0x677344 [0x76a57344]
[E] Tue Jun 19 18:11:24 2018: (3) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&), void (*)(vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&)>::_M_invoke(std::_Any_data const&, vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker&&, vc4c::Configuration const&)+0x70 [0x76a5c390]
[E] Tue Jun 19 18:11:24 2018: (4) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&)>::operator()(vc4c::Module&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&) const+0x94 [0x76a5bd00]
[E] Tue Jun 19 18:11:24 2018: (5) /usr/local/lib/libVC4CC.so.1.2 : +0x677558 [0x76a57558]
[E] Tue Jun 19 18:11:24 2018: (6) /usr/local/lib/libVC4CC.so.1.2 : vc4c::normalization::Normalizer::normalizeMethod(vc4c::Module&, vc4c::Method&) const+0x38c [0x76a5823c]
[E] Tue Jun 19 18:11:24 2018: (7) /usr/local/lib/libVC4CC.so.1.2 : +0x6775f8 [0x76a575f8]
[E] Tue Jun 19 18:11:24 2018: (8) /usr/local/lib/libVC4CC.so.1.2 : +0x678f74 [0x76a58f74]
[E] Tue Jun 19 18:11:24 2018: (9) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Method* const&)>::operator()(vc4c::Method* const&) const+0x54 [0x76834dec]
[E] Tue Jun 19 18:11:24 2018: (10) /usr/local/lib/libVC4CC.so.1.2 : void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}::operator()() const+0xc8 [0x768334ec]
[E] Tue Jun 19 18:11:24 2018: (11) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (), void vc4c::BackgroundWorker::scheduleAll<vc4c::Method*, std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > >(std::vector<vc4c::Method*, std::allocator<vc4c::Method*> > const&, std::function<void (vc4c::Method* const&)> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda()#1}>::_M_invoke(std::_Any_data const&)+0x24 [0x76838380]
[E] Tue Jun 19 18:11:24 2018: (12) /usr/local/lib/libVC4CC.so.1.2 : std::function<void ()>::operator()() const+0x40 [0x7681f2a4]
[E] Tue Jun 19 18:11:24 2018: (13) /usr/local/lib/libVC4CC.so.1.2 : +0x43e430 [0x7681e430]
[E] Tue Jun 19 18:11:24 2018: (14) /usr/local/lib/libVC4CC.so.1.2 : +0x43ed1c [0x7681ed1c]
[E] Tue Jun 19 18:11:24 2018: (15) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec88 [0x7681ec88]
[E] Tue Jun 19 18:11:24 2018: (16) /usr/local/lib/libVC4CC.so.1.2 : +0x43ec60 [0x7681ec60]
[E] Tue Jun 19 18:11:24 2018: (17) /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 : +0x9c9dc [0x76ec59dc]
[E] Tue Jun 19 18:11:24 2018: Background worker threw error: Normalizer: Not normalized instruction found: <16 x f16> %call31 = <16 x f16> madDhS_S_(<16 x f16> %splat.splat, <16 x f16> %vecinit29, <16 x f16> %splat.splat)
[E] Tue Jun 19 18:11:24 2018: While running worker task: Normalization
[E] Tue Jun 19 18:11:34 2018: Compiler threw exception: Normalizer: Not normalized instruction found: <8 x f16> %call15 = <8 x f16> madDhS_S_(<8 x f16> %splat.splat, <8 x f16> %vecinit13, <8 x f16> %splat.splat)
I am having trouble with the following errors when compiling the test cases.
What I did is:
Any suggestions?
[ 6%] Building CXX object CMakeFiles/TestVC4CL.dir/test.o
/usr/bin/c++ -DTEST_OUTPUT_CONSOLE=1 -I/home/pi/work/VC4CL/test/.. -I/home/pi/work/VC4CL/test/../include -I/home/pi/work/VC4CL/test/lib/cpptest-lite/include -o CMakeFiles/TestVC4CL.dir/test.o -c /home/pi/work/VC4CL/test/test.cpp
In file included from /home/pi/work/VC4CL/test/../src/extensions.h:10:0,
from /home/pi/work/VC4CL/test/TestExtension.h:11,
from /home/pi/work/VC4CL/test/test.cpp:20:
/home/pi/work/VC4CL/test/../src/common.h:31:106: warning: ignoring attributes on template argument ‘cl_int {aka int}’ [-Wignored-attributes]
CHECK_RETURN typename std::enable_if<std::is_arithmetic<T>::value | std::is_pointer<T>::value, cl_int>::type
^
In file included from /home/pi/work/VC4CL/test/../src/types.h:12:0,
from /home/pi/work/VC4CL/test/../src/common.h:15,
from /home/pi/work/VC4CL/test/../src/extensions.h:10,
from /home/pi/work/VC4CL/test/TestExtension.h:11,
from /home/pi/work/VC4CL/test/test.cpp:20:
/home/pi/work/VC4CL/test/test.cpp: In function ‘int main(int, char**)’:
/home/pi/work/VC4CL/test/test.cpp:45:34: error: ‘struct _cl_context’ has no member named ‘dispatch’
assert(offsetof(_cl_context, dispatch) == 0);
^
CMakeFiles/TestVC4CL.dir/build.make:374: recipe for target 'CMakeFiles/TestVC4CL.dir/test.o' failed
The OpenCL kernel I am trying to compile is mcxcl; the code download URL and test commands can be found in Issue #19.
There were two previous errors: a precompilation error due to the missing CLANG_PATH (fixed in doe300/VC4C#34) and a compilation error due to the use of 64-bit data structures (see #19 (comment)).
To avoid using the 64-bit data types (ulong and double), one can append the -J "-DUSE_LL5_RAND" flag on the command line. However, I am now encountering a new error in the Optimization stage; the error log is listed below:
root@raspberrypi:/home/pi/temp/mcxcl/example/benchmark# ./run_benchmark1.sh -n 1e5 -J "-DUSE_LL5_RAND"
...
Building kernel with option: -cl-mad-enable -DMCX_USE_NATIVE -DMCX_SIMPLIFY_BRANCH -DMCX_VECTOR_INDEX -DMCX_SRC_PENCIL -DUSE_LL5_RAND
Kernel build log:
[W] Fri Feb 23 16:27:23 2018: Warnings in precompilation:
[W] Fri Feb 23 16:27:23 2018: <stdin>:1039:1: warning: null character ignored
<U+0000>
^
1 warning generated.
[E] Fri Feb 23 16:27:41 2018: (1) /usr/local/lib/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xf4 [0xb662977c]
[E] Fri Feb 23 16:27:41 2018: (2) /usr/local/lib/libVC4CC.so.1.2 : +0x520e78 [0xb685de78]
[E] Fri Feb 23 16:27:41 2018: (3) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<vc4c::InstructionWalker (vc4c::Module const&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&), vc4c::InstructionWalker (*)(vc4c::Module const&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&)>::_M_invoke(std::_Any_data const&, vc4c::Module const&, vc4c::Method&, vc4c::InstructionWalker&&, vc4c::Configuration const&)+0x7c [0xb6865048]
[E] Fri Feb 23 16:27:41 2018: (4) /usr/local/lib/libVC4CC.so.1.2 : std::function<vc4c::InstructionWalker (vc4c::Module const&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&)>::operator()(vc4c::Module const&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&) const+0x94 [0xb6864624]
[E] Fri Feb 23 16:27:41 2018: (5) /usr/local/lib/libVC4CC.so.1.2 : vc4c::optimizations::OptimizationStep::operator()(vc4c::Module const&, vc4c::Method&, vc4c::InstructionWalker, vc4c::Configuration const&) const+0x54 [0xb685dd58]
[E] Fri Feb 23 16:27:41 2018: (6) /usr/local/lib/libVC4CC.so.1.2 : +0x521158 [0xb685e158]
[E] Fri Feb 23 16:27:41 2018: (7) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (vc4c::Module const&, vc4c::Method&, vc4c::Configuration const&), void (*)(vc4c::Module const&, vc4c::Method&, vc4c::Configuration const&)>::_M_invoke(std::_Any_data const&, vc4c::Module const&, vc4c::Method&, vc4c::Configuration const&)+0x60 [0xb68654a8]
[E] Fri Feb 23 16:27:41 2018: (8) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::Module const&, vc4c::Method&, vc4c::Configuration const&)>::operator()(vc4c::Module const&, vc4c::Method&, vc4c::Configuration const&) const+0x78 [0xb68644cc]
[E] Fri Feb 23 16:27:41 2018: (9) /usr/local/lib/libVC4CC.so.1.2 : vc4c::optimizations::OptimizationPass::operator()(vc4c::Module const&, vc4c::Method&, vc4c::Configuration const&) const+0x34 [0xb685dbe0]
[E] Fri Feb 23 16:27:41 2018: (10) /usr/local/lib/libVC4CC.so.1.2 : +0x5215a8 [0xb685e5a8]
[E] Fri Feb 23 16:27:41 2018: (11) /usr/local/lib/libVC4CC.so.1.2 : +0x5218c4 [0xb685e8c4]
[E] Fri Feb 23 16:27:41 2018: (12) /usr/local/lib/libVC4CC.so.1.2 : +0x5228ec [0xb685f8ec]
[E] Fri Feb 23 16:27:41 2018: (13) /usr/local/lib/libVC4CC.so.1.2 : std::function<void ()>::operator()() const+0x40 [0xb662f1b4]
[E] Fri Feb 23 16:27:41 2018: (14) /usr/local/lib/libVC4CC.so.1.2 : threading::BackgroundWorker::operator()()::{lambda()#1}::operator()() const+0x58 [0xb662e3d8]
[E] Fri Feb 23 16:27:41 2018: (15) /usr/local/lib/libVC4CC.so.1.2 : void std::_Bind_simple<threading::BackgroundWorker::operator()()::{lambda()#1} ()>::_M_invoke<>(std::_Index_tuple<>)+0x38 [0xb663617c]
[E] Fri Feb 23 16:27:41 2018: (16) /usr/local/lib/libVC4CC.so.1.2 : std::_Bind_simple<threading::BackgroundWorker::operator()()::{lambda()#1} ()>::operator()()+0x1c [0xb66360e8]
[E] Fri Feb 23 16:27:41 2018: (17) /usr/local/lib/libVC4CC.so.1.2 : std::thread::_State_impl<std::_Bind_simple<threading::BackgroundWorker::operator()()::{lambda()#1} ()> >::_M_run()+0x20 [0xb6636054]
[E] Fri Feb 23 16:27:41 2018: (18) /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 : +0x9c9dc [0xb6df99dc]
[E] Fri Feb 23 16:27:41 2018: Background worker threw error: Optimizer: There should be no more function-calls: float %call.i209.%call27.i = float convert_float_rtef(float %call.i209.%tmp.110)
[E] Fri Feb 23 16:27:43 2018: Compiler threw exception: Optimizer: There should be no more function-calls: float %call.i209.%call27.i = float convert_float_rtef(float %call.i209.%tmp.110)
MCX ERROR(2):Error: Failed to build program executable! in unit mcx_host.cpp:475
terminate called without an active exception
Aborted
It is not clear to me whether this error is still caused by the accidental involvement of double-precision data. The error was raised in the optimizer, at the call to the convert_float_rtef function.
@doe300, do you have any comment on what might be wrong? Thanks.
When executed via the mailbox interface, the clpeak --compute-sp benchmark freezes (the kernel?) in the second execution. The --global-bandwidth benchmark runs fine, as do both benchmarks when run via register-poking.
More exact error description:
The VC4CL Queue Handler thread moves into status D, which means uninterruptible sleep (usually IO), which looks like it hangs in the mailbox syscall.
VC4CL Queue Handler thread, even sudo reboot does nothing.
Side note: When this error is fixed, re-examine the delay in src/executor.cpp:108, since at least the --global-bandwidth benchmark seems not to require it (anymore).
When building VC4CL, I get:
Linking CXX executable v3d_info
../libVC4CL.so.0.4: undefined reference to `bcm_host_get_peripheral_address'
../libVC4CL.so.0.4: undefined reference to `bcm_host_init'
../libVC4CL.so.0.4: undefined reference to `bcm_host_deinit'
Linking succeeds if I modify tools/CMakeLists.txt, adding libbcm_host to the libraries for v3d_info as follows:
target_link_libraries(v3d_info VC4CL ${BCMHOST_LIBRARY})
A Hello-World example program would be helpful for those struggling to get to grips with it (and OpenCL in general).
I have successfully compiled and installed VC4C, VC4CL and confirmed that both are working; when I do clinfo I get OpenCL information on my GPU and
[VC4CL] can't open /dev/mem
[VC4CL] This program should be run as root. Try prefixing command with: sudo
terminate called after throwing an instance of 'std::system_error'
what(): Failed to open /dev/mem: Operation not permitted
aborted
which is expected, as running any VC4CL implementation requires root access. However, when I run "sudo clinfo", I get Number of platforms 0 as output.
Prior to the above, I created the file /etc/OpenCL/vendors/VC4CL.icd, which contains one line: /usr/local/lib/libVC4CL.so (I also tried libVC4CL.so as the one line, but to no avail), as per the instructions.
Please help me, thanks.
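One thing worth ruling out when the loader reports zero platforms is that the library path listed in the .icd file does not resolve for the user actually running the command. The sketch below is only a diagnostic aid and not part of VC4CL: the helper names `icd_libraries` and `missing_libraries` are made up for illustration, and it assumes the ICD loader reads one library path per line from files such as /etc/OpenCL/vendors/VC4CL.icd.

```python
import os


def icd_libraries(icd_file):
    """Return the non-empty lines of an .icd file (one library per line)."""
    with open(icd_file) as handle:
        return [line.strip() for line in handle if line.strip()]


def missing_libraries(icd_file):
    """List absolute library paths named in the .icd file that do not exist.

    Bare names such as "libVC4CL.so" are skipped, because they are resolved
    through the dynamic linker search path rather than a fixed location.
    """
    return [
        lib
        for lib in icd_libraries(icd_file)
        if os.path.isabs(lib) and not os.path.exists(lib)
    ]


if __name__ == "__main__":
    # Path taken from the report above; adjust if the file lives elsewhere.
    icd = "/etc/OpenCL/vendors/VC4CL.icd"
    if os.path.exists(icd):
        print("listed:", icd_libraries(icd))
        print("missing:", missing_libraries(icd))
    else:
        print("no such ICD file:", icd)
```

Running it once as a normal user and once under sudo shows whether both see the same file and library. If the library is present in both cases, a further candidate is the environment that sudo resets (for example LD_LIBRARY_PATH); that is an assumption to verify, not a confirmed cause.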
Has anyone been able to run OpenCV on the Pi GPU using OpenCL? Is there an example somewhere demonstrating accessing the GPU with OpenCV using OpenCL?
Hi.
Thank you for creating VC4CL. I am very eager to implement it on several machines.
On a RPi, I have compiled and installed VC4C and VC4CL including tools, but not test.
I do not yet have VC4CL functional. Details of three problems follow.
(1) In VC4C, ./build/test/TestVC4C gave an enormous number of errors, page after page. I issued this command, as root, after cmake, make, and make install of both VC4CLStdLib and VC4C (but not VC4CL). After about 45min, and two and a half pages of
"64-bit operations are not supported by the VideoCore IV architecture, further compilation may fail!"
the program stopped. Another shell showed cpu usage at 100%, so after 15min, I killed the program.
Is this behavior normal?
(2) When I issue "cmake" in the directory named "test" I see,
cmake ../test
-- The C compiler identification is GNU 6.3.0
-- The CXX compiler identification is GNU 6.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Error at CMakeLists.txt:10 (ExternalProject_Get_Property):
Unknown CMake command "ExternalProject_Get_Property".
CMake Warning (dev) in CMakeLists.txt:
No cmake_minimum_required command is present. A line of code such as
cmake_minimum_required(VERSION 3.7)
should be added at the top of the file. The version specified may be lower
if you wish to support older CMake versions for this project. For more
information run "cmake --help-policy CMP0000".
This warning is for project developers. Use -Wno-dev to suppress it.
-- Configuring incomplete, errors occurred!
See also "/home/VC4C/VC4CL/test/CMakeFiles/CMakeOutput.log".
(3) clinfo shows Device 0 when mesa-opencl-icd is installed, which I presume means it is superfluous. Please correct me if I am wrong. I removed mesa-opencl-icd and Device 0 disappeared.
I want to do a comparison of efficacy between the GPU's and the CPU's use of OpenCL. How may I implement OpenCL support for the RPi's CPU?
The full output of clinfo is below. Note that I had to use the LD_LIBRARY_PATH=/usr/local/lib/ prefix for clinfo to work.
Number of platforms 1
Platform Name OpenCL for the Raspberry Pi VideoCore IV GPU
Platform Vendor doe300
Platform Version OpenCL 1.2 VC4CL 0.4
Platform Profile EMBEDDED_PROFILE
Platform Extensions cl_khr_il_program cl_khr_spir cl_altera_device_temperature cl_altera_live_object_tracking cl_khr_icd cl_vc4cl_performance_counters
Platform Extensions function suffix VC4CL
Platform Name OpenCL for the Raspberry Pi VideoCore IV GPU
Number of devices 1
Device Name VideoCore IV GPU
Device Vendor Broadcom
Device Vendor ID 0xa5c
Device Version OpenCL 1.2 VC4CL 0.4
Driver Version 0.4
Device OpenCL C Version OpenCL C 1.2
Device Type GPU
Device Profile EMBEDDED_PROFILE
Max compute units 1
Max clock frequency 300MHz
Device Partition (core)
Max number of sub-devices 0
Supported partition types None
Max work item dimensions 3
Max work item sizes 12x12x12
Max work group size 12
Preferred work group size multiple 1
Preferred / native vector sizes
char 16 / 16
short 16 / 16
int 16 / 16
long 0 / 0
half 0 / 0 (n/a)
float 16 / 16
double 0 / 0 (n/a)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs No
Round to nearest No
Round to zero Yes
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (n/a)
Address bits 32, Little-Endian
Global memory size 67108864 (64MiB)
Error Correction support No
Max memory allocation 67108864 (64MiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 64 bytes
Alignment of base address 512 bits (64 bytes)
Global Memory cache type Read/Write
Global Memory cache size <printDeviceInfo:89: get CL_DEVICE_GLOBAL_MEM_CACHE_SIZE : error -30>
Global Memory cache line 64 bytes
Image support No
Local memory type Global
Local memory size 67108864 (64MiB)
Max constant buffer size 67108864 (64MiB)
Max number of constant args 64
Max size of kernel argument 256
Queue properties
Out-of-order execution No
Profiling Yes
Prefer user sync for interop Yes
Profiling timer resolution 1ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
printf() buffer size 0
Built-in kernels
Device Available Yes
Compiler Available Yes
Linker Available Yes
Device Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_nv_pragma_unroll cl_arm_core_id cl_ext_atomic_counters_32 cl_khr_initialize_memory cl_arm_integer_dot_product_int8 cl_arm_integer_dot_product_accumulate_int8 cl_arm_integer_dot_product_accumulate_int16
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) OpenCL for the Raspberry Pi VideoCore IV GPU
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [VC4CL]
clCreateContext(NULL, ...) [default] Success [VC4CL]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name OpenCL for the Raspberry Pi VideoCore IV GPU
Device Name VideoCore IV GPU
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name OpenCL for the Raspberry Pi VideoCore IV GPU
Device Name VideoCore IV GPU
ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.2.11
ICD loader Profile OpenCL 2.1
How may I bring up the CPU as a second (and/or alternative) device?
My summary questions are as follows.
(1) How may I run clinfo (and other programs) without LD_LIBRARY_PATH=/usr/local/lib/ preceding a command short of making symlinks to /usr/lib/?
(2) How may I bring up the CPU as a second (and/or alternative) device?
(3) The device in clinfo's output is not usable owing to errors such as vc4c::CompilationError::CompilationError,vc4c::precompilation::compileOpenCLWithDefaultHeader, etc. I need to know how to diagnose the underlying cause. Is ./build/test/TestVC4C supposed to give multiple pages of errors and then freeze? If so, where else can I look?
I would be very grateful for any assistance with these problems.
I compiled and installed VC4CL with the OpenCV 4.01, Qt5 and LLVM 3.9 packages installed, and when I run my program I get this error:
[ INFO:0] Initialize OpenCL runtime...
OpenCV(ocl4dnn): consider to specify kernel configuration cache directory
via OPENCV_OCL4DNN_CONFIG_PATH parameter.
[ INFO:0] Successfully initialized OpenCL cache directory: /root/.cache/opencv/4.0/opencl_cache/
[ INFO:0] Preparing OpenCL cache configuration for context: 32-bit--Broadcom--VideoCore_IV_GPU--0_4
OpenCL program build log: dnn/dummy
Status -15: CL_COMPILE_PROGRAM_FAILURE
-cl-no-subgroup-ifp
[E] Sat Mar 9 14:55:07 2019: Errors in precompilation:
[E] Sat Mar 9 14:55:07 2019: error: unknown argument: '-cl-no-subgroup-ifp'
OpenCL program build log: dnn/conv_layer_spatial
Status -15: CL_COMPILE_PROGRAM_FAILURE
-D TYPE=1 -D Dtype=float -D Dtype2=float2 -D Dtype4=float4 -D Dtype8=float8 -D Dtype16=float16 -D as_Dtype=as_float -D as_Dtype2=as_float2 -D as_Dtype4=as_float4 -D as_Dtype8=as_float8 -D KERNEL_WIDTH=3 -D KERNEL_HEIGHT=3 -D STRIDE_X=2 -D STRIDE_Y=2 -D DILATION_X=1 -D DILATION_Y=1 -D KERNEL_BASIC -cl-fast-relaxed-math -D ConvolveBasic=BASIC_k3x3_cn3_g1_s2x2_d1x1_b1_in256x256_p1x1_num1_M32_activ1_eltwise0_FP32_4_1_1_1 -D CHANNELS=3 -D APPLY_BIAS=1 -D OUTPUT_Z=32 -D ZPAR=1 -D FUSED_CONV_RELU=1
[W] Sat Mar 9 14:55:09 2019: Warnings in precompilation:
[W] Sat Mar 9 14:55:09 2019: :1484:1: warning: null character ignored
<U+0000>
^
1 warning generated.
[E] Sat Mar 9 14:55:09 2019: Errors in precompilation:
[E] Sat Mar 9 14:55:09 2019: ERROR: Invalid value (Producer: 'LLVM6.0.0svn' Reader: 'LLVM 3.9.1')
/usr/bin/llvm-link: /usr/local/include/vc4cl-stdlib/VC4CLStdLib.bc: error: Corrupted bitcode
/usr/bin/llvm-link: error loading file '/usr/local/include/vc4cl-stdlib/VC4CLStdLib.bc'
Failed to compile kernel: BASIC_k3x3_cn3_g1_s2x2_d1x1_b1_in256x256_p1x1_num1_M32_activ1_eltwise0_FP32_4_1_1_1, buildflags: -D TYPE=1 -D Dtype=float -D Dtype2=float2 -D Dtype4=float4 -D Dtype8=float8 -D Dtype16=float16 -D as_Dtype=as_float -D as_Dtype2=as_float2 -D as_Dtype4=as_float4 -D as_Dtype8=as_float8 -D KERNEL_WIDTH=3 -D KERNEL_HEIGHT=3 -D STRIDE_X=2 -D STRIDE_Y=2 -D DILATION_X=1 -D DILATION_Y=1 -D KERNEL_BASIC -cl-fast-relaxed-math -D ConvolveBasic=BASIC_k3x3_cn3_g1_s2x2_d1x1_b1_in256x256_p1x1_num1_M32_activ1_eltwise0_FP32_4_1_1_1 -D CHANNELS=3 -D APPLY_BIAS=1 -D OUTPUT_Z=32 -D ZPAR=1 -D FUSED_CONV_RELU=1, errmsg: [W] Sat Mar 9 14:55:09 2019: Warnings in precompilation:
[W] Sat Mar 9 14:55:09 2019: :1484:1: warning: null character ignored
<U+0000>
^
1 warning generated.
[E] Sat Mar 9 14:55:09 2019: Errors in precompilation:
[E] Sat Mar 9 14:55:09 2019: ERROR: Invalid value (Producer: 'LLVM6.0.0svn' Reader: 'LLVM 3.9.1')
/usr/bin/llvm-link: /usr/local/include/vc4cl-stdlib/VC4CLStdLib.bc: error: Corrupted bitcode
/usr/bin/llvm-link: error loading file '/usr/local/include/vc4cl-stdlib/VC4CLStdLib.bc'
There appear to be two issues: the command-line argument -cl-no-subgroup-ifp, and the 'Corrupted bitcode' error for the file VC4CLStdLib.bc.
Can you advise on how to fix this?
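The "Invalid value" line in the log already names both LLVM versions involved, so the mismatch can be read off mechanically. The sketch below extracts the producer and reader versions from that message; the helper names `parse_bitcode_versions` and `producer_is_newer` are hypothetical, and the pattern assumes the exact message format shown in the log above. If the producer is newer than the reader, as it is here, a plausible explanation is that VC4CLStdLib.bc was built with a newer LLVM than the llvm-link that tries to load it, since older LLVM tools are, as far as I know, generally unable to read bitcode written by newer releases.

```python
import re

# Matches e.g. "(Producer: 'LLVM6.0.0svn' Reader: 'LLVM 3.9.1')".
_VERSIONS = re.compile(
    r"Producer: 'LLVM\s*(\d+(?:\.\d+)*)[^']*' Reader: 'LLVM\s*(\d+(?:\.\d+)*)[^']*'"
)


def parse_bitcode_versions(line):
    """Return (producer, reader) version tuples, or None if not found."""
    match = _VERSIONS.search(line)
    if match is None:
        return None

    def to_tuple(text):
        return tuple(int(part) for part in text.split("."))

    return to_tuple(match.group(1)), to_tuple(match.group(2))


def producer_is_newer(line):
    """True if the bitcode producer is newer than the reader, None if unknown."""
    versions = parse_bitcode_versions(line)
    if versions is None:
        return None
    producer, reader = versions
    return producer > reader


# Line copied from the build log above.
LOG_LINE = (
    "[E] Sat Mar 9 14:55:09 2019: ERROR: Invalid value "
    "(Producer: 'LLVM6.0.0svn' Reader: 'LLVM 3.9.1')"
)

print(parse_bitcode_versions(LOG_LINE))
print("producer newer than reader:", producer_is_newer(LOG_LINE))
```

If that reading is right, the two candidates to check are which LLVM built the installed VC4CLStdLib.bc and which llvm-link is found first at /usr/bin/llvm-link.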
I am trying to check that VC4CL works correctly.
With the following code, the output is different from what I expected and from the output in an NVIDIA environment.
kernel
kernel void hello(global float * x){
int ind = get_global_id(0);
x[ind] = x[ind] * 2;
}
The host code can be found at https://github.com/nomaddo/opencl-benchmark/blob/master/gpu.c .
This host code just takes the kernel file name, the function name, the number of arguments, and the length of each argument.
It then executes the kernel and prints all arguments.
pi@nomaddo-pi3:~/opencl-benchmark$ sudo ./gpu mul.cl hello 1 10
0.000000
1.000000
2.000000
3.000000
4.000000
10.000000
6.000000
7.000000
8.000000
9.000000
Runtime: 0.000419ms
The expected result (and the one produced on an NVIDIA GPU) is as follows:
nomaddo@nomaddo-AS:~/opencl-benchmark$ ./gpu mul.cl hello 1 10
0.000000
2.000000
4.000000
6.000000
8.000000
10.000000
12.000000
14.000000
16.000000
18.000000
Runtime: 0.000070ms
I use the latest VC4C, built by CircleCI, and a self-compiled VC4CL, which also comes from the latest source code.
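To narrow down which work-items produced wrong values, the two listings can be compared against a CPU reference. This is a minimal sketch, assuming the host initialises x[i] = i (which is consistent with the NVIDIA output being 2*i); the function names are illustrative and not part of the linked host code.

```python
def reference_double(values):
    """CPU reference for the kernel: every element multiplied by 2."""
    return [v * 2.0 for v in values]


def mismatched_indices(observed, expected, tol=1e-6):
    """Indices at which the device output differs from the reference."""
    return [
        i for i, (o, e) in enumerate(zip(observed, expected)) if abs(o - e) > tol
    ]


# Assumption: the input buffer holds 0.0 .. 9.0 before the kernel runs.
inputs = [float(i) for i in range(10)]
expected = reference_double(inputs)

# Values copied from the VC4CL run reported above.
observed_vc4cl = [0.0, 1.0, 2.0, 3.0, 4.0, 10.0, 6.0, 7.0, 8.0, 9.0]

bad = mismatched_indices(observed_vc4cl, expected)
print("mismatched indices:", bad)
```

Under that assumption only elements 0 and 5 match the reference, and element 0 matches trivially because 0 * 2 = 0, so effectively a single element appears to have been doubled while the rest kept their input values.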
I installed VC4CL and tried to run "clinfo". It shows the following:
""
[VC4CL] can't open /dev/mem
[VC4CL] This program should be run as root. Try prefixing command with: sudo
terminate called after throwing an instance of 'std::system_error'
what(): Failed to open /dev/mem: Permission denied
Aborted
""
When I run "sudo clinfo", it works.
When I test with PlaidML by running "plaidml-setup", I get the same error:
""
[VC4CL] can't open /dev/mem
[VC4CL] This program should be run as root. Try prefixing command with: sudo
terminate called after throwing an instance of 'std::system_error'
what(): Failed to open /dev/mem: Permission denied
Aborted
""
When I test with PlaidML by running "sudo plaidml-setup", it cannot find the device:
""
No supported devices found. Run 'clinfo' and file an issue containing the full output.
""
Can anyone help me?
Thanks.
I am using the instructions at 'https://github.com/doe300/VC4CL/wiki/How-to-get'.
I'm stumbling at the first stage of installing: there doesn't appear to be a working link, nor one on the nightly build either.
I must be missing something obvious. Could you point me in the right direction, please?
Any quick ideas what's going on?
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 136.308/136.308/136.308/0.000 ms
-- Looking for CL_VERSION_2_2
-- Looking for CL_VERSION_2_2 - found
-- Found OpenCL: /usr/lib/libOpenCL.so (found version "2.2")
-- Found headers for OpenCL up to version 2.2 located in /usr/include
-- VC4C library found: /usr/lib/libVC4CC.so
-- VC4C compiler header found at: /usr/include/vc4cc/VC4C.h
-- Building with ICD support
-- Checking for one of the modules 'ocl-icd>=1.3'
-- Found Khronos ICD Loader in version 2.2.12 in /usr/lib
-- Enabling register-poking to run kernels
-- found clang-format: /usr/bin/clang-format
-- Configuring done
-- Generating done
-- Build files have been written to: /home/alex/packages/vc4cl_arch/src/build
Scanning dependencies of target generate_icd
[ 3%] Generating VC4CL.icd
[ 3%] Built target generate_icd
Scanning dependencies of target VC4CL
[ 6%] Building CXX object build/CMakeFiles/VC4CL.dir/CommandQueue.cpp.o
[ 10%] Building CXX object build/CMakeFiles/VC4CL.dir/Buffer.cpp.o
c++: fatal error: no input files
compilation terminated.
distcc[1055] ERROR: compile (null) on localhost failed
c++: fatal error: no input files
compilation terminated.
distcc[1056] ERROR: compile (null) on localhost failed
/bin/sh: -g3: command not found
make[2]: *** [build/CMakeFiles/VC4CL.dir/build.make:63: build/CMakeFiles/VC4CL.dir/Buffer.cpp.o] Error 127
make[2]: *** Waiting for unfinished jobs....
/bin/sh: -g3: command not found
make[2]: *** [build/CMakeFiles/VC4CL.dir/build.make:76: build/CMakeFiles/VC4CL.dir/CommandQueue.cpp.o] Error 127
make[1]: *** [CMakeFiles/Makefile2:183: build/CMakeFiles/VC4CL.dir/all] Error 2
make: *** [Makefile:130: all] Error 2
==> ERROR: A failure occurred in build().
Aborting...
Using this:
https://github.com/alexzk1/vc4cl_arch/blob/master/PKGBUILD
When building a debian package on RPi using...
mkdir build && cd build
cmake .. -DBUILD_DEBUG=ON -DBUILD_DEB_PACKAGE=ON -DBUILD_TESTING=ON -DCROSS_COMPILE=ON -DBUILD_ICD=ON -DREGISTER_POKE_KERNELS=ON -DIMAGE_SUPPORT=ON
make
cpack -G DEB
an error is output concerning a missing VC4CL.icd.
CPack: Create package using DEB
CPack: Install projects
CPack: - Run preinstall target for: VC4CL
CPack: - Install project: VC4CL
CMake Error at /usr/local/src/VC4CL/build/cmake_install.cmake:44 (file):
file INSTALL cannot find "/usr/local/src/VC4CL/VC4CL.icd".
CPack Error: Error when generating package: vc4cl
It can be fixed by copying VC4CL.icd to the parent directory
cp VC4CL.icd ..
At some point this should be fixed in the build files... it's a minor issue, I've posted it here for reference.
...and getting
[ INFO:0] Initialize OpenCL runtime...
[ INFO:0] Successfully initialized OpenCL cache directory: /root/.cache/opencv/3.4.1/opencl_cache/
[ INFO:0] Preparing OpenCL cache configuration for context: 32-bit--Broadcom--VideoCore_IV_GPU--0_4
OpenCV(3.4.1) Error: Unknown error code -220 (OpenCL error CL_INVALID_VALUE (-30) during call: clGetProgramInfo(handle, CL_PROGRAM_BINARIES, sizeof(ptr), &ptr, NULL)) in getProgramBinary, file /build/opencv/src/opencv-3.4.1/modules/core/src/ocl.cpp, line 3752
[ WARN:0] Can't save OpenCL binary into cache: /root/.cache/opencv/3.4.1/opencl_cache/32-bit--Broadcom--VideoCore_IV_GPU--0_4/imgproc--filterSepRow_e99b92fca8604fe253f3c641802ce117.bin
OpenCV(3.4.1) /build/opencv/src/opencv-3.4.1/modules/core/src/ocl.cpp:3752: error: (-220) OpenCL error CL_INVALID_VALUE (-30) during call: clGetProgramInfo(handle, CL_PROGRAM_BINARIES, sizeof(ptr), &ptr, NULL) in function getProgramBinary
OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('row_filter_C1_D0', dims=2, globalsize=240x544x1, localsize=16x16x1) sync=false
OpenCV(3.4.1) Error: Unknown error code -220 (OpenCL error CL_INVALID_VALUE (-30) during call: clGetProgramInfo(handle, CL_PROGRAM_BINARIES, sizeof(ptr), &ptr, NULL)) in getProgramBinary, file /build/opencv/src/opencv-3.4.1/modules/core/src/ocl.cpp, line 3752
[ WARN:0] Can't save OpenCL binary into cache: /root/.cache/opencv/3.4.1/opencl_cache/32-bit--Broadcom--VideoCore_IV_GPU--0_4/imgproc--filterSepRow_e99b92fca8604fe253f3c641802ce117.bin
OpenCV(3.4.1) /build/opencv/src/opencv-3.4.1/modules/core/src/ocl.cpp:3752: error: (-220) OpenCL error CL_INVALID_VALUE (-30) during call: clGetProgramInfo(handle, CL_PROGRAM_BINARIES, sizeof(ptr), &ptr, NULL) in function getProgramBinary
OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('row_filter_C1_D0', dims=2, globalsize=240x544x1, localsize=16x16x1) sync=false
FPS 0.0205593, Objects: 0
OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('row_filter_C1_D0', dims=2, globalsize=240x544x1, localsize=16x16x1) sync=false
OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('row_filter_C1_D0', dims=2, globalsize=240x544x1, localsize=16x16x1) sync=false
What is the cause here: unsupported features, permissions, or OpenCL itself?
Just in case, I ran
sudo chmod 777 /root/.cache/opencv/3.4.1/opencl_cache/32-bit--Broadcom--VideoCore_IV_GPU--0_4
but it didn't work.
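For reference, the log requests localsize=16x16x1, which is 256 work-items per group, while the VideoCore IV device listing elsewhere on this page reports a maximum group size of 12. OpenCL 1.2 returns CL_INVALID_WORK_GROUP_SIZE when the requested local size exceeds the device or kernel limit, or when it does not evenly divide the global size. In the log above OpenCV chooses the local size internally, so the following is only a rough sketch for host code that sets the sizes itself. The function names are hypothetical, and the limit is assumed to have been queried already (for example via CL_DEVICE_MAX_WORK_GROUP_SIZE).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Largest divisor of `global` that does not exceed `limit` (at least 1).
std::size_t largestDivisorUpTo(std::size_t global, std::size_t limit)
{
    assert(global > 0 && limit > 0);
    for (std::size_t d = (limit < global ? limit : global); d > 1; --d) {
        if (global % d == 0)
            return d;
    }
    return 1;
}

// Picks a 2D local size whose product stays within maxGroupSize and whose
// components evenly divide the global size (hypothetical helper).
std::array<std::size_t, 2> pickLocalSize2D(std::size_t globalX, std::size_t globalY,
                                           std::size_t maxGroupSize)
{
    const std::size_t localX = largestDivisorUpTo(globalX, maxGroupSize);
    // Whatever budget is left after choosing localX goes to the Y dimension.
    const std::size_t localY = largestDivisorUpTo(globalY, maxGroupSize / localX);
    return std::array<std::size_t, 2>{{localX, localY}};
}
```

For the global size 240x544 from the log and a limit of 12, this yields a local size of 12x1. Passing no local size at all and letting the implementation choose is another option.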
Hello.
I installed everything on Ubuntu 18.04 and confirmed that VC4C is working.
However, while creating a cl::Buffer I get this error message:
[VC4CL] test_cl.o: ioctl_set_msg failed: -1
[VC4CL] Error in mbox_propertie: Inappropriate ioctl for device
Any idea what this may be?
Thanks in advance.
doe300, as you know, back on the Qrack project's Raspberry Pi 3 support pull request at unitaryfund/qrack#143, we get a segfault if our kernel calls receive a large list of events to wait on, but not if the wait list is manually waited on before calling the kernel.
The wait list is daisy-chained. We pass only the direct wait list dependencies for the kernel, while the events in the list can depend on their own separate sets of dependencies that aren't in the kernel list.
We have this pattern (which works):
#if ENABLE_RASPBERRYPI
clFinish();
#endif
// Dispatch the primary kernel, to apply the gate.
cl::Event kernelEvent;
std::vector<cl::Event> kernelWaitVec = device_context->ResetWaitEvents();
queue.enqueueNDRangeKernel(ocl.call, cl::NullRange, // kernel, offset
cl::NDRange(workItemCount), // global number of work items
cl::NDRange(localGroupSize), // local number (per group)
&kernelWaitVec, // vector of events to wait for
&kernelEvent); // handle to wait for the kernel
queue.flush();
#if ENABLE_RASPBERRYPI
clFinish();
#endif
The clFinish() method is a user code "soft finish" implementation; all it does is manually wait on the list of events returned by device_context->ResetWaitEvents(), and it clears the list. In the simplest case we've tested, the events in the list are just the loads into the set of buffers needed before calling the kernel. We pass these events in the list to the kernel, and we manually wait on the same buffer loads before exiting our method, to make sure that the loads complete before we could lose ephemeral inputs for the buffers.
This implementation has been tested extensively on NVIDIA GTX cards, Intel Core HDs, Intel Cores, and Intel Xeons, and it's never been an issue anywhere except on the Raspberry Pi.
I think this kind of fall-through asynchronous behavior is asking a lot of the Raspberry Pi's resources, but I don't think there's a logical problem with the chain of dependencies, if we remove the clFinish() calls for the Raspberry Pi.
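To make the "soft finish" described above concrete, here is a minimal sketch of the idea: wait on every event in a list, then clear the list. It is not Qrack's actual code. The template accepts any event type with a wait() member returning an integer status, which is the shape of cl::Event::wait() in the OpenCL C++ bindings. FakeEvent is a hypothetical stand-in so the sketch runs without a device.

```cpp
#include <cassert>
#include <vector>

// Waits on every event in the list, then clears the list.
// Returns the first non-zero status seen, or 0 (CL_SUCCESS) if all waits succeed.
template <typename Event>
int softFinish(std::vector<Event>& waitList)
{
    int firstError = 0;
    for (auto& event : waitList) {
        const int status = event.wait();
        if (status != 0 && firstError == 0)
            firstError = status;
    }
    // The events are no longer needed once they have completed.
    waitList.clear();
    return firstError;
}

// Hypothetical stand-in for cl::Event: counts how often wait() is called.
struct FakeEvent {
    int* waitCounter;
    int wait() const
    {
        assert(waitCounter != nullptr);
        ++*waitCounter;
        return 0;
    }
};
```

Calling softFinish on a std::vector<cl::Event> before enqueueNDRangeKernel corresponds to the ENABLE_RASPBERRYPI branch shown earlier, at the cost of serializing the host against the pending buffer loads.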
Candidates:
cl_khr_fp16 (OpenCL 1.2 extension specification, section 9.5): could calculate as float, use un/pack modes to load/store
cl_khr_image2d_from_buffer (OpenCL 1.2 extension specification, section 9.14): at least for raster texture-formats, this would be no problem.
cl_khr_initialize_memory (OpenCL 1.2 extension specification, section 9.15): local memory is already initialized to zero by the compiler if no other initial value is explicitly set; private memory is initialized to zero, if configured this way (via the extension).
cl_khr_terminate_context (OpenCL 1.2 extension specification, section 9.16): requires the currently running action (e.g. kernel execution) to be aborted, currently not supported.
cl_img_cached_allocations (here): the Mailbox supports (un)cached allocation, need to check if it has any side-effects
cl_khr_create_command_queue: adds an extension function which maps directly to the OpenCL 2.x clCreateCommandQueueWithProperties function, which is already supported.
When I run my QtApp which is linked to run the Experimental VC4 (full KMS) driver and OpenGL, I get a blue screen when running OpenCL with OpenCV. It appears the mesa driver and VC4CL are stomping on each other and are not compatible.
(On a freshly installed Raspbian distribution) test installation and usage of the three created Debian packages built with CircleCI.
Things to test:
/usr/local/include/vc4cl-stdlib?
libVC4CC.so in /usr/local/lib?
clinfo print correct results?
v3d_info and v3d_profile work?
I am not sure what happened, but the test cases seem to fail.
Thanks to #1, I have compiled the test cases in an ad-hoc way.
Unknown CMake command "ExternalProject_Get_Property".
TestVC4CL by hand
#define HAS_COMPILER 1
VC4CL_VERSION as constant string to avoid a compilation error
Can you tell me what happened?
Or can you tell me how to set up a correct environment (my environment may be wrong...)?
pi@nomaddo:~/work/VC4CL/test$ sudo ./TestVC4CL
Running suite 'TestSystem' with 1 tests...
Suite 'TestSystem' finished, 1/1 successful (100%) in 48 microseconds (0.048 ms).
Running suite 'TestPlatform' with 2 tests...
Test 'TestPlatform::testGetPlatformInfo()' failed!
Suite: TestPlatform
File: TestPlatform.cpp
Line: 56
Failure: Assertion 'platform_config::VERSION.compare(buffer) == 0' failed
Test-method 'TestPlatform::testGetPlatformInfo()' finished with errors!
Suite 'TestPlatform' finished, 1/2 successful (50%) in 401 microseconds (0.401 ms).
Running suite 'TestDevice' with 5 tests...
Suite 'TestDevice' finished, 5/5 successful (100%) in 1665 microseconds (1.665 ms).
Running suite 'TestContext' with 5 tests...
Suite 'TestContext' finished, 5/5 successful (100%) in 198 microseconds (0.198 ms).
Running suite 'TestCommandQueue' with 4 tests...
Suite 'TestCommandQueue' finished, 4/4 successful (100%) in 568 microseconds (0.568 ms).
Running suite 'TestBuffer' with 17 tests...
Suite 'TestBuffer' finished, 17/17 successful (100%) in 1536 microseconds (1.536 ms).
Running suite '' with 0 tests...
Suite '' finished, 0/0 successful (0%) in 0 microseconds (0 ms).
Running suite 'TestProgram' with 11 tests...
Test 'TestProgram::testCompileProgram()' failed!
Suite: TestProgram
File: TestProgram.cpp
Line: 109
Failure: Got -3, expected 0
Test-method 'TestProgram::testCompileProgram()' finished with errors!
Test 'TestProgram::testLinkProgram()' failed!
Suite: TestProgram
File: TestProgram.cpp
Line: 118
Failure: Got -59, expected 0
Test 'TestProgram::testLinkProgram()' failed!
Suite: TestProgram
File: TestProgram.cpp
Line: 119
Failure: Got 0, expected 0x138a06c
Test-method 'TestProgram::testLinkProgram()' finished with errors!
Test 'TestProgram::testUnloadPlatformCompiler()' failed!
Suite: TestProgram
File: TestProgram.cpp
Line: 129
Failure: Got 5, expected 13
Test-method 'TestProgram::testUnloadPlatformCompiler()' finished with errors!
Test 'TestProgram::testGetProgramInfo()' failed!
Suite: TestProgram
File: TestProgram.cpp
Line: 180
Failure: Got 0, expected -30
Test 'TestProgram::testGetProgramInfo()' failed!
Suite: TestProgram
File: TestProgram.cpp
Line: 199
Failure: Got -45, expected 0
Test 'TestProgram::testGetProgramInfo()' failed!
Suite: TestProgram
File: TestProgram.cpp
Line: 200
Failure: Got 0, expected 4
Test 'TestProgram::testGetProgramInfo()' failed!
Suite: TestProgram
File: TestProgram.cpp
Line: 201
Failure: Got 0, expected 1
Test 'TestProgram::testGetProgramInfo()' failed!
Suite: TestProgram
File: TestProgram.cpp
Line: 204
Failure: Got -45, expected 0
Test-method 'TestProgram::testGetProgramInfo()' finished with errors!
Test 'TestProgram::testGetProgramBuildInfo()' failed!
Suite: TestProgram
File: TestProgram.cpp
Line: 139
Failure: Got -1, expected 0
Test 'TestProgram::testGetProgramBuildInfo()' failed!
Suite: TestProgram
File: TestProgram.cpp
Line: 152
Failure: Got 0, expected 4
Test-method 'TestProgram::testGetProgramBuildInfo()' finished with errors!
Suite 'TestProgram' finished, 6/11 successful (54.54%) in 5657030 microseconds (5657.03 ms).
Running suite 'TestKernel' with 13 tests...
Test 'TestKernel::testCreateKernel()' failed!
Suite: TestKernel
File: TestKernel.cpp
Line: 62
Failure: Got -45, expected 0
Test 'TestKernel::testCreateKernel()' failed!
Suite: TestKernel
File: TestKernel.cpp
Line: 63
Failure: Assertion 'kernel != NULL' failed
Test-method 'TestKernel::testCreateKernel()' finished with errors!
Test 'TestKernel::testCreateKernelsInProgram()' failed!
Suite: TestKernel
File: TestKernel.cpp
Line: 71
Failure: Got -45, expected 0
Test 'TestKernel::testCreateKernelsInProgram()' failed!
Suite: TestKernel
File: TestKernel.cpp
Line: 72
Failure: Got 0, expected 1
Segmentation fault
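To read the "Got N" values in the test output above, it can help to translate them into OpenCL error names. The sketch below is a hypothetical helper (not part of VC4CL or its test suite) that covers only the codes appearing on this page, with values taken from the OpenCL 1.2 headers. For example, -3 is CL_COMPILER_NOT_AVAILABLE and -45 is CL_INVALID_PROGRAM_EXECUTABLE. That would be consistent with the library having been built without a working compiler, although this is only a guess from the codes.

```cpp
#include <cassert>
#include <string>

// Maps the OpenCL status codes seen in the logs on this page to their names.
// Only a subset of the OpenCL 1.2 error codes is listed.
std::string clErrorName(int code)
{
    switch (code) {
    case 0:   return "CL_SUCCESS";
    case -1:  return "CL_DEVICE_NOT_FOUND";
    case -3:  return "CL_COMPILER_NOT_AVAILABLE";
    case -15: return "CL_COMPILE_PROGRAM_FAILURE";
    case -30: return "CL_INVALID_VALUE";
    case -45: return "CL_INVALID_PROGRAM_EXECUTABLE";
    case -54: return "CL_INVALID_WORK_GROUP_SIZE";
    case -59: return "CL_INVALID_OPERATION";
    default:  return "unknown (" + std::to_string(code) + ")";
    }
}
```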
Currently, VC4C and VC4CL are kept compatible with the ancient GCC 4.8 to support older Raspbian systems. GCC 4.8 barely supports C++11 [1] and hence lacks a lot of useful new features provided in C++14 and C++17. The latest Raspbian Stretch ships with GCC 6.3, which completely supports C++14 as well as a lot of C++17 features [1].
Some useful C++14/C++17 features we could use when switching [1]:
__has_include, e.g. for distinguishing LLVM versions
string_view, file-system support, optional and any in experimental namespace
Since the official raspberrypi-tools repository has not yet been upgraded to GCC 6.3 (see here), we would need to use another tool-chain for our CircleCI cross-compilation such as this one (see also here).
So the question is:
Should we drop support for GCC 4.8 and therefore for older Raspbian systems?
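As a rough illustration of what __has_include would allow, the sketch below selects std::string_view when the header and language level are available and falls back to const std::string& otherwise. The names StringRef, HAVE_STD_STRING_VIEW and countChar are made up for this example. It shows the general technique, not the LLVM version check mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Use std::string_view only if the compiler can find the header
// and is compiling as C++17 or newer.
#if defined(__has_include)
#if __has_include(<string_view>) && __cplusplus >= 201703L
#include <string_view>
#define HAVE_STD_STRING_VIEW 1
using StringRef = std::string_view;
#endif
#endif

// Fallback for older compilers or standards.
#ifndef HAVE_STD_STRING_VIEW
#define HAVE_STD_STRING_VIEW 0
using StringRef = const std::string&;
#endif

// Counts occurrences of c; works identically with either StringRef definition.
inline std::size_t countChar(StringRef text, char c)
{
    std::size_t count = 0;
    for (char current : text) {
        if (current == c)
            ++count;
    }
    return count;
}
```

A compiler without __has_include (such as GCC 4.8) simply takes the fallback branch, so the same source builds on both old and new toolchains.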
on arch linux arm aarch64, clang7
Scanning dependencies of target generate_icd
[ 3%] Generating VC4CL.icd
[ 3%] Built target generate_icd
Scanning dependencies of target VC4CL
[ 7%] Building CXX object src/CMakeFiles/VC4CL.dir/barriers.cpp.o
[ 10%] Building CXX object src/CMakeFiles/VC4CL.dir/Buffer.cpp.o
[ 14%] Building CXX object src/CMakeFiles/VC4CL.dir/CommandQueue.cpp.o
[ 17%] Building CXX object src/CMakeFiles/VC4CL.dir/common.cpp.o
[ 21%] Building CXX object src/CMakeFiles/VC4CL.dir/Context.cpp.o
[ 25%] Building CXX object src/CMakeFiles/VC4CL.dir/Device.cpp.o
[ 28%] Building CXX object src/CMakeFiles/VC4CL.dir/Event.cpp.o
[ 32%] Building CXX object src/CMakeFiles/VC4CL.dir/executor.cpp.o
/home/as3ii/VC4CL/src/executor.cpp: In function 'unsigned int AS_GPU_ADDRESS(const unsigned int*, vc4cl::DeviceBuffer*)':
/home/as3ii/VC4CL/src/executor.cpp:39:24: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
const char* tmp = *reinterpret_cast<const char**>(&ptr);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/as3ii/VC4CL/src/executor.cpp: In function 'cl_int executeKernel(vc4cl::KernelExecution&)':
/home/as3ii/VC4CL/src/executor.cpp:261:79: warning: conversion from 'long unsigned int' to 'uint32_t' {aka 'unsigned int'} may change value [-Wconversion]
uint32_t stackFrameSize = kernel->program->moduleInfo.getStackFrameSize() * sizeof(uint64_t);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~
[ 35%] Building CXX object src/CMakeFiles/VC4CL.dir/extensions.cpp.o
[ 39%] Building CXX object src/CMakeFiles/VC4CL.dir/icd_loader.cpp.o
[ 42%] Building CXX object src/CMakeFiles/VC4CL.dir/Image.cpp.o
[ 46%] Building CXX object src/CMakeFiles/VC4CL.dir/Kernel.cpp.o
[ 50%] Building CXX object src/CMakeFiles/VC4CL.dir/Mailbox.cpp.o
[ 53%] Building CXX object src/CMakeFiles/VC4CL.dir/ObjectTracker.cpp.o
[ 57%] Building CXX object src/CMakeFiles/VC4CL.dir/PerformanceCounter.cpp.o
/home/as3ii/VC4CL/src/PerformanceCounter.cpp: In member function 'cl_int vc4cl::PerformanceCounter::getValue(cl_uint*) const':
/home/as3ii/VC4CL/src/PerformanceCounter.cpp:43:40: warning: conversion from 'int64_t' {aka 'long int'} to 'cl_uint' {aka 'unsigned int'} may change value [-Wconversion]
*value = V3D::instance().getCounter(index);
~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
[ 60%] Building CXX object src/CMakeFiles/VC4CL.dir/Platform.cpp.o
[ 64%] Building CXX object src/CMakeFiles/VC4CL.dir/Program.cpp.o
[ 67%] Building CXX object src/CMakeFiles/VC4CL.dir/queue_handler.cpp.o
[ 71%] Building CXX object src/CMakeFiles/VC4CL.dir/TextureFormat.cpp.o
[ 75%] Building CXX object src/CMakeFiles/VC4CL.dir/V3D.cpp.o
[ 78%] Linking CXX shared library libVC4CL.so
[ 78%] Built target VC4CL
Scanning dependencies of target v3d_profile
[ 82%] Building CXX object tools/CMakeFiles/v3d_profile.dir/V3DProfile.cpp.o
In file included from /home/as3ii/VC4CL/tools/../src/V3D.h:10,
from /home/as3ii/VC4CL/tools/common.h:10,
from /home/as3ii/VC4CL/tools/V3DProfile.cpp:7:
/home/as3ii/VC4CL/tools/../src/common.h:46:106: warning: ignoring attributes on template argument 'cl_int' {aka 'int'} [-Wignored-attributes]
CHECK_RETURN typename std::enable_if<std::is_arithmetic<T>::value | std::is_pointer<T>::value, cl_int>::type
^
[ 85%] Linking CXX executable v3d_profile
/usr/bin/ld: ../src/libVC4CL.so.0.4: undefined reference to `bcm_host_deinit'
/usr/bin/ld: ../src/libVC4CL.so.0.4: undefined reference to `bcm_host_init'
/usr/bin/ld: ../src/libVC4CL.so.0.4: undefined reference to `bcm_host_get_peripheral_address'
collect2: error: ld returned 1 exit status
make[2]: *** [tools/CMakeFiles/v3d_profile.dir/build.make:88: tools/v3d_profile] Error 1
make[1]: *** [CMakeFiles/Makefile2:247: tools/CMakeFiles/v3d_profile.dir/all] Error 2
make: *** [Makefile:130: all] Error 2
I cloned and built VC4C and SPIRV-LLVM but can't pass the tests. Could you provide an image for the Raspberry Pi 3?
Hi,
I am the maintainer of OpenCL Caffe (https://github.com/naibaf7/caffe) and (https://github.com/BVLC/caffe/tree/opencl).
I would like to get this running on VC4CL, but I am facing some issues. At its core, though, it should be possible because Caffe can now be run solely based on internal OpenCL kernels (no external OpenCL library dependency, except for ViennaCL's infrastructure, but not its kernels). This also means full control over what work group sizes are required by the kernels, and it's possible to add quirks/workarounds specific to VC4CL.
Is there a way to chat with @doe300 directly? This may get quite involved (due to the scope of OpenCL Caffe).
Hello.
I'm using a Raspberry Pi 3 to optimise some code. I've already run some test programs like summing two arrays, etc.
Now, I'm stuck with the following error.
[D] Thu Mar 7 17:04:25 2019: Compiling 'src/libviso2/libviso2/kernels/get_inliers.cl' into 'out' with optimization level 2 and options '' ...
[D] Thu Mar 7 17:04:25 2019: Temporary file '/tmp/vc4c-EQ5dwJ' created
[D] Thu Mar 7 17:04:25 2019: Temporary file '/tmp/vc4c-i0dILi' created
[I] Thu Mar 7 17:04:25 2019: Compiling OpenCL to LLVM-IR with: /usr/bin/clang-3.9 -cc1 -triple spir-unknown-unknown -I src/libviso2/libviso2/kernels -O3 -ffp-contract=off -cl-std=CL1.2 -cl-kernel-arg-info -cl-single-precision-constant -Wno-undefined-inline -Wno-unused-parameter -Wno-unused-local-typedef -Wno-gcc-compat -finclude-default-header -include /usr/local/include/vc4cl-stdlib/defines.h -x cl -S -emit-llvm-bc -o /tmp/vc4c-i0dILi src/libviso2/libviso2/kernels/get_inliers.cl
[E] Thu Mar 7 17:04:25 2019: Errors in precompilation:
[E] Thu Mar 7 17:04:25 2019: Stack dump:
0. Program arguments: /usr/bin/clang-3.9 -cc1 -triple spir-unknown-unknown -I src/libviso2/libviso2/kernels -O3 -ffp-contract=off -cl-std=CL1.2 -cl-kernel-arg-info -cl-single-precision-constant -Wno-undefined-inline -Wno-unused-parameter -Wno-unused-local-typedef -Wno-gcc-compat -finclude-default-header -include /usr/local/include/vc4cl-stdlib/defines.h -x cl -S -emit-llvm-bc -o /tmp/vc4c-i0dILi src/libviso2/libviso2/kernels/get_inliers.cl
Segmentation fault
[E] Thu Mar 7 17:04:25 2019: (1) /usr/local/lib/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xf4 [0x76aa9d44]
[E] Thu Mar 7 17:04:25 2019: (2) /usr/local/lib/libVC4CC.so.1.2 : +0x8200ec [0x76dc00ec]
[E] Thu Mar 7 17:04:25 2019: (3) /usr/local/lib/libVC4CC.so.1.2 : +0x820498 [0x76dc0498]
[E] Thu Mar 7 17:04:25 2019: (4) /usr/local/lib/libVC4CC.so.1.2 : vc4c::precompilation::compileOpenCLWithDefaultHeader(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)+0x64 [0x76dc0860]
[E] Thu Mar 7 17:04:25 2019: (5) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&), void (*)(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)>::_M_invoke(std::_Any_data const&, vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)+0x60 [0x76dc525c]
[E] Thu Mar 7 17:04:25 2019: (6) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)>::operator()(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&) const+0x78 [0x76dc4200]
[E] Thu Mar 7 17:04:25 2019: (7) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> vc4c::precompilation::chainSteps<(vc4c::SourceType)3, (vc4c::SourceType)1, (vc4c::SourceType)3>(std::function<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&, std::function<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)3>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&)::{lambda(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)#1}::operator()(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&) const+0xa0 [0x76dc336c]
[E] Thu Mar 7 17:04:25 2019: (8) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&), std::function<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> vc4c::precompilation::chainSteps<(vc4c::SourceType)3, (vc4c::SourceType)1, (vc4c::SourceType)3>(std::function<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&, std::function<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)3>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&)::{lambda(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)#1}>::_M_invoke(std::_Any_data const&, vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)+0x5c [0x76dc5630]
[E] Thu Mar 7 17:04:25 2019: (9) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)>::operator()(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&) const+0x78 [0x76dc4200]
[E] Thu Mar 7 17:04:25 2019: (10) /usr/local/lib/libVC4CC.so.1.2 : vc4c::Precompiler::run(std::unique_ptr<std::istream, std::default_delete<std::istream> >&, vc4c::SourceType, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >)+0x608 [0x76dca260]
[E] Thu Mar 7 17:04:25 2019: (11) /usr/local/lib/libVC4CC.so.1.2 : vc4c::Precompiler::precompile(std::istream&, std::unique_ptr<std::istream, std::default_delete<std::istream> >&, vc4c::Configuration, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&, vc4c::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >)+0x164 [0x76dc7fd8]
[E] Thu Mar 7 17:04:25 2019: (12) /usr/local/lib/libVC4CC.so.1.2 : vc4c::Compiler::compile(std::istream&, std::ostream&, vc4c::Configuration, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)+0xc4 [0x76aab5dc]
[E] Thu Mar 7 17:04:25 2019: (13) VC4C : main+0xf04 [0x375a4]
[E] Thu Mar 7 17:04:25 2019: (14) /lib/arm-linux-gnueabihf/libc.so.6 : __libc_start_main+0x114 [0x73be5678]
[D] Thu Mar 7 17:04:25 2019: Temporary file '/tmp/vc4c-i0dILi' deleted
[D] Thu Mar 7 17:04:25 2019: Temporary file '/tmp/vc4c-EQ5dwJ' deleted
[E] Thu Mar 7 17:04:25 2019: Compiler threw exception: Pre-compilation: Error in precompilation: Stack dump:
0. Program arguments: /usr/bin/clang-3.9 -cc1 -triple spir-unknown-unknown -I src/libviso2/libviso2/kernels -O3 -ffp-contract=off -cl-std=CL1.2 -cl-kernel-arg-info -cl-single-precision-constant -Wno-undefined-inline -Wno-unused-parameter -Wno-unused-local-typedef -Wno-gcc-compat -finclude-default-header -include /usr/local/include/vc4cl-stdlib/defines.h -x cl -S -emit-llvm-bc -o /tmp/vc4c-i0dILi src/libviso2/libviso2/kernels/get_inliers.cl
Segmentation fault
terminate called after throwing an instance of 'vc4c::CompilationError'
what(): Pre-compilation: Error in precompilation: Stack dump:
0. Program arguments: /usr/bin/clang-3.9 -cc1 -triple spir-unknown-unknown -I src/libviso2/libviso2/kernels -O3 -ffp-contract=off -cl-std=CL1.2 -cl-kernel-arg-info -cl-single-precision-constant -Wno-undefined-inline -Wno-unused-parameter -Wno-unused-local-typedef -Wno-gcc-compat -finclude-default-header -include /usr/local/include/vc4cl-stdlib/defines.h -x cl -S -emit-llvm-bc -o /tmp/vc4c-i0dILi src/libviso2/libviso2/kernels/get_inliers.cl
Segmentation fault
Aborted
This happens even in a kernel that had already worked.
The error returned by OpenCL is CL_COMPILE_PROGRAM_FAILURE (-15).
Any ideas?
Thanks in advance.
Edit:
Sometimes, the error is this one...
[D] Thu Mar 7 17:18:02 2019: Compiling 'kernels/kernel.cl' into 'out' with optimization level 2 and options '' ...
[D] Thu Mar 7 17:18:02 2019: Temporary file '/tmp/vc4c-2aReX3' created
[D] Thu Mar 7 17:18:02 2019: Temporary file '/tmp/vc4c-cPNtDX' created
[I] Thu Mar 7 17:18:02 2019: Compiling OpenCL to LLVM-IR with: /usr/bin/clang-3.9 -cc1 -triple spir-unknown-unknown -I kernels -O3 -ffp-contract=off -cl-std=CL1.2 -cl-kernel-arg-info -cl-single-precision-constant -Wno-undefined-inline -Wno-unused-parameter -Wno-unused-local-typedef -Wno-gcc-compat -finclude-default-header -include /usr/local/include/vc4cl-stdlib/defines.h -x cl -S -emit-llvm-bc -o /tmp/vc4c-cPNtDX kernels/kernel.cl
[E] Thu Mar 7 17:18:02 2019: Errors in precompilation:
[E] Thu Mar 7 17:18:02 2019: Illegal instruction
[E] Thu Mar 7 17:18:02 2019: (1) /usr/local/lib/libVC4CC.so.1.2 : vc4c::CompilationError::CompilationError(vc4c::CompilationStep, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xf4 [0x76a8fd44]
[E] Thu Mar 7 17:18:02 2019: (2) /usr/local/lib/libVC4CC.so.1.2 : +0x8200ec [0x76da60ec]
[E] Thu Mar 7 17:18:02 2019: (3) /usr/local/lib/libVC4CC.so.1.2 : +0x820498 [0x76da6498]
[E] Thu Mar 7 17:18:02 2019: (4) /usr/local/lib/libVC4CC.so.1.2 : vc4c::precompilation::compileOpenCLWithDefaultHeader(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)+0x64 [0x76da6860]
[E] Thu Mar 7 17:18:02 2019: (5) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&), void (*)(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)>::_M_invoke(std::_Any_data const&, vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)+0x60 [0x76dab25c]
[E] Thu Mar 7 17:18:02 2019: (6) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)>::operator()(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&) const+0x78 [0x76daa200]
[E] Thu Mar 7 17:18:02 2019: (7) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> vc4c::precompilation::chainSteps<(vc4c::SourceType)3, (vc4c::SourceType)1, (vc4c::SourceType)3>(std::function<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&, std::function<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)3>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&)::{lambda(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)#1}::operator()(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&) const+0xa0 [0x76da936c]
[E] Thu Mar 7 17:18:02 2019: (8) /usr/local/lib/libVC4CC.so.1.2 : std::_Function_handler<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&), std::function<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> vc4c::precompilation::chainSteps<(vc4c::SourceType)3, (vc4c::SourceType)1, (vc4c::SourceType)3>(std::function<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&, std::function<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)3>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)> const&)::{lambda(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)#1}>::_M_invoke(std::_Any_data const&, vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)+0x5c [0x76dab630]
[E] Thu Mar 7 17:18:02 2019: (9) /usr/local/lib/libVC4CC.so.1.2 : std::function<void (vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&)>::operator()(vc4c::precompilation::PrecompilationSource<(vc4c::SourceType)1>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::precompilation::PrecompilationResult<(vc4c::SourceType)3>&) const+0x78 [0x76daa200]
[E] Thu Mar 7 17:18:02 2019: (10) /usr/local/lib/libVC4CC.so.1.2 : vc4c::Precompiler::run(std::unique_ptr<std::istream, std::default_delete<std::istream> >&, vc4c::SourceType, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >)+0x608 [0x76db0260]
[E] Thu Mar 7 17:18:02 2019: (11) /usr/local/lib/libVC4CC.so.1.2 : vc4c::Precompiler::precompile(std::istream&, std::unique_ptr<std::istream, std::default_delete<std::istream> >&, vc4c::Configuration, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&, vc4c::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >)+0x164 [0x76dadfd8]
[E] Thu Mar 7 17:18:02 2019: (12) /usr/local/lib/libVC4CC.so.1.2 : vc4c::Compiler::compile(std::istream&, std::ostream&, vc4c::Configuration, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, vc4c::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)+0xc4 [0x76a915dc]
[E] Thu Mar 7 17:18:02 2019: (13) VC4C : main+0xf04 [0x375a4]
[E] Thu Mar 7 17:18:02 2019: (14) /lib/arm-linux-gnueabihf/libc.so.6 : __libc_start_main+0x114 [0x73bcb678]
[D] Thu Mar 7 17:18:02 2019: Temporary file '/tmp/vc4c-cPNtDX' deleted
[D] Thu Mar 7 17:18:02 2019: Temporary file '/tmp/vc4c-2aReX3' deleted
[E] Thu Mar 7 17:18:02 2019: Compiler threw exception: Pre-compilation: Error in precompilation: Illegal instruction
terminate called after throwing an instance of 'vc4c::CompilationError'
what(): Pre-compilation: Error in precompilation: Illegal instruction
Aborted
Are they related?
This is what clinfo returns right now:
Number of platforms 1
Platform Name OpenCL for the Raspberry Pi VideoCore IV GPU
Platform Vendor doe300
Platform Version OpenCL 1.2 VC4CL 0.4
Platform Profile EMBEDDED_PROFILE
Platform Extensions cl_khr_il_program cl_khr_spir cl_khr_create_command_queue cl_altera_device_temperature cl_altera_live_object_tracking cl_khr_icd cl_vc4cl_performance_counters
Platform Extensions function suffix VC4CL
Platform Name OpenCL for the Raspberry Pi VideoCore IV GPU
Number of devices 1
Device Name VideoCore IV GPU
Device Vendor Broadcom
Device Vendor ID 0xa5c
Device Version OpenCL 1.2 VC4CL 0.4
Driver Version 0.4
Device OpenCL C Version OpenCL C 1.2
Device Type GPU
Device Profile EMBEDDED_PROFILE
Max compute units 1
Max clock frequency 300MHz
Device Partition (core)
Max number of sub-devices 0
Supported partition types None
Max work item dimensions 3
Max work item sizes 12x12x12
Max work group size 12
Preferred work group size multiple <getWGsizes:498: build program : error -15>
Preferred / native vector sizes
char 16 / 16
short 16 / 16
int 16 / 16
long 0 / 0
half 0 / 0 (n/a)
float 16 / 16
double 0 / 0 (n/a)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs No
Round to nearest No
Round to zero Yes
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (n/a)
Address bits 32, Little-Endian
Global memory size 134217728 (128MiB)
Error Correction support No
Max memory allocation 134217728 (128MiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 64 bytes
Alignment of base address 512 bits (64 bytes)
Global Memory cache type Read/Write
Global Memory cache size <printDeviceInfo:89: get CL_DEVICE_GLOBAL_MEM_CACHE_SIZE : error -30>
Global Memory cache line 64 bytes
Image support No
Local memory type Global
Local memory size 134217728 (128MiB)
Max constant buffer size 134217728 (128MiB)
Max number of constant args 64
Max size of kernel argument 256
Queue properties
Out-of-order execution No
Profiling Yes
Prefer user sync for interop Yes
Profiling timer resolution 1ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
printf() buffer size 0
Built-in kernels
Device Available Yes
Compiler Available Yes
Linker Available Yes
Device Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_nv_pragma_unroll cl_arm_core_id cl_ext_atomic_counters_32 cl_khr_initialize_memory cl_arm_integer_dot_product_int8 cl_arm_integer_dot_product_accumulate_int8 cl_arm_integer_dot_product_accumulate_int16
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) OpenCL for the Raspberry Pi VideoCore IV GPU
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [VC4CL]
clCreateContext(NULL, ...) [default] Success [VC4CL]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name OpenCL for the Raspberry Pi VideoCore IV GPU
Device Name VideoCore IV GPU
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name OpenCL for the Raspberry Pi VideoCore IV GPU
Device Name VideoCore IV GPU
ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.2.11
ICD loader Profile OpenCL 2.1
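The two numeric status codes in the clinfo output above (-15 for the work-group size query and -30 for the cache size query) can be read against the constants in the Khronos `CL/cl.h` header. The following is only a small lookup sketch for those two codes; the helper name `cl_error_to_name` is made up for illustration and is not part of VC4CL or clinfo.

```c
#include <stddef.h>
#include <string.h>

/* Symbolic names for the two status codes seen in the clinfo output,
 * using the values defined in the Khronos CL/cl.h header. */
struct cl_error_name {
    int code;
    const char *name;
};

static const struct cl_error_name CL_ERROR_NAMES[] = {
    { -15, "CL_COMPILE_PROGRAM_FAILURE" },
    { -30, "CL_INVALID_VALUE" },
};

/* Hypothetical helper: returns the symbolic name for a status code,
 * or "unknown" if the code is not in the (deliberately tiny) table. */
const char *cl_error_to_name(int code) {
    size_t count = sizeof CL_ERROR_NAMES / sizeof CL_ERROR_NAMES[0];
    for (size_t i = 0; i < count; ++i) {
        if (CL_ERROR_NAMES[i].code == code) {
            return CL_ERROR_NAMES[i].name;
        }
    }
    return "unknown";
}
```

Read this way, the first error says the probe kernel that clinfo builds failed to compile, and the second says the cache size query was rejected as an invalid value.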
Khronos switched its OpenCL headers to a single unified set, in which the OpenCL version to target is selected by macros (see KhronosGroup/OpenCL-Headers@de26592).
This does not yet affect "default" builds, since Raspbian still ships the old OpenCL 1.2-specific headers, but it causes problems when the upstream headers are used to build this library.
See also #33 (comment)
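With the unified upstream headers, the version is selected by defining `CL_TARGET_OPENCL_VERSION` before the first OpenCL include. A minimal sketch of requesting the 1.2 API follows. The `__has_include` guard and the helper `target_opencl_version` are additions for illustration only, so that the snippet also compiles on a machine without the headers. This is not how VC4CL itself handles the change.

```c
/* Request the OpenCL 1.2 API from the unified Khronos headers.
 * This must be defined before CL/cl.h is included for the first time. */
#define CL_TARGET_OPENCL_VERSION 120

/* Illustration only: include the headers if they are installed, so the
 * sketch still compiles on systems that do not have them. */
#if defined(__has_include)
#  if __has_include(<CL/cl.h>)
#    include <CL/cl.h>
#  endif
#endif

/* Hypothetical helper: reports which API version this translation unit
 * was compiled against (120 means OpenCL 1.2). */
int target_opencl_version(void) {
    return CL_TARGET_OPENCL_VERSION;
}
```

The same effect can be had by passing the definition on the compiler command line instead of in the source, which keeps the sources unchanged for the old header format.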
Good news! I've finally managed to compile and install everything on my RPi and the tests look good...
Suite 'TestImage' finished, 8/10 successful (80%) in 276470 microseconds (276.47 ms).
Suite 'TestExecutions' finished, 13/15 successful (86.66%) in 100462261 microseconds (100462 ms).
Is this an OK result, or should it be 100% for both? Are these all of the test suites, or is something missing?
CMake was called like this:
cmake .. -DBUILD_DEBUG=ON -DBUILD_DEB_PACKAGE=ON -DBUILD_TESTING=ON -DCROSS_COMPILE=ON -DBUILD_ICD=ON -DREGISTER_POKE_KERNELS=ON -DIMAGE_SUPPORT=ON
I am still getting some strange errors, but perhaps that is because I built with -DIMAGE_SUPPORT=ON. Attached is the 4285-line log from a build with -DDEBUG=ON; perhaps it is helpful (or not).
May I consider VC4CL properly installed and ready for doing something useful with it (such as porting a TensorFlow Lite model to OpenCL via a code generator)?
I am a beginner with OpenCL, but when I execute the clinfo command, I can see that I have one platform with the following information:
Number of platforms 1
Platform Name OpenCL for the Raspberry Pi VideoCore IV GPU
Platform Vendor doe300
Platform Version OpenCL 1.2 VC4CL 0.4
Platform Profile EMBEDDED_PROFILE
Platform Extensions cl_khr_il_program cl_altera_device_temperature cl_altera_live_object_tracking cl_khr_icd cl_vc4cl_performance_counters
Platform Extensions function suffix VC4CL
Platform Name OpenCL for the Raspberry Pi VideoCore IV GPU
Number of devices 1
Device Name VideoCore IV GPU
[...]
terminate called after throwing an instance of 'std::runtime_error'
what(): Failed to enable QPUs!
Aborted
However, it fails and throws an exception before finishing. How can I solve this?
CMake configuration failed when I tried cross-compilation, because the call to find_package(OpenCL REQUIRED) failed.
During this call, CMake tries to compile a C program using CMAKE_C_COMPILER.
The problem is that this compiler is run without any option that adds include directories.
So, even though ${CROSS_COMPILE}/include/CL exists, finding OpenCL fails.
CMakeLists.txt probably needs to be modified.
My configuration command is as follows. Am I missing some setting?
cmake -DBUILD_TESTING=ON -DCROSS_COMPILE=ON -DCROSS_COMPILER_PATH=/opt/gcc-linaro-7.1.1-2017.08-x86_64_arm-linux-gnueabihf
Of course, /opt/gcc-linaro-7.1.1-2017.08-x86_64_arm-linux-gnueabihf/include/CL exists, and invoking the compiler by hand as
/opt/gcc-linaro-7.1.1-2017.08-x86_64_arm-linux-gnueabihf/bin/arm-linux-gnueabihf-gcc -I/opt/gcc-linaro-7.1.1-2017.08-x86_64_arm-linux-gnueabihf/include
works correctly.
Have you considered integration with the vc4 kernel driver for dispatching your CL jobs? It could expose a root-only ioctl taking the start pointers and count and trusting the user. I've got some old code at https://github.com/anholt/linux/commits/rpi-4.4.y-qpu-execute that needed a user.
The other question I have is: For VC4 I was able to make my compiler emit code in such a way that the kernel could verify that we didn't exceed buffer bounds for a given set of buffers with a particular shader (texture samplers have a descriptor, uniform array access gets clamping on the addresses, uniform stream gets reset at branchpoints). Do you think this would be doable for your compiler to remove the root requirement?
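To make the "clamping on the addresses" idea concrete, here is a host-side C sketch of a clamped access: an out-of-range index is forced back into the buffer instead of being rejected. This is only an illustration of the general technique; the names `clamp_index` and `load_clamped` are hypothetical, and nothing here is taken from the vc4 kernel validator or from the VC4C compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: force an arbitrary index into [0, length - 1].
 * The buffer must be non-empty. */
size_t clamp_index(int64_t index, size_t length) {
    assert(length > 0);
    if (index < 0) {
        return 0;
    }
    if ((uint64_t)index >= length) {
        return length - 1;
    }
    return (size_t)index;
}

/* Hypothetical helper: a load that cannot leave the buffer, because the
 * index is clamped before it is used to form the address. */
uint32_t load_clamped(const uint32_t *buffer, size_t length, int64_t index) {
    return buffer[clamp_index(index, length)];
}
```

A verifier only has to check that every access goes through such a clamp with the correct buffer length, rather than reasoning about the values the index can take.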