clang-omp / clang
clang with OpenMP 3.1 and some elements of OpenMP 4.0 support
Home Page: clang-omp.github.com
License: Other
//===----------------------------------------------------------------------===//
// C Language Family Front-end
//===----------------------------------------------------------------------===//

Welcome to Clang. This is a compiler front-end for the C family of languages
(C, C++, Objective-C, and Objective-C++) which is built as part of the LLVM
compiler infrastructure project.

Unlike many other compiler frontends, Clang is useful for a number of things
beyond just compiling code: we intend for Clang to be host to a number of
different source-level tools. One example of this is the Clang Static Analyzer.

If you're interested in more (including how to build Clang) it is best to read
the relevant web sites. Here are some pointers:

Information on Clang:             http://clang.llvm.org/
Building and using Clang:         http://clang.llvm.org/get_started.html
Clang Static Analyzer:            http://clang-analyzer.llvm.org/
Information on the LLVM project:  http://llvm.org/

If you have questions or comments about Clang, a great place to discuss them is
on the Clang development mailing list:
  http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

If you find a bug in Clang, please file it in the LLVM bug tracker:
  http://llvm.org/bugs/
Hello:
Is it possible to link the object files generated with clang-omp against libgomp instead of libiomp5?
Hi, first of all congratulations on the incredible work you did!
For my thesis I have to transform each task pragma into a function. The problem is that I have to pass to this function the variables that are not declared inside the pragma, but I can easily access only the variables named in clauses. Looking at the code, I have seen that there is EmitOMPTaskDirective(), which should do exactly what I need.
When I parse the code using RecursiveASTVisitor I can access the OMPExecutableDirective and get the CapturedStmt, but EmitOMPTaskDirective uses a lot of functions from CodeGenFunction and CodeGenModule, and I don't know how to get access to these objects.
Could you please give me some hints about how to gain access to CodeGenModule, or tell me if there is another way to solve my problem?
Thank you very much!
On x86_64-apple-darwin, clang-omp currently shows an apparently bogus test suite failure:
FAIL: Clang :: OpenMP/target_driver_and_codegen.c (3899 of 7661)
******************** TEST 'Clang :: OpenMP/target_driver_and_codegen.c' FAILED ********************
Script:
--
gtimeout 1m /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/clang -### -fopenmp -omptargets=aaa-bbb-ccc-ddd /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c 2>&1 | /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/FileCheck -check-prefix=CHK-INVALID-TARGET /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c
gtimeout 1m /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/clang -### -fopenmp -omptargets= /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c 2>&1 | /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/FileCheck -check-prefix=CHK-EMPTY-OMPTARGETS /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c
gtimeout 1m /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/clang -### -fopenmp -omptargets=x86_64-apple-darwin /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c 2>&1 | /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/FileCheck -check-prefix=CHK-NO-SUPPORT /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c
gtimeout 1m /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/clang -### -fopenmp -target powerpc64-linux -omptargets=powerpc64-ibm-linux-gnu,nvptx64-nvidia-cuda /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c 2>&1 | /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/FileCheck -check-prefix=CHK-COMMANDS /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c
gtimeout 1m not /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/clang -cc1 -internal-isystem /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/bin/../lib/clang/3.5.0/include "-fopenmp" "-omptargets=powerpc64-ibm-linux-gnu,nvptx64-nvidia-cuda" "-triple" "powerpc64-ibm-linux-gnu" /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c 2>&1 | /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/FileCheck -check-prefix=CHK-MAINFILE /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c
gtimeout 1m not /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/clang -cc1 -internal-isystem /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/bin/../lib/clang/3.5.0/include "-fopenmp" "-omptargets=powerpc64-ibm-linux-gnu,nvptx64-nvidia-cuda" "-omp-target-mode" "-triple" "nvptx64-nvidia-cuda" /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c 2>&1 | /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/FileCheck -check-prefix=CHK-MAINFILE /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c
gtimeout 1m not /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/clang -cc1 -internal-isystem /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/bin/../lib/clang/3.5.0/include "-fopenmp" "-omptargets=powerpc64-ibm-linux-gnu,nvptx64-nvidia-cuda" "-triple" "powerpc64-ibm-linux-gnu" "-omp-main-file-path" "abcd.efgh" /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c 2>&1 | /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/FileCheck -check-prefix=CHK-MODULEID /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c
gtimeout 1m not /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/clang -cc1 -internal-isystem /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/bin/../lib/clang/3.5.0/include "-fopenmp" "-omptargets=powerpc64-ibm-linux-gnu,nvptx64-nvidia-cuda" "-omp-target-mode" "-triple" "nvptx64-nvidia-cuda" /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c "-omp-main-file-path" "abcd.efgh" 2>&1 | /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/FileCheck -check-prefix=CHK-MODULEID /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c
gtimeout 1m /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/clang -### -fopenmp -target powerpc64-linux -omptargets=nvptx64sm_35-nvidia-cuda /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c 2>&1 | /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/FileCheck -check-prefix=CHK-SUBTARGET /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c
gtimeout 1m /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/clang -S -emit-llvm -O0 -fopenmp -target powerpc64-linux -omptargets=powerpc64-ibm-linux-gnu,nvptx64-nvidia-cuda /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c 2>&1
gtimeout 1m /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/FileCheck -check-prefix=CHK-CODEGEN-HOST -input-file=target_driver_and_codegen.ll /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c
gtimeout 1m /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/FileCheck -check-prefix=CHK-CODEGEN-TARGET1 -input-file=target_driver_and_codegen.tgt-nvptx64-nvidia-cuda.ll /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c
gtimeout 1m /sw/src/fink.build/llvm35-3.5.0-0/build/stage3/./bin/FileCheck -check-prefix=CHK-CODEGEN-TARGET2 -input-file=target_driver_and_codegen.tgt-powerpc64-ibm-linux-gnu.ll /sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c
--
Exit Code: 1
Command Output (stderr):
--
/sw/src/fink.build/llvm35-3.5.0-0/cfe-3.5.0.src/test/OpenMP/target_driver_and_codegen.c:54:18: error: expected string not found in input
// CHK-COMMANDS: ld" {{.*}} "-o" "a.out" {{.*}} "[[HOSTOBJ]].o" "-liomp5" "-lomptarget" {{.*}} "-T" "[[LKSCRIPT:.+]].lk"
^
<stdin>:13:255: note: scanning from here
"/usr/bin/ld" "--eh-frame-hdr" "-m" "elf64ppc" "-shared" "-o" "/var/tmp/target_driver_and_codegen-b9f35e.so" "crti.o" "crtbeginS.o" "-L/sw/src/fink.build/llvm35-3.5.0-0/build/stage3/bin/../lib" "-L/usr/lib" "/var/tmp/target_driver_and_codegen-109979.o" "-L/sw/opt/llvm-3.5/lib" "-liomp5" "-lomptarget" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "-lpthread" "-lc" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "crtendS.o" "crtn.o"
^
<stdin>:13:255: note: with variable "HOSTOBJ" equal to "/var/tmp/target_driver_and_codegen-489c39"
"/usr/bin/ld" "--eh-frame-hdr" "-m" "elf64ppc" "-shared" "-o" "/var/tmp/target_driver_and_codegen-b9f35e.so" "crti.o" "crtbeginS.o" "-L/sw/src/fink.build/llvm35-3.5.0-0/build/stage3/bin/../lib" "-L/usr/lib" "/var/tmp/target_driver_and_codegen-109979.o" "-L/sw/opt/llvm-3.5/lib" "-liomp5" "-lomptarget" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "-lpthread" "-lc" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "crtendS.o" "crtn.o"
^
<stdin>:14:249: note: possible intended match here
"/usr/bin/ld" "--eh-frame-hdr" "-m" "elf64ppc" "-dynamic-linker" "/lib64/ld64.so.1" "-o" "a.out" "/usr/lib/crt1.o" "crti.o" "crtbegin.o" "-L/sw/src/fink.build/llvm35-3.5.0-0/build/stage3/bin/../lib" "-L/usr/lib" "/var/tmp/target_driver_and_codegen-489c39.o" "-L/sw/opt/llvm-3.5/lib" "-liomp5" "-lomptarget" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "-lpthread" "-lc" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "crtend.o" "crtn.o" "-T" "/var/tmp/a-93ab58.lk"
^
--
********************
Any ideas on how to suppress this on x86_64-apple-darwin?
Hi,
I am working on some templated code and I cannot get the captured variables.
I used to iterate over capture_begin() and capture_end() of the captured statement, but the range appears to be empty inside a template function. Is there another interface specific to templates?
Best,
Pierrick
Do the semantics of the omp simd construct allow us to add the 'llvm.mem.parallel_loop_access' metadata to the memory accesses in a simd parallel loop? I think this is equivalent to having an infinite safelen.
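For readers unfamiliar with the clause, here is a minimal C sketch of what `safelen` expresses at the source level. The function names and the dependence distance of 4 are illustrative assumptions, not code from clang-omp, and the sketch says nothing about which metadata the compiler actually emits. Compiled without -fopenmp, the pragmas are ignored and both loops simply run serially.

```c
#include <assert.h>
#include <stddef.h>

/* No loop-carried dependence: each iteration touches only its own element,
 * so a plain `omp simd` (no safelen clause) places no upper bound on how
 * many iterations may be executed together. */
void scale(double *dst, const double *src, double factor, size_t n) {
    assert(dst != NULL && src != NULL);
    #pragma omp simd
    for (size_t i = 0; i < n; ++i)
        dst[i] = factor * src[i];
}

/* Iteration i reads the element written by iteration i - 4, so no more than
 * 4 consecutive iterations may be executed together: safelen(4). */
void running_offset(double *a, size_t n) {
    assert(a != NULL);
    #pragma omp simd safelen(4)
    for (size_t i = 4; i < n; ++i)
        a[i] = a[i - 4] + 1.0;
}
```

Only the first loop is the kind for which the "infinite safelen" reading in the question would apply; the second one needs the bound to stay correct.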
Hi all,
I just tried this piece of code (called test.c):
int main() {
int tid = 0;
tid = omp_get_thread_num();
printf("Hello from thread %d, nthreads %d\n", tid, omp_get_num_threads());
}
}
I compiled it with the OpenMP clang, and the screen keeps displaying
test.c:10:2: warning: extra tokens at the end of '#pragma omp parallel' are
ignored [-Wextra-tokens]
and never finishes.
The command line I was using is:
clang -c -Wall -O3 -fopenmp -emit-llvm -I../common -I/home/bo/work/libomp_oss/exports/linux64/include test.c -o test.bc
Could anyone help me here? Thanks in advance!
btw, "#pragma omp parallel for" works.
Bo
in ParseOpenMP.cpp:
/// flush-clause:
/// '(' list ')'
should read:
/// flush-clause:
/// 'flush' '(' list ')'
correct?
Hi,
I noticed that for code like:
#pragma omp parallel for num_threads(16) firstprivate(n) lastprivate(n)
for(...)
{
}
the clauses are not handled correctly.
First, parallel and for are split. That may not be an issue in itself, but clauses are forwarded to both directives when both accept them. In the case above, this means:
#pragma omp parallel firstprivate(n) num_threads(16)
#pragma omp for firstprivate(n) lastprivate(n)
for(...)
{
}
Notice that n is firstprivate twice, so lastprivate is applied to the n of the parallel construct and not to the global one.
I don't know how this is handled in CodeGen, but it is tricky when clang is used as a library.
Thanks,
Pierrick
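As a point of comparison for the report above, here is a small C sketch of the behaviour one would expect from the combined directive under the usual OpenMP reading: each private n starts from the original value, and the original n receives the value from the sequentially last iteration. The function name `add_last_index` is hypothetical. Only the last iteration modifies n, so the result is the same whether or not the file is compiled with -fopenmp.

```c
#include <assert.h>

/* Returns n after the loop. firstprivate(n) seeds every private copy with
 * the caller's value; lastprivate(n) copies the last iteration's n back. */
int add_last_index(int n, int count) {
    assert(count > 0); /* lastprivate needs a last iteration to copy from */
    #pragma omp parallel for num_threads(16) firstprivate(n) lastprivate(n)
    for (int i = 0; i < count; ++i) {
        if (i == count - 1)
            n += i; /* private n still holds the firstprivate initial value */
    }
    return n;
}
```

With this reading, `add_last_index(10, 5)` yields 14. If lastprivate were applied to a copy that never reaches the variable outside the parallel region, the function would return 10 instead, which is the kind of discrepancy described in the report.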
For this simple STL helloworld:
#include <iostream>
#include <vector>
int main() {
std::vector<int> vec;
std::cout << vec.size() << std::endl;
return 0;
}
It compiles fine with the default clang compiler:
/usr/bin/clang++ clang.cpp
But if I compile it with clang-omp, I get this error:
/usr/local/bin/clang++ clang.cpp
clang.cpp:1:10: fatal error: 'iostream' file not found
#include <iostream>
^
1 error generated.
I know I am missing the STL library; but what's the proper way to proceed?
It is nice that you provide the source code for your changes. However, it is completely detached from the mainline LLVM code repository, so it is hard to track changes or to rebase it onto current LLVM trunk.
Therefore, please provide a Git repo branched from either https://github.com/llvm-mirror or http://llvm.org/git/llvm.git.
Compiling this with clang-omp:
void tuned_STREAM_Scale(STREAM_TYPE scalar)
{
    ssize_t j;
#pragma omp parallel for
    for (j=0; j<STREAM_ARRAY_SIZE; j++)
        b[j] = scalar*c[j];
}
results in IR that looks like this for the main loop:
omp.lb_ub.check_pass: ; preds = %omp.lb.le.global_ub.
%17 = load double* %ref3, align 8, !tbaa !6
%18 = load i64* %j.private., align 8, !tbaa !8
%arrayidx = getelementptr inbounds [10000000 x double]* @c, i32 0, i64 %18
%19 = load double* %arrayidx, align 8, !tbaa !6
%mul5 = fmul double %17, %19
%20 = load i64* %j.private., align 8, !tbaa !8
%arrayidx6 = getelementptr inbounds [10000000 x double]* @b, i32 0, i64 %20
store double %mul5, double* %arrayidx6, align 8, !tbaa !6
br label %omp.cont.block
Please note that the load of the captured parameter corresponding to 'scalar' in the original source:
%17 = load double* %ref3, align 8, !tbaa !6
is performed in each loop iteration. Just as with the other loads that needed hoisting in issue #27, this load also needs to be hoisted.
The latest commit has introduced a regression. I can no longer compile my OpenMP n-body test:
$ git clone https://github.com/tycho/cudahandbook.git
Cloning into 'cudahandbook'...
remote: Counting objects: 2083, done.
remote: Compressing objects: 100% (889/889), done.
remote: Total 2083 (delta 1317), reused 1883 (delta 1178)
Receiving objects: 100% (2083/2083), 900.26 KiB | 730.00 KiB/s, done.
Resolving deltas: 100% (1317/1317), done.
Checking connectivity... done
$ cd cudahandbook/nbody
$ make NO_CUDA=1 CXX=clang++ V=1
* rebuilding nbody: new build flags or prefix
clang++ -DUSE_OPENMP -DHAVE_SSE -DNO_CUDA -O3 -ffast-math -fno-strict-aliasing -fopenmp -pthread -xc++ -I../chLib -c -o nbody.o nbody.cu
clang++ -DUSE_OPENMP -DHAVE_SSE -DNO_CUDA -O3 -ffast-math -fno-strict-aliasing -fopenmp -pthread -I../chLib -c -o nbody_CPU_AOS.o nbody_CPU_AOS.cpp
clang++ -DUSE_OPENMP -DHAVE_SSE -DNO_CUDA -O3 -ffast-math -fno-strict-aliasing -fopenmp -pthread -I../chLib -c -o nbody_CPU_AOS_tiled.o nbody_CPU_AOS_tiled.cpp
nbody_CPU_AOS_tiled.cpp:157:18: error: variable 'symmetricX' with variably modified type cannot be captured
force[3*j+0] += symmetricX[_j];
^
nbody_CPU_AOS_tiled.cpp:106:11: note: 'symmetricX' declared here
float symmetricX[nTile];
^
nbody_CPU_AOS_tiled.cpp:159:18: error: variable 'symmetricY' with variably modified type cannot be captured
force[3*j+1] += symmetricY[_j];
^
nbody_CPU_AOS_tiled.cpp:107:11: note: 'symmetricY' declared here
float symmetricY[nTile];
^
nbody_CPU_AOS_tiled.cpp:161:18: error: variable 'symmetricZ' with variably modified type cannot be captured
force[3*j+2] += symmetricZ[_j];
^
nbody_CPU_AOS_tiled.cpp:108:11: note: 'symmetricZ' declared here
float symmetricZ[nTile];
^
3 errors generated.
make: *** [nbody_CPU_AOS_tiled.o] Error 1
Note that this same code built fine under the revision before this ("Fixed DSA processing"), and still builds/runs correctly under GCC with OpenMP enabled.
Any ideas what's going on? This is the relevant function:
static void
DoNondiagonalTile(
    size_t nTile,
    float *force,
    float const * const posMass,
    float softeningSquared,
    size_t iTile, size_t jTile
)
{
    float symmetricX[nTile];
    float symmetricY[nTile];
    float symmetricZ[nTile];
    memset( symmetricX, 0, sizeof(symmetricX) );
    memset( symmetricY, 0, sizeof(symmetricY) );
    memset( symmetricZ, 0, sizeof(symmetricZ) );
    for ( size_t _i = 0; _i < nTile; _i++ )
    {
        const size_t i = iTile*nTile+_i;
        float ax = 0.0f, ay = 0.0f, az = 0.0f;
        const float myX = posMass[i*4+0];
        const float myY = posMass[i*4+1];
        const float myZ = posMass[i*4+2];
        for ( size_t _j = 0; _j < nTile; _j++ ) {
            const size_t j = jTile*nTile+_j;
            float fx, fy, fz;
            const float bodyX = posMass[j*4+0];
            const float bodyY = posMass[j*4+1];
            const float bodyZ = posMass[j*4+2];
            const float bodyMass = posMass[j*4+3];
            bodyBodyInteraction<float>(
                &fx, &fy, &fz,
                myX, myY, myZ,
                bodyX, bodyY, bodyZ, bodyMass,
                softeningSquared );
            ax += fx;
            ay += fy;
            az += fz;
            symmetricX[_j] -= fx;
            symmetricY[_j] -= fy;
            symmetricZ[_j] -= fz;
        }
        #pragma omp atomic update
        force[3*i+0] += ax;
        #pragma omp atomic update
        force[3*i+1] += ay;
        #pragma omp atomic update
        force[3*i+2] += az;
    }
    for ( size_t _j = 0; _j < nTile; _j++ ) {
        const size_t j = jTile*nTile+_j;
        #pragma omp atomic update
        force[3*j+0] += symmetricX[_j];
        #pragma omp atomic update
        force[3*j+1] += symmetricY[_j];
        #pragma omp atomic update
        force[3*j+2] += symmetricZ[_j];
    }
}
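One possible source-level workaround, sketched here in C and not verified against this clang-omp revision, is to read the variable-length array through a plain pointer inside the atomic statement. The assumption is that the diagnostic concerns only variables whose own type is variably modified, which `float *` is not. The function `add_tile_contribution` and its `value` parameter are hypothetical reductions of the second loop above.

```c
#include <assert.h>
#include <stddef.h>

/* Adds `value` to the x component of every body in tile jTile. */
void add_tile_contribution(size_t nTile, float *force, size_t jTile, float value) {
    assert(nTile > 0);                 /* a VLA must have a positive size */
    float symmetricX[nTile];           /* variably modified type */
    for (size_t k = 0; k < nTile; ++k)
        symmetricX[k] = value;

    float *sx = symmetricX;            /* plain pointer alias of the VLA */
    for (size_t _j = 0; _j < nTile; ++_j) {
        const size_t j = jTile * nTile + _j;
        #pragma omp atomic update
        force[3 * j + 0] += sx[_j];    /* refers to sx, not to the VLA itself */
    }
}
```

The arithmetic is unchanged; the only difference is which variable the statement under the atomic pragma names.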
The validation suite results reported on clang-omp's homepage say:
OpenMP Validation Suite by OpenUH Research Compiler - passed 119 tests of 123
On my machine, however, only 118 tests pass, one fewer than reported above. I want to figure out which test failed unexpectedly, so could anyone attach validation results with 119 passed tests here?
Thanks!
--------------------------------- my results.txt ---------------------------
#Tested Directive t ct ot oct
has_openmp 100 100 100 100
omp_atomic 100 95 100 85
omp_barrier 100 100 100 100
omp_critical 100 0 100 15
omp_flush 100 100 100 100
omp_for_firstprivate 100 100 100 100
omp_for_lastprivate 100 100 100 85
omp_for_ordered 100 100 100 100
omp_for_private 100 100 100 100
omp_for_reduction 100 100 100 100
omp_for_schedule_dynamic 100 100 100 100
omp_for_schedule_guided 100 100 100 100
omp_for_schedule_static 100 100 100 100
omp_for_nowait 100 100 100 100
omp_get_num_threads 100 100 100 100
omp_get_wtick 100 100 100 100
omp_get_wtime 100 100 100 100
omp_in_parallel 100 100 100 100
omp_lock 100 100 100 100
omp_master 100 100 100 100
omp_nest_lock 100 100 100 100
omp_parallel_copyin 100 100 100 100
omp_parallel_for_firstprivate 100 100 100 100
omp_parallel_for_lastprivate 100 100 100 100
omp_parallel_for_ordered 100 100 100 100
omp_parallel_for_private 100 100 100 100
omp_parallel_for_reduction 100 100 100 100
omp_parallel_num_threads 100 100 100 100
omp_parallel_sections_firstprivate 100 100 100 100
omp_parallel_sections_lastprivate 100 100 100 100
omp_parallel_sections_private 100 100 100 100
omp_parallel_sections_reduction 100 15 100 10
omp_section_firstprivate 100 100 100 100
omp_section_lastprivate 100 100 100 100
omp_section_private 100 100 100 100
omp_sections_reduction 100 30 100 45
omp_sections_nowait 100 100 100 100
omp_parallel_for_if 100 100 100 100
omp_single_copyprivate 100 100 100 100
omp_single_nowait 100 100 100 100
omp_single_private 100 100 100 100
omp_single 100 100 100 100
omp_test_lock 100 100 100 100
omp_test_nest_lock 100 100 100 100
omp_threadprivate 100 100 - -
omp_parallel_default 100 100 100 100
omp_parallel_shared 100 100 100 100
omp_parallel_private 100 100 100 100
omp_parallel_firstprivate 100 100 100 100
omp_parallel_if 100 100 100 100
omp_parallel_reduction 100 100 100 100
omp_for_collapse 100 100 100 100
omp_master_3 100 100 100 100
omp_task 100 100 100 100
omp_task_if 100 100 100 100
omp_task_untied 0 - 0 -
omp_task_shared 100 100 100 100
omp_task_private 100 100 100 100
omp_task_firstprivate 100 100 100 100
omp_taskwait 100 100 100 100
omp_taskyield 100 139 10 -
omp_task_final 0 - 0 -
The following code triggers an assertion failure:
void f(void) {
double data[1][1];
long one = 1L;
typedef double (*d_t)[one];
d_t r = data;
#pragma omp task depend(out: r[0][0])
r[0][0] = 0.0;
}
I believe the abort happens when the current token is the first '[' in the depend clause. The message is the following:
clang: .../include/clang/AST/Type.h:547: const clang::ExtQualsTypeCommonBase *clang::QualType::getCommonPtr() const: Assertion `!isNull() && "Cannot retrieve a NULL type pointer"' failed.
0 clang 0x00000000045d84fe llvm::sys::PrintStackTrace(_IO_FILE*) + 46
1 clang 0x00000000045d87bb
2 clang 0x00000000045d8a2e
3 libpthread.so.0 0x00007ffff79b85d0
4 libc.so.6 0x00007ffff68b9945 gsignal + 53
5 libc.so.6 0x00007ffff68baf21 abort + 385
6 libc.so.6 0x00007ffff68b2810 __assert_fail + 240
7 clang 0x000000000174d775 clang::QualType::getCommonPtr() const + 69
8 clang 0x000000000174d395 clang::QualType::getTypePtr() const + 21
9 clang 0x0000000001754155 clang::QualType::operator->() const + 21
10 clang 0x0000000001febddd clang::Sema::tryCaptureVariable(clang::VarDecl*, clang::SourceLocation, clang::Sema::TryCaptureKind, clang::SourceLocation, bool, clang::QualType&, clang::QualType&, unsigned int const*) + 1997
11 clang 0x0000000001fc2c45 clang::Sema::getCapturedDeclRefType(clang::VarDecl*, clang::SourceLocation) + 165
12 clang 0x0000000001fc2536 clang::Sema::BuildDeclarationNameExpr(clang::CXXScopeSpec const&, clang::DeclarationNameInfo const&, clang::NamedDecl*, clang::NamedDecl*, clang::TemplateArgumentListInfo const*) + 1638
13 clang 0x0000000001fc0e98 clang::Sema::BuildDeclarationNameExpr(clang::CXXScopeSpec const&, clang::LookupResult&, bool) + 168
14 clang 0x0000000001e9332d clang::Sema::ClassifyName(clang::Scope*, clang::CXXScopeSpec&, clang::IdentifierInfo*&, clang::SourceLocation, clang::Token const&, bool, clang::CorrectionCandidateCallback*) + 5389
15 clang 0x0000000001d2bf78 clang::Parser::TryAnnotateName(bool, clang::CorrectionCandidateCallback*) + 1000
16 clang 0x0000000001da736e clang::Parser::ParseStatementOrDeclarationAfterAttributes(llvm::SmallVector<clang::Stmt*, 32u>&, bool, clang::SourceLocation*, clang::Parser::ParsedAttributesWithRange&) + 1070
17 clang 0x0000000001da6e25 clang::Parser::ParseStatementOrDeclaration(llvm::SmallVector<clang::Stmt*, 32u>&, bool, clang::SourceLocation*) + 133
18 clang 0x0000000001da6d29 clang::Parser::ParseStatement(clang::SourceLocation*) + 89
19 clang 0x0000000001d9d15e clang::Parser::ParseOpenMPDeclarativeOrExecutableDirective(bool) + 3342
20 clang 0x0000000001da7b25 clang::Parser::ParseStatementOrDeclarationAfterAttributes(llvm::SmallVector<clang::Stmt*, 32u>&, bool, clang::SourceLocation*, clang::Parser::ParsedAttributesWithRange&) + 3045
21 clang 0x0000000001da6e25 clang::Parser::ParseStatementOrDeclaration(llvm::SmallVector<clang::Stmt*, 32u>&, bool, clang::SourceLocation*) + 133
22 clang 0x0000000001dadd8a clang::Parser::ParseCompoundStatementBody(bool) + 1418
23 clang 0x0000000001db0e4f clang::Parser::ParseFunctionStatementBody(clang::Decl*, clang::Parser::ParseScope&) + 319
24 clang 0x0000000001d2b0dc clang::Parser::ParseFunctionDefinition(clang::ParsingDeclarator&, clang::Parser::ParsedTemplateInfo const&, clang::Parser::LateParsedAttrList*) + 3708
25 clang 0x0000000001d41383 clang::Parser::ParseDeclGroup(clang::ParsingDeclSpec&, unsigned int, bool, clang::SourceLocation*, clang::Parser::ForRangeInit*) + 467
26 clang 0x0000000001d2a24f clang::Parser::ParseDeclOrFunctionDefInternal(clang::Parser::ParsedAttributesWithRange&, clang::ParsingDeclSpec&, clang::AccessSpecifier) + 1215
27 clang 0x0000000001d299d3 clang::Parser::ParseDeclarationOrFunctionDefinition(clang::Parser::ParsedAttributesWithRange&, clang::ParsingDeclSpec*, clang::AccessSpecifier) + 147
28 clang 0x0000000001d2916e clang::Parser::ParseExternalDeclaration(clang::Parser::ParsedAttributesWithRange&, clang::ParsingDeclSpec*) + 3502
29 clang 0x0000000001d28358 clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&) + 616
30 clang 0x0000000001d241d6 clang::ParseAST(clang::Sema&, bool, bool) + 326
31 clang 0x00000000017982c9 clang::ASTFrontendAction::ExecuteAction() + 345
32 clang 0x0000000001a6cfee clang::CodeGenAction::ExecuteAction() + 1246
33 clang 0x0000000001797def clang::FrontendAction::Execute() + 191
34 clang 0x0000000001763c10 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 800
35 clang 0x0000000001728dc8 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 1048
36 clang 0x000000000171804a cc1_main(char const**, char const**, char const*, void*) + 698
37 clang 0x0000000001723024 main + 772
38 libc.so.6 0x00007ffff68a5bc6 __libc_start_main + 230
39 clang 0x0000000001717509
Stack dump:
0. Program arguments: .../bin/clang -cc1 -fopenmp -triple x86_64-unknown-linux-gnu -emit-obj -mrelax-all -disable-free -main-file-name bug5.c -mrelocation-model static -mdisable-fp-elim -fmath-errno -masm-verbose -mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu x86-64 -target-linker-version 2.23.2 -coverage-file .../bug5.o -resource-dir .../bin/../lib/clang/3.4 -I... -c-isystem ... -cxx-isystem ... -internal-isystem ... -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdebug-compilation-dir ... -ferror-limit 19 -fmessage-length 185 -mstackrealign -fobjc-runtime=gcc -fdiagnostics-show-option -fcolor-diagnostics -vectorize-slp -o bug5.o -x c bug5.c
1. bug5.c:9:2: current parser token 'r'
2. bug5.c:1:14: parsing function body 'f'
3. bug5.c:1:14: in compound statement ('{}')
clang: error: unable to execute command: Aborted
clang: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 3.4 (https://github.com/clang-omp/clang 467c95dbd1ef080ff4672d10f164367a52b90339) (https://github.com/clang-omp/llvm 92414b1167a33c0ab9187c72b098a54ecbffc15c)
Target: x86_64-unknown-linux-gnu
Thread model: posix
BTW, I could find neither an openmp branch nor an openmp component in the main LLVM Bugzilla, so I hope this is the right place to submit the bug report.
The project is lacking a mailing list, or a support forum, where interested users (like me 😄 ) can ask questions like: what is the roadmap for actually integrating OpenMP support in official Clang releases?
Building current clang-omp by taking a diff of clang-omp at commit f9e2fd7 against the stock clang 3.4 sources from llvm.org and applying it to the clang 3.4.1 sources works fine on x86_64-apple-darwin11/12, using the llvm34 four-stage bootstrap build of llvm/compiler-rt/clang/polly. However, the resulting build exhibits the following segfault in the OpenMP3.1_Validation test suite (with CC = /sw/opt/llvm-3.4/bin/clang set in the Makefile).
Testing for "omp_taskyield":
Generating sources .............. success
Compiling soures ................ success
Running test with 8 threads ..... success ...sh: line 1: 38338 Segmentation fault: 11 ./bin/c/ctest_omp_taskyield > bin/c/ctest_omp_taskyield.out
... and verified with 139% certainty
This failure is observed on both darwin11 with openmp built with clang from Xcode 4.6.3 and darwin12 with openmp built with clang from Xcode 5.1.1.
On darwin11, the segfault backtraces as:
howarth% lldb ./ctest_omp_taskyield
Current executable set to './ctest_omp_taskyield' (x86_64).
(lldb) r
Process 38490 launched: './ctest_omp_taskyield' (x86_64)
Testing omp taskyield
Process 38490 stopped
stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
frame #0: 0x00007fff944cffa6 libsystem_c.dylib`vfprintf_l + 94
frame #1: fprintf + 168
frame #2: 0x00000001000017d7 ctest_omp_taskyield`main + 167
The summary of the test suite on x86_64-apple-darwin11 for clang-omp merged into clang 3.4.1 shows:
Summary:
S Number of tested Open MP constructs: 62
S Number of used tests: 123
S Number of failed tests: 7
S Number of successful tests: 116
S + from this were verified: 88
Normal tests:
N Number of failed tests: 4
N + from this fail compilation: 0
N + from this timed out 0
N Number of successful tests: 58
N + from this were verified: 43
Orphaned tests:
O Number of failed tests: 3
O + from this fail compilation: 0
O + from this timed out 0
O Number of successful tests: 58
O + from this were verified: 45
I have built clang-omp successfully on my Mac, and I would like to use it in Xcode 5.1.1. It would be nice if instructions for this could be shown on the main webpage. Thanks for the great work; I hope it can be integrated into clang!
Consider the following nifty testcase:
int main() {
long nthreads = 4;
#pragma omp parallel num_threads(nthreads)
{
}
}
This currently crashes the IRgen. Apparently, bitcasts are missed somewhere :)
The code in TreeTransform::TransformOMPNumTeamsClause calls getDerived().RebuildOMPSimdlenClause -- this seems wrong.
template <typename Derived>
OMPClause *
TreeTransform<Derived>::TransformOMPNumTeamsClause(OMPNumTeamsClause *C) {
// Transform the number-of-teams expession.
ExprResult E = getDerived().TransformExpr(C->getNumTeams());
if (E.isInvalid())
return 0;
return getDerived().RebuildOMPSimdlenClause(E.take(),
C->getLocStart(),
C->getLocEnd());
}
Here's a simple example:
$ cat /tmp/o.c
static double x[5000];
void foo(double *y) {
for (int i = 0; i < 5000; ++i)
x[i] = y[i];
}
double bar(int i) {
return x[i];
}
Without -fopenmp, LLVM can vectorize this loop. With -fopenmp the code quality is horrible (it is bad even for scalar code). There are three reasons why. Here's the "optimized" loop body:
omp.lb_ub.check_pass: ; preds = %omp.loop.init, %omp.lb_ub.check_pass
%.idx..014 = phi i32 [ %.next.idx., %omp.lb_ub.check_pass ], [ %7, %omp.loop.init ]
%rem = srem i32 %.idx..014, 5000
%idxprom = sext i32 %rem to i64
%ref = load double*** %6, align 8, !tbaa !5
%9 = load double** %ref, align 8, !tbaa !1
%arrayidx = getelementptr inbounds double* %9, i64 %idxprom
%10 = load double* %arrayidx, align 8, !tbaa !6
%arrayidx5 = getelementptr inbounds [5000 x double]* @x, i64 0, i64 %idxprom
store double %10, double* %arrayidx5, align 8, !tbaa !6
%.next.idx. = add i32 %.idx..014, 1
%omp.idx.le.ub = icmp sgt i32 %.next.idx., %8
br i1 %omp.idx.le.ub, label %omp.loop.fini, label %omp.lb_ub.check_pass
Captured parameter dereferencing is done inside the loop.
%ref = load double*** %6, align 8, !tbaa !5
%9 = load double** %ref, align 8, !tbaa !1
This must be done outside the loop. The optimizer has no way of knowing that the ability to dereference these pointers is not conditioned on the loop executing (on the conditional check in the omp.loop.init: block). As a result, it cannot hoist them, and so nothing else happens. The fact that these pointers can always be dereferenced is semantic information that only the frontend has (since that is where the outlining is done), and we must hoist these things during Clang's codegen.
The remainder calculation in the loop:
%rem = srem i32 %.idx..014, 5000
Why is this here? Does the runtime require it? (It needs to be fixed regardless.)
The loop increment. Without -fopenmp, clang generates:
for.inc: ; preds = %for.body
%5 = load i32* %i, align 4, !tbaa !5
%inc = add nsw i32 %5, 1
store i32 %inc, i32* %i, align 4, !tbaa !5
br label %for.cond
Notice the nsw flag on the loop increment. Having the nsw flag is helpful for later loop analysis. With -fopenmp, clang generates:
omp.cont.block: ; preds = %omp.lb_ub.check_pass
%.idx.6 = load i32* %.idx.
%.next.idx. = add i32 %.idx.6, 1
store i32 %.next.idx., i32* %.idx.
br label %omp.loop.main
and our nsw flag has disappeared.
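The first point can be pictured at the source level with a short C sketch. The `struct captures` record and both function names are hypothetical stand-ins for whatever clang actually passes to the outlined region; the real layout is an implementation detail. The sketch only illustrates the difference between reloading the captured pointer on every iteration and loading it once before the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of captured variables handed to the outlined region. */
struct captures {
    double **y_ref; /* address of the captured pointer parameter `y` */
};

static double x[5000];

/* Shape of the IR above: both loads are repeated on every iteration. */
void body_unhoisted(const struct captures *cap, int lb, int ub) {
    assert(cap != NULL && cap->y_ref != NULL);
    for (int i = lb; i <= ub; ++i)
        x[i] = (*cap->y_ref)[i];
}

/* Shape after hoisting: the captured pointer is loaded once, up front.
 * This is safe even when lb > ub, because the capture record is always
 * valid on entry; that is the knowledge only the frontend has. */
void body_hoisted(const struct captures *cap, int lb, int ub) {
    assert(cap != NULL && cap->y_ref != NULL);
    double *y = *cap->y_ref;
    for (int i = lb; i <= ub; ++i)
        x[i] = y[i];
}
```

Both functions compute the same result; the second form leaves the loop body with a single load and a single store per iteration, which is the shape the vectorizer needs.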
Hello,
I've just followed the instructions on http://clang-omp.github.io/ to compile a version of clang with OpenMP support. Using CMake to build it, I ran into this error:
-- Found Subversion: /usr/bin/svn (found version "1.7.10")
CMake Error at cmake/modules/LLVMProcessSources.cmake:89 (message):
Found unknown source file
/Users/myself/tmp/llvm/tools/clang/lib/CodeGen/CGLoopInfo.cpp
Please update
/Users/myself/tmp/llvm/tools/clang/lib/CodeGen/CMakeLists.txt
Call Stack (most recent call first):
cmake/modules/LLVMProcessSources.cmake:42 (llvm_check_source_file_list)
tools/clang/CMakeLists.txt:238 (llvm_process_sources)
tools/clang/lib/CodeGen/CMakeLists.txt:12 (add_clang_library)
But no sign of 'CGLoopInfo.cpp' anywhere, and llvm/tools/clang/lib/CodeGen/CMakeLists.txt doesn't seem to mention it either.
Is there something I am doing wrong?
Thank you.
Hi,
We've been working on updating our tool to the latest clang-omp version, and we encountered a weird bug.
It's reproducible with the simple example provided by clang's documentation:
// Declares clang::SyntaxOnlyAction.
#include "clang/Frontend/FrontendActions.h"
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/Tooling.h"
// Declares llvm::cl::extrahelp.
#include "llvm/Support/CommandLine.h"
using namespace clang::tooling;
using namespace llvm;
// Apply a custom category to all command-line options so that they are the
// only ones displayed.
static cl::OptionCategory MyToolCategory("my-tool options");
// CommonOptionsParser declares HelpMessage with a description of the common
// command-line options related to the compilation database and input files.
// It's nice to have this help message in all tools.
static cl::extrahelp CommonHelp(CommonOptionsParser::HelpMessage);
// A help message for this specific tool can be added afterwards.
static cl::extrahelp MoreHelp("\nMore help text...");
int main(int argc, const char **argv) {
CommonOptionsParser OptionsParser(argc, argv, MyToolCategory);
ClangTool Tool(OptionsParser.getCompilations(),
OptionsParser.getSourcePathList());
return Tool.run(newFrontendActionFactory<clang::SyntaxOnlyAction>().get());
}
Any compiler can be used to build it, since the bug occurs at runtime. For example:
g++ simple_tool.cpp -std=c++11 -lclangFrontendTool -lclangFrontend -lclangDriver -lclangSerialization -lclangCodeGen -lclangParse -lclangSema -lclangStaticAnalyzerFrontend -lclangStaticAnalyzerCheckers -lclangStaticAnalyzerCore -lclangAnalysis -lclangARCMigrate -lclangRewriteFrontend -lclangRewriteCore -lclangEdit -lclangAST -lclangLex -lclangBasic -lclangTooling `llvm-config --cxxflags --ldflags --libs all --system-libs` -fno-rtti
Given this very simple file
int main()
{
return 0;
}
I assume the execution should not fail with or without -fopenmp; however, this is what I get:
$ ./a.out foo.c --
$ echo $?
0
$ ./a.out foo.c -- -fopenmp
Unable to create unique ID to input file - invalid input file status??
UNREACHABLE executed at /home/fifi/inria/compilateur/openmpclang/llvm/tools/clang/lib/Driver/Tools.cpp:2597!
zsh: abort ./a.out foo.c -- -fopenmp
I've looked a bit into it, CommonOptionsParser is actually trying to open "placeholder.cpp", which is a dummy file added by the CompilationDatabase stuff to the compiler arguments.
I don't know why adding this particular flag leads to this error, but it's definitely a regression since it did work before the merge of 3.5.
I would appreciate a reasonably well updated statement either here in the README.md or on the main website about what the timeline is in getting OpenMP support merged into mainline LLVM/Clang. Thanks.
clang-omp doesn't seem to be defining _OPENMP with -fopenmp on the command line. Is this expected?
Source code used to reproduce: it "chokes" with -fopenmp alone, but does not choke if -D_OPENMP is also given on the command line:
#ifndef _OPENMP
choke me
#endif
#include <omp.h>
int main () { return omp_get_num_threads (); }
Example:
/usr/local/bin/clang-3.5omp -fopenmp -I/usr/local/include/openmp -o conftest ./conf_omp_test.c
./conf_omp_test.c:2:4: error: unknown type name 'choke'
choke me
^
./conf_omp_test.c:2:12: error: expected ';' after top level declarator
choke me
^
;
2 errors generated.
/usr/local/bin/clang-3.5omp -fopenmp -I/usr/local/include/openmp -D_OPENMP -o conftest ./conf_omp_test.c
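Until the macro is defined by the compiler, code can be written so that it builds either way. The sketch below is a minimal example of that guard pattern; the helper names `openmp_version` and `max_threads` are hypothetical, while `_OPENMP` and `omp_get_max_threads` are the standard OpenMP names.

```c
#ifdef _OPENMP
#include <omp.h> /* only pulled in when the compiler announces OpenMP */
#endif

/* Returns the value of _OPENMP (a yyyymm date such as 201107), or 0 when
   the compiler did not define the macro. */
long openmp_version(void) {
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0;
#endif
}

/* Returns the maximum team size the runtime would use, or 1 when the
   translation unit was built without OpenMP. */
int max_threads(void) {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

Printing `openmp_version()` from a test program is a quick way to tell whether -fopenmp actually took effect for a given compiler invocation.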
In the following code, xrange_iterator is a random access iterator, but clang++ states the opposite because of the auto keyword.
Turning auto into xrange_iterator solves the issue. The same problem happens when the type of the index variable comes from a nested type specifier (typename foo<T>::bar) or from a decltype.
#include <iterator>
#include <algorithm>
/* xrange */
struct xrange_iterator : std::iterator< std::random_access_iterator_tag, long >{
long value;
long step;
long sign;
xrange_iterator() {}
xrange_iterator(long v, long s) : value(v), step(s), sign(s<0?-1:1) {}
long operator*() const { return value; }
long operator[](long n) const { return value + step *n; }
xrange_iterator& operator++() { value+=step; return *this; }
xrange_iterator operator++(int) { xrange_iterator self(*this); value+=step; return self; }
xrange_iterator& operator+=(long n) { value+=step*n; return *this; }
xrange_iterator& operator--() { value-=step; return *this; }
xrange_iterator operator--(int) { xrange_iterator self(*this); value-=step; return self; }
xrange_iterator& operator-=(long n) { value-=step*n; return *this; }
bool operator!=(xrange_iterator const& other) const { return value != other.value; }
bool operator==(xrange_iterator const& other) const { return value == other.value; }
bool operator<(xrange_iterator const& other) const { return sign*value < sign*other.value; }
bool operator<=(xrange_iterator const& other) const { return sign*value <= sign*other.value; }
bool operator>(xrange_iterator const& other) const { return sign*value > sign*other.value; }
bool operator>=(xrange_iterator const& other) const { return sign*value >= sign*other.value; }
long operator-(xrange_iterator const& other) const { return (value - other.value)/step; }
};
struct xrange {
long _begin;
long _end;
long _step;
long _last;
typedef long value_type;
typedef xrange_iterator iterator;
typedef xrange_iterator const_iterator;
void _init_last() {
if(_step>0) _last= _begin + std::max(0L,_step * ( (_end - _begin + _step -1)/ _step));
else _last= _begin + std::min(0L,_step * ( (_end - _begin + _step +1)/ _step)) ;
}
xrange(){}
xrange( long b, long e , long s=1) : _begin(b), _end(e), _step(s) { _init_last(); }
xrange( long e ) : _begin(0), _end(e), _step(1) { _init_last(); }
xrange_iterator begin() const { return xrange_iterator(_begin, _step); }
xrange_iterator end() const { return xrange_iterator(_last, _step); }
};
#include<iostream>
#include <omp.h>
int main() {
xrange r(1000000);
#pragma omp parallel for
for(auto b = r.begin(); b < r.end() ; ++b)
std::cout << omp_get_thread_num() << ":" << *b* *b * *b << "\n";
return 0;
}
Hello again,
Still using this page (http://clang.llvm.org/get_started.html) to compile clang, NOT using CMake this time, I ran into this compilation error:
llvm[3]: Compiling MSP430SelectionDAGInfo.cpp for Debug+Asserts build
Included from /Users/myself/tmp/llvm/lib/Target/NVPTX/NVPTX.td:19:
/Users/myself/tmp/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td:160:1: error: def 'do_SQRTF32_APPROX' already defined
def do_SQRTF32_APPROX : Predicate<"do_SQRTF32_PREC==0">;
^
make[3]: *** [/Users/vincentboucheny/tmp/build/lib/Target/NVPTX/Debug+Asserts/NVPTXGenAsmWriter.inc.tmp] Error 1
make[2]: *** [NVPTX/.makeall] Error 2
make[2]: *** Waiting for unfinished jobs....
If I comment out lines 160 and 161 of the problematic file, the compilation completes successfully.
Hi all,
I am wondering if the current version supports KMP_AFFINITY which I understand should be in the scope of OpenMP 4.0. I plan to try this on Intel Xeon Phi. Thanks!
Bo
Hi,
Now that task dependencies are available in clang-omp, I tried them, and clang crashes on a Cholesky example.
The code is:
#include <string.h>
#include <stdio.h>
#include <math.h>
#include <sys/types.h>
#include <stdlib.h>
#include <errno.h>
#include <atlas/cblas.h>
#include <atlas/clapack.h> /* assume MKL/ATLAS clapack version */
int clapack_dpotrf(const enum ATLAS_ORDER Order, const enum ATLAS_UPLO Uplo,
const int N, double *A, const int lda);
void cblas_dtrsm(const enum CBLAS_ORDER Order, const enum CBLAS_SIDE Side,
const enum CBLAS_UPLO Uplo, const enum CBLAS_TRANSPOSE TransA,
const enum CBLAS_DIAG Diag, const int M, const int N,
const double alpha, const double *A, const int lda,
double *B, const int ldb);
void cblas_dsyrk(const enum CBLAS_ORDER Order, const enum CBLAS_UPLO Uplo,
const enum CBLAS_TRANSPOSE Trans, const int N, const int K,
const double alpha, const double *A, const int lda,
const double beta, double *C, const int ldc);
void cblas_dgemm(const enum CBLAS_ORDER Order, const enum CBLAS_TRANSPOSE TransA,
const enum CBLAS_TRANSPOSE TransB, const int M, const int N,
const int K, const double alpha, const double *A, const int lda, const double *B, const int ldb,
const double beta, double *C, const int ldc);
/* Generate a random symmetric positive definite matrix of size m x m
- it would also be interesting to generate symmetric diagonally dominant
matrices, which are known to be positive definite.
*/
static void generate_matrix(double* A, size_t m)
{
//
for (size_t i = 0; i< m; ++i)
{
for (size_t j = 0; j< m; ++j)
A[i*m+j] = 1.0 / (1.0+i+j);
A[i*m+i] = m*1.0;
}
}
/* Block Cholesky factorization A <- L * L^t
Lower triangular matrix, with the diagonal, stores the Cholesky factor.
*/
void Cholesky( double* AA, int N, size_t blocsize )
{
double (*A)[N][N] = (double (*)[N][N])&AA[0];
#pragma omp parallel
#pragma omp single
for (size_t k=0; k < N; k += blocsize)
{
#pragma omp task shared(A, blocsize, N) depend(inout: A[k:blocsize][k:blocsize])
clapack_dpotrf(
CblasRowMajor, CblasLower, blocsize, &(*A)[k][k], N
);
for (size_t m=k+blocsize; m < N; m += blocsize)
{
#pragma omp task shared(A, blocsize, N) \
depend(inout: A[m:blocsize][k:blocsize]) depend(in: A[k:blocsize][k:blocsize])
cblas_dtrsm
(
CblasRowMajor, CblasLeft, CblasLower, CblasNoTrans, CblasUnit,
blocsize, blocsize, 1., &(*A)[k][k], N, &(*A)[m][k], N
);
}
for (size_t m=k+blocsize; m < N; m += blocsize)
{
#pragma omp task shared(A, blocsize, N) \
depend(inout: A[m:blocsize][m:blocsize]) depend(in: A[m:blocsize][k:blocsize])
cblas_dsyrk
(
CblasRowMajor, CblasLower, CblasNoTrans,
blocsize, blocsize, -1.0, &(*A)[m][k], N, 1.0, &(*A)[m][m], N
);
for (size_t n=k+blocsize; n < m; n += blocsize)
{
#pragma omp task shared(A, blocsize, N) \
depend(inout: A[m:blocsize][m:blocsize]) depend(in: A[m:blocsize][k:blocsize], A[n:blocsize][k:blocsize])
cblas_dgemm
(
CblasRowMajor, CblasNoTrans, CblasTrans,
blocsize, blocsize, blocsize, -1.0, &(*A)[m][k], N, &(*A)[n][k], N, 1.0, &(*A)[m][n], N
);
}
}
}
}
/* Do one run for cholesky
*/
void doone_exp( int N, int block_count )
{
size_t blocsize = N / block_count;
printf("N : %i\n", N);
printf("size block: %zu\n", blocsize);
printf("#blocks : %i\n", block_count);
double* A = 0;
if (0 != posix_memalign((void**)&A, 4096, N*N*sizeof(double)))
{
printf("Fatal Error. Cannot allocate matrice A, errno: %i\n", errno);
return;
}
generate_matrix(A, N);
Cholesky(A, N, blocsize);
free(A);
}
/* main entry point
*/
int main(int argc, char** argv)
{
// matrix dimension
int n = 32;
if (argc > 1)
n = atoi(argv[1]);
// block count
int block_count = 2;
if (argc > 2)
block_count = atoi(argv[2]);
doone_exp( n, block_count );
return 0;
}
I compile it with atlas cblas/lapack version:
$> clang cholesky_inplace.c -lcblas -llapack_atlas -fopenmp
It works fine without -fopenmp but crashes with it.
Regards,
Pierrick
I've done some initial benchmarking with the clang compiler from our llvm34-3.4.1-0e package in the fink project (which has openmp svn r208472 from llvm.org built against llvm/compiler-rt/clang 3.4.1 with the changes in clang-omp git at commit f9e2fd7 vs stock clang 3.4 applied). Using the heated_plate_openmp.c demo code and the heated_plate_gcc.sh shell script to collect timings for one, two and four OMP processes, the following ratios of these timings (normalized to the one OMP process timing) were observed…
on a 16-core MacPro under darwin13
1:1.90:3.31 for FSF gcc 4.8.3
1:1.90:3.30 for FSF gcc 4.9.0
1:1.99:3.71 for clang 3.4.1 with openmp and the clang-omp merge
on a 24-core Fedora 15 linux box
1:1.99:3.92 for FSF gcc 4.6.3
1:1.99:3.93 for FSF gcc 4.8 branch svn
I filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61333 on FSF gcc concerning the dramatically lower performance on darwin for gomp compared to either iomp5 on darwin or gomp on linux. The FSF gcc developers cited the use of futex on linux compared to pthread_mutex calls.
We still see a 5.6% performance loss for iomp5 on darwin compared to the use of futex in gomp on linux. FYI, the test code and shell script used is attached to the PR 61333 FSF gcc bugzilla report in case the pthread_mutex usage on darwin can be tweaked to approach closer to the results for futex on linux.
Hi,
I'm currently working on a homebrew formula to automatically install clang-omp (as described on clang-omp.github.io) on Mac OS X. Currently it pulls the head, which isn't feasible for homebrew, as it relies heavily on proper versioning of the source code.
The easiest solution would be to use tags to identify versions in your repositories "llvm", "compiler-rt" and "clang". Any numeric tags would be suitable, e.g. semver, dates or build numbers.
Keep up the great work!
When trying to use on Mac Mavericks I get the following error:
hello.c:1:10: fatal error: 'omp.h' file not found
I have the following program:
#include <omp.h>
#include <stdio.h>
int main() {
omp_set_num_threads(4);
#pragma omp parallel
printf("Hello from thread %d, nthreads %d\n", omp_get_thread_num(), omp_get_num_threads());
}
I compiled it with clang -fopenmp and got the output
Hello from thread 0, nthreads 1
Clearly the omp_set_num_threads call has no effect. Does anyone have ideas about why this might happen?
clang version:
Ubuntu clang version 3.5.0-4ubuntu2 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
Target: x86_64-pc-linux-gnu
Thread model: posix
EDIT: I also tried setting the environment variable OMP_NUM_THREADS to 4.
iOS doesn't allow the use of dynamically-linked libraries in submitted apps. Is it possible to link the runtime statically?
Here's the output:
helpers.c:986:23: error: statement form is not allowed for '#pragma omp atomic read'
ATOMIC_READ_CHAR (n = task[t].info.needed);
^
helpers.c:140:3: note: expanded from macro 'ATOMIC_READ_CHAR'
stmt;
^
helpers.c:1024:22: error: statement form is not allowed for '#pragma omp atomic write'
ATOMIC_WRITE_CHAR (untaken_out = (untaken_out + 1) & QMask);
^
helpers.c:148:3: note: expanded from macro 'ATOMIC_WRITE_CHAR'
stmt;
^
helpers.c:2092:24: error: statement form is not allowed for '#pragma omp atomic write'
ATOMIC_WRITE_CHAR (untaken_in = (untaken_in+1) & QMask);
^
helpers.c:148:3: note: expanded from macro 'ATOMIC_WRITE_CHAR'
stmt;
^
helpers.c:2099:24: error: statement form is not allowed for '#pragma omp atomic write'
ATOMIC_WRITE_CHAR (untaken_in = (untaken_in+1) & QMask);
And here's an example of the macros:
_Pragma("omp atomic read")
stmt;
} while (0)
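For comparison, here is a minimal C sketch of the statement forms that OpenMP 3.1 describes for these two clauses: `v = x;` for atomic read and `x = expr;` for atomic write, where expr does not itself access x. The helper names are hypothetical, and the sketch is not a diagnosis of the errors above; the read at helpers.c:986 already appears to have the `v = x` shape.

```c
/* Atomic read: the statement under the pragma has the form `v = x;`. */
int atomic_load_int(const int *x) {
    int v;
    #pragma omp atomic read
    v = *x;
    return v;
}

/* Atomic write: the statement has the form `x = expr;`, and expr does
   not read x. */
void atomic_store_int(int *x, int value) {
    #pragma omp atomic write
    *x = value;
}

/* Ring-index advance split into a read and a write. NOTE: the two steps
   together are NOT one atomic operation; this is only reasonable when a
   single thread ever writes the index. */
void ring_advance(int *index, int mask) {
    int next = (atomic_load_int(index) + 1) & mask;
    atomic_store_int(index, next);
}
```

The writes in the report, such as `untaken_out = (untaken_out + 1) & QMask`, read the target on the right-hand side, which is why the sketch splits them into a separate read and write.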
Hello,
I cloned and compiled the compiler successfully; however, compiling code with it has not worked out for me. It fails to compile anything but the most trivial code because it cannot find standard C++ headers such as <cmath> or <string>. I managed to get my own code to build by fiddling with the flags passed to the various modules I need, but I can't keep this up. I have dozens of dependencies and can't spend the time to look into each one of them to coerce them to find the headers where they reside for the clang installed through the OS X 10.9 developer tools (/usr/include/c++/4.2.1/ ...). Is there any way to teach the OpenMP/LLVM build where to find the standard libraries? Should they not also be part of the install?
I've compiled clang-omp following the instructions on http://clang-omp.github.io/#try-openmp-clang. Everything went fine, and the compiler works together with the Intel OpenMP runtime. But I inspected the local folder where I installed the software and it is too big: 4.6 GB. The bin/ folder occupies 1.7 GB (the clang executable is 535.3 MB!), and the lib/ one 4.6 GB.
The compilation process was simply:
./configure --prefix=/mu/local/path
make ENABLE_OPTIMIZED=1
I used the ENABLE_OPTIMIZED=1 option with make because otherwise I get the message:
llvm[0]: ***** Note: Debug build can be 10 times slower than an
llvm[0]: ***** optimized build. Use make ENABLE_OPTIMIZED=1 to
llvm[0]: ***** make an optimized build. Alternatively you can
llvm[0]: ***** configure with --enable-optimized.
and a debug build is the default, but I do not need a debug build.
Is this large size normal?
My hardware is an Intel Core i5-2500 running Debian
The compiler was clang 3.2 from the Debian repositories (gcc is installed but the configure selects automatically clang)
Hi
I have successfully cross compiled libiomp5.so for ARM using clang and arm-linux-gnueabihf toolchain. Then I cross compiled a simple openmp program with libiomp5.so. But when I tried to run the program on ARM, I got a problem. The program is:
int main(int argc, char** argv)
{
int i = 0;
for(i = 0; i < 10; i++) {
int tid = omp_get_thread_num();
printf("hello from thread %d\n", tid);
}
return 0;
}
The output is:
hello from thread 0
hello from thread 0
hello from thread 2
hello from thread 2
hello from thread 1
hello from thread 1
hello from thread 3
hello from thread 3
hello from thread 4
hello from thread 4
pthread_mutex_lock.c:80: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.
Aborted
Can someone provide some help?
Thanks!
Xiaokang
Is there a canonical location to submit bug reports or patches for the Intel OpenMP library? I wanted to use the Clang OpenMP implementation and found several bugs/issues that I would like to see fixed in the libiomp5 open source version.
How to get around the problems "icc:command not found" and "icpc:command not found" in ubuntu 12.04 ?
I can't find it in these repos.
I'm using Ubuntu 14.04, here's what locate gives me:
$ locate omp.h
/opt/lib/osl-1.4.0/include/OSL/oslcomp.h
/usr/include/re_comp.h
/usr/include/linux/ppp-comp.h
/usr/include/linux/seccomp.h
/usr/include/net/ppp-comp.h
/usr/include/openssl/comp.h
/usr/lib/gcc/i686-w64-mingw32/4.8/include/omp.h
/usr/lib/gcc/x86_64-linux-gnu/4.8/include/omp.h
/usr/lib/gcc/x86_64-w64-mingw32/4.8/include/omp.h
/usr/lib/perl/5.18.2/CORE/regcomp.h
/usr/src/linux-headers-3.13.0-24/arch/...
When I try to use gcc's copy, it produces errors like this:
/usr/lib/gcc/x86_64-linux-gnu/4.8/include/xmmintrin.h:101:19: error: use of undeclared identifier '__builtin_ia32_addss'
return (__m128) __builtin_ia32_addss ((__v4sf)__A, (__v4sf)__B);
^
Hi
I'm working with OpenMP on Android/ARM with gcc, but now I would like to use OpenMP with clang (LLVM). Is that possible?
With gcc I used NDK r9. Does anyone know how I can do the same with clang-omp?
Thanks!!
Juan
Hi all,
I am wondering if there is any document that describes the OpenMP-related variables in the IR. I found it a little difficult to understand all of them. A document would be really helpful. Thanks in advance.
I cannot compile the project.
The compilation produces the following errors:
~/llvm/projects/compiler-rt/lib/asan/asan_malloc_mac.cc:18:10: fatal error: 'AvailabilityMacros.h'
file not found
#include <AvailabilityMacros.h>
^
~/llvm/projects/compiler-rt/lib/asan/asan_mac.cc:26:10: fatal error: 'crt_externs.h' file not found
#include <crt_externs.h> // for _NSGetArgv
^
1 error generated.
make[5]: *** [~/llvm-build/tools/clang/runtime/compiler-rt/clang_darwin/asan_osx/i386/SubDir.lib__asan/asan_mac.o] Error 1
make[5]: *** Waiting for unfinished jobs....
1 error generated.
make[5]: *** [~/llvm-build/tools/clang/runtime/compiler-rt/clang_darwin/asan_osx/i386/SubDir.lib__asan/asan_malloc_mac.o] Error 1
make[4]: *** [BuildRuntimeLibraries] Error 2
rm ~/llvm-build/Release/lib/clang/3.3.1/lib/darwin/.dir
make[3]: *** [compiler-rt/.makeall] Error 2
make[2]: *** [all] Error 1
make[1]: *** [clang/.makeall] Error 2
make: *** [all] Error 1
But the missing files are present, somewhere deep inside my 'Xcode 5.0.2' installation.
(/Applications/Xcode.app/....../include/).
I get an error building ceres-solver (http://homes.cs.washington.edu/~sagarwal/ceres-solver/stable/) with clang-omp. Note that I had to change the CMakeLists to enable OpenMP with Clang, but it was a minor change:
internal/ceres/schur_eliminator_impl.h:186:5: error: unable to calculate number of iterations of the for-loop
for (int i = num_eliminate_blocks_; i < num_col_blocks; ++i) {
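One thing that may be worth trying is copying the loop bounds into plain locals before the pragma, so that both bounds are obviously loop-invariant integers as the OpenMP canonical loop form requires. The C sketch below shows the pattern only; the struct and function names are hypothetical, and whether this avoids the clang-omp diagnostic here is an untested assumption.

```c
/* Hypothetical stand-in for the object holding the lower bound
   (a data member in the original C++ code). */
struct eliminator {
    int num_eliminate_blocks;
};

/* Sums the block indices in [num_eliminate_blocks, num_col_blocks). */
long sum_block_indices(const struct eliminator *e, int num_col_blocks) {
    /* Copy both bounds into locals before the loop so the trip count
       depends only on two invariant ints. */
    const int first = e->num_eliminate_blocks;
    const int last = num_col_blocks;
    long total = 0;
    #pragma omp parallel for reduction(+ : total)
    for (int i = first; i < last; ++i)
        total += i;
    return total;
}
```

Without -fopenmp the pragma is ignored and the function runs serially with the same result.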
I have installed clang 3.5, but no OpenMP runtime. Do I have to build the OpenMP runtime if I just want to see OMP statements in the AST? I used the command "clang -Xclang -ast-dump -fopenmp sourcefile.c", but it failed. What should I do? Thank you.
template<typename T>
int foo()
{
typename T::data_t value;
#pragma omp parallel for private(value)
for (int i = 0; i < 5; i += 2)
{
}
}
$ clang -fopenmp -c test.cpp
clang: lib/Sema/SemaInit.cpp:5082: ExprResult clang::InitializationSequence::Perform(clang::Sema &, const clang::InitializedEntity &, const clang::InitializationKind &, MultiExprArg, clang::QualType *): Assertion `Kind.getKind() == InitializationKind::IK_Copy || Kind.isExplicitCast() || Kind.getKind() == InitializationKind::IK_DirectList' failed.
7 clang 0x0000000000baef4f clang::InitializationSequence::Perform(clang::Sema&, clang::InitializedEntity const&, clang::InitializationKind const&, llvm::MutableArrayRef<clang::Expr*>, clang::QualType*) + 303
8 clang 0x0000000000bfa9d9 clang::Sema::ActOnOpenMPPrivateClause(llvm::ArrayRef<clang::Expr*>, clang::SourceLocation, clang::SourceLocation) + 2201
There are three kmpc queries (kmpc_bound_thread_num, kmpc_bound_num_threads, and kmpc_in_parallel) that refer to the innermost active parallel construct.
I just want to make sure that "active" is really meant as defined in the OpenMP specification.
For example, the traditional omp_get_num_threads gets the number of threads in the current region, whereas kmpc_bound_num_threads requests this information for the innermost active parallel construct.
They differ in this context:
{
// code A
#pragma omp parallel num_threads(1)
{
// code B
}
}
In code A, both calls return the same number, 4, because the parallel region is active (i.e. it has more than one thread).
In code B, however, the innermost parallel region is not active. So the omp calls return 1, and the kmpc calls return 4 (as the outer parallel region is the innermost active region).
Is this understanding correct? If so, what is the default answer when there is no active parallel region?
Thanks
Alexandre
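The active/inactive distinction can also be observed through the standard OpenMP 3.0 API, independent of the kmpc entry points. The C sketch below is a minimal probe under that assumption; `region_info` and `probe_serialized_region` are hypothetical names, and the stubs in the `#else` branch exist only so the file builds without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins so the sketch also builds without OpenMP. */
static int omp_get_level(void) { return 0; }
static int omp_get_active_level(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

struct region_info {
    int level;        /* nested parallel regions, active or not */
    int active_level; /* nested regions run by more than one thread */
    int num_threads;  /* team size of the current region */
};

/* Enters a parallel region forced to one thread (like "code B") and
   records what the standard queries report from inside it. */
struct region_info probe_serialized_region(void) {
    struct region_info info = {0, 0, 0};
    #pragma omp parallel num_threads(1)
    {
        info.level = omp_get_level();
        info.active_level = omp_get_active_level();
        info.num_threads = omp_get_num_threads();
    }
    return info;
}
```

Called from serial code in an OpenMP build, the region counts toward `level` but not toward `active_level`, since a team of one thread is not an active region in the specification's sense.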
When you move to a new LLVM release, could you tag the last version that works with the previous one?
I want to stick with LLVM 3.4 for now but can't see which clang-omp version to download that still works with it. The latest one doesn't. Which commit ID is the last one for 3.4?
Thanks, Gero.
I'm interested in contributing to this project, though the documentation and open issues about where and how to get started are quite minimal. In particular, I would like to help develop new optimization passes or improve the existing Clang support for OpenMP. If you could provide some information about how to accomplish either one of these tasks (e.g. where in the code to look), that would be great!
The llvm and clang sources on the clang-omp git don't include polly, and polly can't be added because the version on git is out of sync with the clang-omp sources.