juliagpu / cudaapi.jl
Reusable components for CUDA API development.
Home Page: https://juliagpu.org/cuda/
License: Other
There are a couple of things we can do to improve reliability.
We can install CUDA in a sudo vm, so we should be able to test all but driver detection.
ref travis-ci/apt-package-safelist#587 travis-ci/travis-ci#5911
AppVeyor has Visual Studio installed, and offers multiple build images (2013, 2015, 2017) so that should be great for monitoring that logic.
Similar to Travis CI with CUDA, we can do the toolkit installation and testing.
Ref https://insight.io/github.com/caffe2/caffe2/blob/beed9061489d91c46b2e1eb9347c1505ff0e7f92/scripts/appveyor/install_cuda.bat https://github.com/willyd/appveyor-cuda-test/blob/master/build.cmd
On Fedora 27 using the negativo17 driver, find_host_compiler() fails. The system compiler is GCC 7.3.1, but the repo offers a compiler named cuda-gcc for compatibility with the CUDA toolkit.
julia> using CUDAapi
julia> find_host_compiler()
("/usr/bin/gcc", v"7.3.1")
julia> find_host_compiler(v"9.1")
ERROR: Could not find a suitable GCC
Stacktrace:
[1] find_host_compiler(::VersionNumber) at /home/xxxxxx/.julia/v0.6/CUDAapi/src/discovery.jl:361
No problems with up to v1.2.0 but in case it hasn't been flagged/noticed yet...
Running CUDAapi under Julia v1.3.0-rc1 fails on precompilation due to some syntax issue during CUDA discovery, throwing the following:
[ Info: Precompiling CUDAapi [3895d2a7-ec45-59b8-82bb-cfc6a382f9b3]
ERROR: LoadError: LoadError: syntax: suffix not allowed after `var"
push!(msvc_paths, msvc_path)
end
end
end
## look in PATH as well
let msvc_path = Sys.which("`
Stacktrace:
[1] top-level scope at /home/[email protected]/.julia/packages/CUDAapi/NcPWp/src/discovery.jl:504
[2] include at ./boot.jl:328 [inlined]
[3] include_relative(::Module, ::String) at ./loading.jl:1105
[4] include at ./Base.jl:31 [inlined]
[5] include(::String) at /home/[email protected]/.julia/packages/CUDAapi/NcPWp/src/CUDAapi.jl:1
[6] top-level scope at /home/[email protected]/.julia/packages/CUDAapi/NcPWp/src/CUDAapi.jl:14
[7] include at ./boot.jl:328 [inlined]
[8] include_relative(::Module, ::String) at ./loading.jl:1105
[9] include(::Module, ::String) at ./Base.jl:31
[10] top-level scope at none:2
[11] eval at ./boot.jl:330 [inlined]
[12] eval(::Expr) at ./client.jl:433
[13] top-level scope at ./none:3
in expression starting at /home/[email protected]/.julia/packages/CUDAapi/NcPWp/src/discovery.jl:504
in expression starting at /home/[email protected]/.julia/packages/CUDAapi/NcPWp/src/CUDAapi.jl:14
ERROR: Failed to precompile CUDAapi [3895d2a7-ec45-59b8-82bb-cfc6a382f9b3] to /home/[email protected]/.julia/compiled/v1.3/CUDAapi/c7oFM_T78xm.ji.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1274
[3] _require(::Base.PkgId) at ./loading.jl:1024
[4] require(::Base.PkgId) at ./loading.jl:922
[5] require(::Module, ::Symbol) at ./loading.jl:917
I tried to install Knet on an Ubuntu 18.04 server and ran into this issue, which I now think is caused by CUDAapi failing to find a suitable GCC:
julia> using CUDAapi
julia> tk = CUDAapi.find_toolkit()
1-element Array{String,1}:
"/usr/local/cuda-9.0"
julia> tc = CUDAapi.find_toolchain(tk)
ERROR: Could not find a suitable GCC
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] macro expansion at ./logging.jl:313 [inlined]
[3] find_host_compiler(::VersionNumber) at /root/.julia/packages/CUDAapi/mUc5V/src/discovery.jl:359
[4] find_toolchain(::Array{String,1}, ::VersionNumber) at /root/.julia/packages/CUDAapi/mUc5V/src/discovery.jl:494
[5] find_toolchain(::Array{String,1}) at /root/.julia/packages/CUDAapi/mUc5V/src/discovery.jl:487
[6] top-level scope at none:0
julia> CUDAapi.find_host_compiler()
("/usr/bin/gcc", v"7.3.0")
(v0.7) pkg> status
Status `~/.julia/environments/v0.7/Project.toml`
[3895d2a7] CUDAapi v0.5.0+ #master (https://github.com/JuliaGPU/CUDAapi.jl.git)
julia> versioninfo()
Julia Version 0.7.0
Commit a4cb80f3ed (2018-08-08 06:46 UTC)
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.0 (ORCJIT, haswell)
In the shell I get
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
and
gcc --version
gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Could there be a function telling whether the GPU supports Float64 computation?
I noticed a weird library name after installing CUDA on an iMac.
I do have nvToolsExt, but under a different name:
-r-xr-xr-x 1 imac staff 42704 10 10 14:47 libnvToolsExt.1.dylib
lrwxr-xr-x 1 imac staff 21 10 10 14:47 libnvToolsExt.dylib -> libnvToolsExt.1.dylib
libnvToolsExt.1.dylib and libnvToolsExt.dylib.1.0 are different names; I don't know why.
ENV:
macOS High Sierra
10.13.6
CUDA:
/usr/local/cuda/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:14:47_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
https://github.com/JuliaGPU/CUDAapi.jl/blob/master/src/discovery.jl#L77
could just use the word_size argument.
Alternatively, the word_size argument could be dropped from that function (it doesn't seem to be really used at all).
Testing on CUDA 9.0.176 and cl.exe 19.0.24210 I get a compiler incompatibility error. These are compatible.
Therefore, it still appears in the tab-completion list:
julia> using CUDAapi
julia> find_<tab><tab>
find_cuda_binary find_host_compiler find_libdevice find_toolkit
find_cuda_library find_libcudadevrt find_toolchain find_toolkit_version
julia> find_host_compiler
ERROR: UndefVarError: find_host_compiler not defined
I have a Chinese version of Visual Studio installed, and CUDAapi.jl cannot recognize it:
julia> using CUDAapi
INFO: Recompiling stale cache file C:\Users\ylxdzsw\.julia\lib\v0.6\CUDAapi.ji for module CUDAapi.
julia> CUDAapi.find_host_compiler()
ERROR: MethodError: no method matching getindex(::Void, ::Int64)
Stacktrace:
[1] find_host_compiler(::Void) at C:\Users\ylxdzsw\.julia\v0.6\CUDAapi\src\discovery.jl:409
[2] find_host_compiler() at C:\Users\ylxdzsw\.julia\v0.6\CUDAapi\src\discovery.jl:322
I think the problem is in the regex match, where it searches for "Version":
# find MSVC versions
msvc_list = Dict{VersionNumber,String}()
for path in msvc_paths
tmpfile = tempname() # TODO: do this with a pipe
if !success(pipeline(`$path`, stdout=DevNull, stderr=tmpfile))
warn("Could not execute $path")
continue
end
ver_str = match(r"Version\s+(\d+(\.\d+)?(\.\d+)?)"i, read(tmpfile, String))[1]
ver = VersionNumber(ver_str)
msvc_list[ver] = path
end
but my cl.exe outputs something funnier:
$ cl.exe
用于 x64 的 Microsoft (R) C/C++ 优化编译器 19.14.26431 版
版权所有(C) Microsoft Corporation。保留所有权利。
用法: cl [ 选项... ] 文件名... [ /link 链接选项... ]
I would suggest using r"(\d+(\.\d+)(\.\d+)?)"i instead. I'm not sure how robust this will be.
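To make the suggestion above concrete, here is a minimal sketch of a locale-independent version parser. The function name parse_cl_version is hypothetical and not part of CUDAapi; the English banner in the usage note is an assumed example of typical cl.exe output, while the Chinese one is copied from this report. The pattern requires at least one dot, so the bare "64" in "x64" is not picked up.

```julia
# Hypothetical helper (not CUDAapi API): extract the first dotted number
# from a cl.exe banner, regardless of the language of the surrounding text.
function parse_cl_version(banner::AbstractString)
    # At least "major.minor", optionally ".patch"; no dependency on the word "Version".
    m = match(r"\d+\.\d+(?:\.\d+)?", banner)
    m === nothing && return nothing
    return VersionNumber(m.match)
end

zh = "用于 x64 的 Microsoft (R) C/C++ 优化编译器 19.14.26431 版"
en = "Microsoft (R) C/C++ Optimizing Compiler Version 19.14.26431 for x64"
println(parse_cl_version(zh))
println(parse_cl_version(en))
```

Returning nothing instead of indexing into a failed match also avoids the MethodError on getindex shown in the stack trace.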
I was getting some weird test errors in CuArrays. I think something in this package was causing the problem, because my ext.jl contained
const libcufft = "/usr/local/cuda-10.1/lib64/libcufft.so"
const libcublas = "/usr/local/cuda-8.0/targets/x86_64-linux/lib/libcublas.so"
const configured = true
const libcusparse = "/usr/local/cuda-10.1/lib64/libcusparse.so"
const libcusolver = "/usr/local/cuda-10.1/lib64/libcusolver.so"
const libcurand = "/usr/local/cuda-10.1/lib64/libcurand.so"
const libcudnn = "/usr/local/cuda-8.0/cudnn/lib64/libcudnn.so"
So it was using a mix of the 10.1 and the 8.0 installs I had made. Eventually I figured out that libcudnn and libcublas had moved to /usr/lib/x86_64-linux-gnu/libcublas.so and /usr/lib/x86_64-linux-gnu/libcudnn.so. After I added symlinks, the ext.jl file pointed strictly at /usr/local/cuda-10.1, and the tests for CuArrays passed. So I think something here should change so that /usr/lib gets checked for cublas and cudnn first, or something like that. If not, it would be useful to have adding symlinks listed somewhere as a debugging step.
I got this error when trying to build CUDAnative.
Building LLVM ──────→ `~/.julia/packages/LLVM/FAUY/deps/build.log`
Building CUDAdrv ───→ `~/.julia/packages/CUDAdrv/GyXD/deps/build.log`
Building CUDAnative → `~/.julia/packages/CUDAnative/mXUk/deps/build.log`
┌ Error: Error building `CUDAnative`:
│ ┌ Debug: Looking for CUDA toolkit via environment variables
│ │ CUDA_HOME = "/sw/software/cuda/9.1/centos7.3_binary/"
│ └ @ CUDAapi CUDAapi.jl:15
│ ┌ Debug: Request to look for binary nvcc
│ │ locations =
│ │ 1-element Array{String,1}:
│ │ "/sw/software/cuda/9.1/centos7.3_binary/"
│ └ @ CUDAapi CUDAapi.jl:15
│ ┌ Debug: Looking for binary nvcc
│ │ locations =
│ │ 25-element Array{String,1}:
│ │ "/sw/software/cuda/9.1/centos7.3_binary/"
│ │ "/sw/software/cuda/9.1/centos7.3_binary/bin"
│ │ "/sw/software/cuda/9.1/centos7.3_binary/bin"
│ │ "/usr/local/bin"
│ │ ⋮
│ │ "/d/home/xiaoqihu/cuda/bin"
│ │ "/d/home/xiaoqihu/bin"
│ │ "/d/home/xiaoqihu/cuda/bin"
│ └ @ CUDAapi CUDAapi.jl:15
│ ┌ Debug: Found binary nvcc at /sw/software/cuda/9.1/centos7.3_binary
│ └ @ CUDAapi discovery.jl:126
│ ERROR: LoadError: could not spawn `/sw/software/cuda/9.1/centos7.3_binary/nvcc --version`: no such file or directory (ENOENT)
│ Stacktrace:
│ [1] _jl_spawn(::String, ::Array{String,1}, ::Cmd, ::Tuple{Base.DevNullStream,Base.PipeEndpoint,RawFD}) at ./process.jl:370
│ [2] (::getfield(Base, Symbol("##495#496")){Cmd})(::Tuple{Base.DevNullStream,Base.PipeEndpoint,RawFD}) at ./process.jl:512
│ [3] setup_stdio(::getfield(Base, Symbol("##495#496")){Cmd}, ::Tuple{Base.DevNullStream,Pipe,IOStream}) at ./process.jl:493
│ [4] #_spawn#494(::Nothing, ::Function, ::Cmd, ::Tuple{Base.DevNullStream,Pipe,IOStream}) at ./process.jl:511
│ [5] _spawn(::Cmd, ::Tuple{Base.DevNullStream,Pipe,IOStream}) at ./process.jl:507
│ [6] #open#504(::Bool, ::Bool, ::Function, ::Cmd, ::Base.DevNullStream) at ./process.jl:601
│ [7] open at ./process.jl:591 [inlined]
│ [8] open(::Cmd, ::String, ::Base.DevNullStream) at ./process.jl:572
│ [9] read(::Cmd) at ./process.jl:646
│ [10] read(::Cmd, ::Type{String}) at ./process.jl:652
│ [11] find_toolkit_version(::Array{String,1}) at /d/home/xiaoqihu/.julia/packages/CUDAapi/g08Z/src/discovery.jl:259
│ [12] main() at /d/home/xiaoqihu/.julia/packages/CUDAnative/mXUk/deps/build.jl:114
│ [13] top-level scope at none:0
│ [14] include at ./boot.jl:317 [inlined]
│ [15] include_relative(::Module, ::String) at ./loading.jl:1075
│ [16] include(::Module, ::String) at ./sysimg.jl:29
│ [17] include(::String) at ./client.jl:393
│ [18] top-level scope at none:0
│ in expression starting at /d/home/xiaoqihu/.julia/packages/CUDAnative/mXUk/deps/build.jl:156
└ @ Pkg.Operations Operations.jl:973
Here is my version info, I am using 0.7-beta, but I got the same error using 0.6.2:
julia> versioninfo()
Julia Version 0.7.0-beta.0
Commit f41b1ecaec (2018-06-24 01:32 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
Environment:
JULIA_DEBUG = CUDAapi
I'm having problems reinstalling Knet after a fresh install on Julia 1.4, so I went through the installation instructions again. I checked my CUDA libraries as usual, but find_cuda_library doesn't return anything.
tk=find_toolkit()
1-element Array{String,1}:
"C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v10.1"
find_cuda_library("cudnn",tk)
I double checked my Path under Environment variables as well as the toolkit folders: the dlls are there: cudnn64_7.dll, cudart64_101.dll and cudart32_101.dll among others.
cc @denizyuret
The following call in discovery.jl is no longer supported:
L438: clang_path = find_binary("clang")
It should be replaced with
clang_path = find_binary(["clang"])
When we warn about multiple toolkits, we should tell the user about CUDA_HOME etc to select a toolkit.
(Sorry for the spam. I meant to post #95...)
As discussed here, CUDAapi was unable to find libcudnn.so. Attached is the output of the build log for CuArrays with JULIA_DEBUG=CUDAapi in the environment.
cat ~/.julia/packages/CuArrays/f4Eke/deps/build.log
┌ Debug: Request to look for binary nvcc
│ locations = 0-element Array{String,1}
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Looking for binary nvcc
│ locations =
│ 10-element Array{String,1}:
│ "/home/jacobr/code/julia-1.0.3/bin"
│ "/home/jacobr/code/cmake/bin"
│ "/home/jacobr/miniconda3/bin"
│ "/home/jacobr/code/cmake/bin"
│ ⋮
│ "/usr/local/sbin"
│ "/usr/sbin"
│ "/home/jacobr/bin"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Request to look for library cudart
│ locations = 0-element Array{String,1}
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Looking for library libcudart
│ locations = 0-element Array{String,1}
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Found library libcudart at /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcudart.so
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/discovery.jl:82
┌ Debug: Looking for CUDA toolkit via CUDA runtime library
│ path = "/usr/local/cuda-9.0/targets/x86_64-linux/lib/libcudart.so"
│ dir = "/usr/local/cuda-9.0/targets/x86_64-linux"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Looking for CUDA toolkit via default installation directories
│ dirs =
│ 1-element Array{Any,1}:
│ "/usr/local/cuda-9.0"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Found CUDA toolkit at /usr/local/cuda-9.0/targets/x86_64-linux, /usr/local/cuda-9.0
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/discovery.jl:260
┌ Debug: Request to look for library cublas
│ locations =
│ 2-element Array{String,1}:
│ "/usr/local/cuda-9.0/targets/x86_64-linux"
│ "/usr/local/cuda-9.0"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Looking for library libcublas
│ locations =
│ 6-element Array{String,1}:
│ "/usr/local/cuda-9.0/targets/x86_64-linux"
│ "/usr/local/cuda-9.0/targets/x86_64-linux/lib"
│ "/usr/local/cuda-9.0/targets/x86_64-linux/lib64"
│ "/usr/local/cuda-9.0"
│ "/usr/local/cuda-9.0/lib"
│ "/usr/local/cuda-9.0/lib64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Found library /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcublas at /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcublas.so
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/discovery.jl:82
┌ Debug: Request to look for library cusolver
│ locations =
│ 2-element Array{String,1}:
│ "/usr/local/cuda-9.0/targets/x86_64-linux"
│ "/usr/local/cuda-9.0"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Looking for library libcusolver
│ locations =
│ 6-element Array{String,1}:
│ "/usr/local/cuda-9.0/targets/x86_64-linux"
│ "/usr/local/cuda-9.0/targets/x86_64-linux/lib"
│ "/usr/local/cuda-9.0/targets/x86_64-linux/lib64"
│ "/usr/local/cuda-9.0"
│ "/usr/local/cuda-9.0/lib"
│ "/usr/local/cuda-9.0/lib64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Found library /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcusolver at /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcusolver.so
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/discovery.jl:82
┌ Debug: Request to look for library cufft
│ locations =
│ 2-element Array{String,1}:
│ "/usr/local/cuda-9.0/targets/x86_64-linux"
│ "/usr/local/cuda-9.0"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Looking for library libcufft
│ locations =
│ 6-element Array{String,1}:
│ "/usr/local/cuda-9.0/targets/x86_64-linux"
│ "/usr/local/cuda-9.0/targets/x86_64-linux/lib"
│ "/usr/local/cuda-9.0/targets/x86_64-linux/lib64"
│ "/usr/local/cuda-9.0"
│ "/usr/local/cuda-9.0/lib"
│ "/usr/local/cuda-9.0/lib64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Found library /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcufft at /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcufft.so
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/discovery.jl:82
┌ Debug: Request to look for library curand
│ locations =
│ 2-element Array{String,1}:
│ "/usr/local/cuda-9.0/targets/x86_64-linux"
│ "/usr/local/cuda-9.0"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Looking for library libcurand
│ locations =
│ 6-element Array{String,1}:
│ "/usr/local/cuda-9.0/targets/x86_64-linux"
│ "/usr/local/cuda-9.0/targets/x86_64-linux/lib"
│ "/usr/local/cuda-9.0/targets/x86_64-linux/lib64"
│ "/usr/local/cuda-9.0"
│ "/usr/local/cuda-9.0/lib"
│ "/usr/local/cuda-9.0/lib64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Found library /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcurand at /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcurand.so
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/discovery.jl:82
┌ Debug: Request to look for library cudnn
│ locations =
│ 2-element Array{String,1}:
│ "/usr/local/cuda-9.0/targets/x86_64-linux"
│ "/usr/local/cuda-9.0"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Looking for library libcudnn
│ locations =
│ 6-element Array{String,1}:
│ "/usr/local/cuda-9.0/targets/x86_64-linux"
│ "/usr/local/cuda-9.0/targets/x86_64-linux/lib"
│ "/usr/local/cuda-9.0/targets/x86_64-linux/lib64"
│ "/usr/local/cuda-9.0"
│ "/usr/local/cuda-9.0/lib"
│ "/usr/local/cuda-9.0/lib64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Warning: could not find cudnn, its functionality will be unavailable
└ @ Main ~/.julia/packages/CuArrays/f4Eke/deps/build.jl:29
I tried to add the location of libcudnn.so to LD_LIBRARY_PATH (as per this issue); see the shell output below.
04:40:22 ~$ env | grep -i LD_LIBRARY_PATH
LD_LIBRARY_PATH=/usr/lib64
04:40:28 ~$ ls /usr/lib64 | grep -i cudnn
libcudnn.so.7
libcudnn.so.7.2.1
But this results in the same build output as when LD_LIBRARY_PATH is not set.
Please let me know if you need additional information or for me to test anything, I'll be happy to help.
I had the same problem as #85:
julia> using CUDAapi
julia> CUDAapi.find_toolkit()
6-element Array{String,1}:
"/usr/local/cuda-10.0/targets/x86_64-linux"
"/usr/local/cuda-8.0"
"/usr/local/cuda-9.0"
"/usr/local/cuda-9.1"
"/usr/local/cuda-10.0"
"/usr/local/cuda-10.1"
julia> ENV["CUDA_HOME"] = "/usr/local/cuda-10.1"
"/usr/local/cuda-10.1"
julia> CUDAapi.find_toolkit()
1-element Array{String,1}:
"/usr/local/cuda-10.1"
julia> using CUDAnative
julia> using CuArrays
Although I could work around this by setting CUDA_HOME, it would be nice if CUDAapi.find_toolkit were a bit smarter, so that it returns the versions that are supported by CUDAnative first.
Tested this on:
Not sure about the interface change. CUDAdrv etc will all be broken. Since this package is called CUDAapi, what is the point of adding cuda to function names?
discovery.jl fixes that I needed to get it working:
"$(name)$(word_size)",name])
-- otherwise find_cuda_driver does not work, which lives in Windows/System32/nvcuda.dll.
all_names = sort(unique(all_names), rev=true)
-- otherwise when multiple versions are present, we don't get the latest.
program_files = get(ENV, Sys.WORD_SIZE == 64 ? "ProgramFiles(x86)" : "ProgramFiles",
Sys.WORD_SIZE == 64 ? "C:\\Program Files (x86)" : "C:\\Program Files")
isempty(msvc_paths) && error("No Visual Studio installation found")
-- I think this one is just a typo.
Latest Amazon AWS machine image: ami-263e1643, Knet171216win at Ohio (us-east-2) running on a p2.xlarge instance.
Longer term how should this library be used for other packages in the CUDA ecosystem ?
We currently collect a set of candidate toolkit directories, and filter based on the existence of that directory. This isn't valid: on some systems, discovered toolkit directories don't actually contain the necessary tools.
We should filter better: check for the existence of files and tools, e.g. nvcc, libdevice and libcudart. However, how do we deal with systems that spread files all over the place? IIRC, Debian places tools like nvcc in /usr/bin, but libdevice is still in /usr/share. Maybe we should return a list of directories that contain relevant files, but then we might pick up multiple, conflicting toolkits.
cc @cfoket
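As a rough illustration of the stricter filtering proposed above, the sketch below keeps only candidate directories that contain an nvcc binary under bin. The names looks_like_toolkit and filter_toolkits are hypothetical and not part of CUDAapi, and the bin/nvcc layout is an assumption that does not hold on distributions that scatter files (the Debian case mentioned above).

```julia
# Hypothetical filter (not CUDAapi API): a directory only counts as a toolkit
# if it exists AND contains bin/nvcc. This assumes a conventional layout.
function looks_like_toolkit(dir::AbstractString)
    nvcc = Sys.iswindows() ? "nvcc.exe" : "nvcc"
    return isdir(dir) && isfile(joinpath(dir, "bin", nvcc))
end

# Keep only the candidates that pass the check, preserving their order.
filter_toolkits(dirs) = filter(looks_like_toolkit, dirs)

candidates = ["/usr/local/cuda-9.0", "/usr/local/cuda-10.1", "/nonexistent"]
println(filter_toolkits(candidates))
```

The same predicate could be extended with checks for libdevice and libcudart, at the cost of needing per-platform knowledge of where those files live.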
If I have CUDA_PATH in my environment but set as an empty variable, CUDA discovery will fail.
$ env | grep CUDA
CUDA_PATH=
In this case, find_toolkit() will return an empty list. This will then fail in find_toolkit_version() when building the CUDAnative package:
┌ Error: Error building `CUDAnative`:
│ ERROR: LoadError: CUDA toolkit at doesn't contain nvcc
The problem is that the code that checks if these env vars exist will return immediately even if the env vars are empty:
https://github.com/JuliaGPU/CUDAapi.jl/blob/v1.0.1/src/discovery.jl#L230-L241
So this code block should check if the environment variables have a valid value, and if not, pass on to the remainder of the function to check the nvcc path and the default system paths.
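A minimal sketch of that check, assuming the variables of interest are CUDA_PATH and CUDA_HOME (the two named in these reports). The function toolkit_dirs_from_env is hypothetical and not part of CUDAapi. An empty result would signal the caller to fall through to the nvcc and default-path lookups.

```julia
# Hypothetical helper (not CUDAapi API): collect toolkit directories from
# environment variables, skipping variables that are unset, empty or blank.
function toolkit_dirs_from_env(env=ENV; vars=["CUDA_PATH", "CUDA_HOME"])
    dirs = String[]
    for var in vars
        value = strip(get(env, var, ""))
        isempty(value) && continue   # treat "CUDA_PATH=" like an unset variable
        push!(dirs, String(value))
    end
    return unique(dirs)
end

# An empty CUDA_PATH no longer short-circuits discovery.
println(toolkit_dirs_from_env(Dict("CUDA_PATH" => "", "CUDA_HOME" => "/usr/local/cuda-10.1")))
```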
On my Fedora 26 and 27 systems, find_host_compiler fails. Having ccache installed seems to be related.
$ ./julia
_
_ _ _(_)_ | A fresh approach to technical computing
(_) | (_) (_) | Documentation: https://docs.julialang.org
_ _ _| |_ __ _ | Type "?help" for help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 0.6.4 (2018-07-09 19:09 UTC)
_/ |\__'_|_|_|\__'_| |
|__/ | x86_64-redhat-linux
julia> using CUDAapi
julia> Pkg.status("CUDAapi")
- CUDAapi 0.4.3
julia> find_host_compiler()
WARNING: Could not parse GCC version info ("ccache version 3.3.6"), skipping this compiler.
ERROR: Could not find a suitable GCC
Stacktrace:
[1] find_host_compiler(::Void) at /home/rick/.julia/v0.6/CUDAapi/src/discovery.jl:361
[2] find_host_compiler() at /home/rick/.julia/v0.6/CUDAapi/src/discovery.jl:322
julia> find_binary(["gcc"])
"/usr/lib64/ccache/../../bin/ccache"
julia> find_binary(["gcc"], locations=["/bin"])
"/bin/gcc"
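One possible way around the ccache wrapper, sketched under the assumption that the wrapper is a symlink named gcc that resolves to a binary called ccache (which matches the path printed above). The function find_real_gcc is hypothetical and not part of CUDAapi.

```julia
# Hypothetical helper (not CUDAapi API): walk the given directories in order
# and return the first gcc that is not just a symlink to ccache.
function find_real_gcc(dirs=split(get(ENV, "PATH", ""), Sys.iswindows() ? ';' : ':'))
    for dir in dirs
        candidate = joinpath(dir, "gcc")
        isfile(candidate) || continue
        # realpath follows symlinks, so a ccache wrapper resolves to .../ccache
        basename(realpath(candidate)) == "ccache" && continue
        return candidate
    end
    return nothing
end

println(find_real_gcc())
```

This only skips symlink-style wrappers; a wrapper script that execs ccache would still be returned and would need a different check, such as inspecting the --version output.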
The tag name "v0.1" is not of the appropriate SemVer form (vX.Y.Z).
cc: @maleadt
Maybe there's something wrong with how CI (hackishly) installs CUDA, but it looks like CUDA 9.1 isn't detected properly.
It can't find GCC for some reason, even though it is in the path.
julia> Pkg.test("CUDAapi")
INFO: Testing CUDAapi
ERROR: LoadError: Could not find a suitable GCC
Stacktrace:
[1] find_host_compiler(::VersionNumber) at /home/dss/.julia/v0.6/CUDAapi/src/discovery.jl:365
[2] include_from_node1(::String) at ./loading.jl:576
[3] include(::String) at ./sysimg.jl:14
[4] process_options(::Base.JLOptions) at ./client.jl:305
[5] _start() at ./client.jl:371
while loading /home/dss/.julia/v0.6/CUDAapi/test/runtests.jl, in expression starting on line 32
==========================================================[ ERROR: CUDAapi ]==========================================================
failed process: Process(`/home/dss/Downloads/julia-d386e40c17/bin/julia -Cx86-64 -J/home/dss/Downloads/julia-d386e40c17/lib/julia/sys.so --compile=yes --depwarn=yes --check-bounds=yes --code-coverage=none --color=yes --compilecache=yes /home/dss/.julia/v0.6/CUDAapi/test/runtests.jl`, ProcessExited(1)) [1]
======================================================================================================================================
ERROR: CUDAapi had test errors
shell> nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85
shell> nvidia-smi | grep Version
| NVIDIA-SMI 390.30 Driver Version: 390.30 |
shell> gcc --version
gcc (Ubuntu 7.2.0-8ubuntu3) 7.2.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
CUDAnative currently warns if the CUDA version as reported by the driver does not match what we get from NVCC. This is often a false positive, but these incompatibilities do apparently matter (until today, I hadn't run into issues with only CUDAnative). For example, cuBLAS fails to initialize if the driver is too old for the current toolkit:
$ nvidia-smi | grep Version
| NVIDIA-SMI 384.111 Driver Version: 384.111 |
$ julia -e "using CUDAdrv; @show CUDAdrv.version()"
CUDAdrv.version() = v"9.0.0"
$ julia -e 'println(ccall((:cublasCreate_v2, "/opt/cuda-9.0/lib64/libcublas.so"), Cuint, (Ptr{Ptr{Void}},), Ref{Ptr{Void}}()))'
0
$ julia -e 'println(ccall((:cublasCreate_v2, "/opt/cuda-9.1/lib64/libcublas.so"), Cuint, (Ptr{Ptr{Void}},), Ref{Ptr{Void}}()))'
1
As far as I know, these requirements aren't well documented. This post has a list:
CUDA 9.1: 387.xx
CUDA 9.0: 384.xx
CUDA 8.0: 375.xx (GA2)
CUDA 8.0: 367.4x
CUDA 7.5: 352.xx
CUDA 7.0: 346.xx
CUDA 6.5: 340.xx
CUDA 6.0: 331.xx
CUDA 5.5: 319.xx
CUDA 5.0: 304.xx
CUDA 4.2: 295.41
CUDA 4.1: 285.05.33
CUDA 4.0: 270.41.19
CUDA 3.2: 260.19.26
CUDA 3.1: 256.40
CUDA 3.0: 195.36.15
But maybe we can just trust what the driver reports: instead of warning when the versions don't match, we should check whether the driver's version is equal to or higher than the toolkit version as reported by nvcc?
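The proposed check could look roughly like the sketch below. The function driver_supports_toolkit is hypothetical; it assumes both versions are already available as VersionNumber values and compares only major and minor, since the toolkit's patch component (e.g. 9.0.176) has no counterpart in what the driver reports.

```julia
# Hypothetical check (not CUDAapi API): the driver's reported CUDA version must
# be at least the toolkit's major.minor version. Tuples compare lexicographically.
driver_supports_toolkit(driver::VersionNumber, toolkit::VersionNumber) =
    (driver.major, driver.minor) >= (toolkit.major, toolkit.minor)

driver = v"9.0.0"   # what CUDAdrv.version() reported above
println(driver_supports_toolkit(driver, v"9.0.176"))
println(driver_supports_toolkit(driver, v"9.1.85"))
```

With the numbers from this report, the 9.0 toolkit passes and the 9.1 toolkit is rejected, which lines up with the cublasCreate_v2 results shown above.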
I'm using a machine without sudo access, so I've installed cuDNN via anaconda (which doesn't require sudo). CUDAdrv and CuArrays detect the GPU and run; however, CUDAapi doesn't detect cuDNN, which prevents Knet from using the GPU:
> using CUDAapi
> @show find_cuda_library("cudnn")
find_cuda_library("cudnn") = nothing
> using Knet; include(Knet.dir("test/gpu.jl"))
...
cannot find cudnn
The error in discovery.jl:362 uses the undefined variable msvc_ver.
The CUDA toolkit and cuDNN are installed, and the environment variables are set.
But initializing Flux or testing CUDAdrv, CUDAnative and CuArrays still fails because cudnn64_7.dll cannot be found:
using CUDAdrv, CUDAnative, CuArrays
Pkg.test(["CUDAdrv", "CUDAnative", "CuArrays"])
[ Info: Testing using device GeForce GTX 1650 SUPER (compute capability 7.5.0, 3.233 GiB available memory) on CUDA driver 11.0.0 and toolkit 10.2.89
multi dim, sliced setindex: Error During Test at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\test\testsuite\indexing.jl:51
Got exception outside of a @test
could not load library "C:\Users\USER\.julia\artifacts\d0bdf3cb548b47e0ea9808e88a130b8e4a924257\bin\cudnn64_7.dll"
The specified module could not be found.
Stacktrace:
[1] #dlopen#3(::Bool, ::typeof(Libdl.dlopen), ::String, ::UInt32) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\Libdl\src\Libdl.jl:109
[2] dlopen at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\Libdl\src\Libdl.jl:109 [inlined] (repeats 2 times)
[3] use_artifact_cudnn(::VersionNumber) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\bindeps.jl:187
[4] use_artifact_cuda() at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\bindeps.jl:131
[5] __configure_dependencies__(::Bool) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\bindeps.jl:235
[6] __configure__(::Bool) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\CuArrays.jl:117
[7] (::CuArrays.var"#1#2"{Bool})() at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\CuArrays.jl:45
[8] lock(::CuArrays.var"#1#2"{Bool}, ::ReentrantLock) at .\lock.jl:151
[9] _functional(::Bool) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\CuArrays.jl:43
[10] functional at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\CuArrays.jl:36 [inlined]
[11] macro expansion at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\CuArrays.jl:63 [inlined]
[12] libcurand at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\bindeps.jl:37 [inlined]
[13] (::CuArrays.CURAND.var"#13122#cache_fptr!#11")() at C:\Users\USER\.julia\packages\CUDAapi\XuSHC\src\call.jl:31
[14] curandCreateGenerator(::Base.RefValue{Ptr{Nothing}}, ::CuArrays.CURAND.curandRngType) at C:\Users\USER\.julia\packages\CUDAapi\XuSHC\src\call.jl:39
[15] CuArrays.CURAND.RNG(::CuArrays.CURAND.curandRngType) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\rand\random.jl:23
[16] RNG at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\rand\random.jl:22 [inlined]
[17] #123 at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\rand\CURAND.jl:34 [inlined]
[18] get!(::CuArrays.CURAND.var"#123#124", ::IdDict{Any,Any}, ::Any) at .\abstractdict.jl:661
[19] generator() at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\rand\CURAND.jl:33
[20] rand!(::CuArray{Float32,4,Nothing}) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\rand\random.jl:166
[21] macro expansion at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\test\testsuite\indexing.jl:54 [inlined]
[22] macro expansion at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\Test\src\Test.jl:1107 [inlined]
[23] macro expansion at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\test\testsuite\indexing.jl:52 [inlined]
[24] macro expansion at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\src\host\indexing.jl:63 [inlined]
[25] macro expansion at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\test\testsuite\indexing.jl:41 [inlined]
[26] macro expansion at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\Test\src\Test.jl:1107 [inlined]
[27] test_indexing(::Type) at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\test\testsuite\indexing.jl:3
[28] test(::Type{CuArray}) at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\test\testsuite.jl:51
[29] top-level scope at C:\Users\USER\.julia\packages\CuArrays\e8PLr\test\runtests.jl:49
[30] top-level scope at
...
copyto! for triangular: Error During Test at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\test\testsuite\linalg.jl:14
Got exception outside of a @test
could not load library "C:\Users\USER\.julia\artifacts\d0bdf3cb548b47e0ea9808e88a130b8e4a924257\bin\cudnn64_7.dll"
The specified module could not be found.
Stacktrace:
[1] #dlopen#3(::Bool, ::typeof(Libdl.dlopen), ::String, ::UInt32) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\Libdl\src\Libdl.jl:109
[2] dlopen at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\Libdl\src\Libdl.jl:109 [inlined] (repeats 2 times)
[3] use_artifact_cudnn(::VersionNumber) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\bindeps.jl:187
[4] use_artifact_cuda() at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\bindeps.jl:131
[5] __configure_dependencies__(::Bool) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\bindeps.jl:235
[6] __configure__(::Bool) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\CuArrays.jl:117
[7] (::CuArrays.var"#1#2"{Bool})() at
Float32 gemm C := adjoint(A) * transpose(B) * a + C * b: Error During Test at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\test\testsuite\linalg.jl:83
Test threw exception
Expression: compare(mul!, AT, C, f(A), g(B), Ref(T(4)), Ref(T(5)))
could not load library "C:\Users\USER\.julia\artifacts\d0bdf3cb548b47e0ea9808e88a130b8e4a924257\bin\cudnn64_7.dll"
The specified module could not be found.
My Pkg.status():
[fbb218c0] BSON v0.2.5
[336ed68f] CSV v0.6.0
[c5f51814] CUDAdrv v6.2.2
[be33ccc6] CUDAnative v3.0.4
[3a865a2d] CuArrays v2.0.1
[1b08a953] Dash v0.1.0 #master (https://github.com/plotly/Dash.jl.git)
[a93c6f00] DataFrames v0.20.2
[1313f7d8] DataFramesMeta v0.5.0
[587475ba] Flux v0.10.4
[38e38edf] GLM v1.3.9
[cd3eb016] HTTP v0.8.13
[09f84164] HypothesisTests v0.9.2
[7073ff75] IJulia v1.21.1
[add582a8] MLJ v0.10.3
[91a5bcdd] Plots v0.29.9
My versioninfo():
Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: AMD Ryzen 5 3600X 6-Core Processor
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.1 (ORCJIT, znver1)
On my machine, the CUDA libraries could not be identified because the new CUDA toolkit version 10.1 is absent from this list:
const cuda_versions = Dict(
    "toolkit" => [v"1.0", v"1.1",
                  v"2.0", v"2.1", v"2.2",
                  v"3.0", v"3.1", v"3.2",
                  v"4.0", v"4.1", v"4.2",
                  v"5.0", v"5.5",
                  v"6.0", v"6.5",
                  v"7.0", v"7.5",
                  v"8.0",
                  v"9.0", v"9.1", v"9.2",
                  v"10.0"],
    "cudnn"   => [v"1.0",
                  v"2.0",
                  v"3.0",
                  v"4.0",
                  v"5.0", v"5.1",
                  v"6.0",
                  v"7.0", v"7.1", v"7.3", v"7.4"]
)
Maybe it would be simpler to autodetect library versions, e.g. from the folder name or from a direct library call, instead of manually adding every new version to this list?
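The folder-name idea could look roughly like the sketch below. It assumes the toolkit is installed in a directory whose name ends in a `major.minor` version, such as `v10.1` or `cuda-10.1`. The helper name `version_from_dirname` and the example paths are hypothetical and are not part of CUDAapi.

```julia
# Hypothetical helper (not part of CUDAapi): infer a toolkit version from the
# last component of an installation path, e.g. ".../CUDA/v10.1" or
# "/usr/local/cuda-10.1". Returns `nothing` when the name carries no version.
function version_from_dirname(path::AbstractString)
    # Split on both separators so Windows and Unix paths are handled alike;
    # keepempty=false also ignores a trailing separator.
    parts = split(path, r"[/\\]"; keepempty=false)
    isempty(parts) && return nothing

    # Look for "major.minor" at the very end of the directory name.
    m = match(r"(\d+)\.(\d+)$", parts[end])
    m === nothing && return nothing

    return VersionNumber(parse(Int, m.captures[1]), parse(Int, m.captures[2]))
end

# Example paths are assumptions for illustration only.
version_from_dirname("C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v10.1")  # == v"10.1"
version_from_dirname("/usr/local/cuda-10.1")                                          # == v"10.1"
version_from_dirname("/usr/local/cuda")                                               # === nothing
```

A directory name is only a hint (a symlink such as `/usr/local/cuda` carries no version at all), so a real implementation would probably still want to fall back on asking the library itself.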
@c42f mentioned that our ccall wrapper may not be thread safe, ref. the latest example in https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong. Not easy to fix though: we absolutely don't want to use locks, so as not to penalize the fast path, and I don't think Julia's atomics allow a CAS on the Ref pointer (although we can probably hack around that).
Both the toolkit and msvs numbers
in the readme or via documenter
🍺 /usr/local/Cellar/gnu-tar/1.32: 15 files, 1.7MB, built in 1 minute 57 seconds
+sudo gtar -x --skip-old-files -f CUDAMacOSXInstaller/CUDAMacOSXInstaller.app/Contents/Resources/payload/cuda_mac_installer_tk.tar.gz -C /
gtar: CUDAMacOSXInstaller/CUDAMacOSXInstaller.app/Contents/Resources/payload/cuda_mac_installer_tk.tar.gz: Cannot open: No such file or directory
gtar: Error is not recoverable: exiting now
I've used CUDAapi on JuliaPro 0.6.4.1, and it does not seem to work properly.
julia> Pkg.test("CUDAapi")
INFO: Testing CUDAapi
ERROR: LoadError: MethodError: no method matching getindex(::Void, ::Int64)
Stacktrace:
[1] find_host_compiler(::Void) at E:\Develop\IDEs\JuliaPro\pkgs-0.6.4.1\v0.6\CUDAapi\src\discovery.jl:407
[2] find_host_compiler() at E:\Develop\IDEs\JuliaPro\pkgs-0.6.4.1\v0.6\CUDAapi\src\discovery.jl:321
[3] include_from_node1(::String) at .\loading.jl:576
[4] include(::String) at .\sysimg.jl:14
[5] process_options(::Base.JLOptions) at .\client.jl:305
[6] _start() at .\client.jl:371
while loading E:\Develop\IDEs\JuliaPro\pkgs-0.6.4.1\v0.6\CUDAapi\test\runtests.jl, in expression starting on line 32
=========================================[ ERROR: CUDAapi ]=========================================
failed process: Process(`'E:\Develop\IDEs\JuliaPro\Julia-0.6.4\bin\julia.exe' -Cx86-64 '-JE:\Develop\IDEs\JuliaPro\Julia-0.6.4\lib\julia\sys.dll' --compile=yes --depwarn=yes --check-bounds=yes --code-coverage=none --color=yes --compilecache=yes 'E:\Develop\IDEs\JuliaPro\pkgs-0.6.4.1\v0.6\CUDAapi\test\runtests.jl'`, ProcessExited(1)) [1]
====================================================================================================
ERROR: CUDAapi had test errors
BTW, I run the vcvars64.bat provided by MS directly on the CMD command line.
julia> ENV["PATH"]
E:\\Develop\\IDEs\\JuliaPro\\Julia-0.6.4\\bin;C:\\Program Files (x86)\\MSBuild\\14.0\\bin\\amd64;C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\BIN\\amd64; ……
shell> cl test.cpp
……
test.cpp
Microsoft (R) Incremental Linker Version 14.00.24225.1
Copyright (C) Microsoft Corporation. All rights reserved.
/out:test.exe
test.obj
shell> test
julia>
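The `getindex(::Void, ::Int64)` error above means that something indexed into `nothing` (`Void` is the Julia 0.6 name of the type now called `Nothing`). One common way this happens is indexing the result of `match` when the regex did not match. Whether that is what `discovery.jl:407` does is not visible in this report, so the following is only a sketch of the defensive pattern, with a hypothetical helper name, using the linker banner quoted above as sample input.

```julia
# Hypothetical helper (not CUDAapi code): pull a three-part version number out
# of a tool banner, returning `nothing` instead of throwing when none is found.
function parse_tool_version(banner::AbstractString)
    m = match(r"Version (\d+)\.(\d+)\.(\d+)", banner)
    # `match` returns `nothing` on failure; indexing that would raise a
    # MethodError like the one in the report, so check before using it.
    m === nothing && return nothing
    return VersionNumber(parse(Int, m.captures[1]),
                         parse(Int, m.captures[2]),
                         parse(Int, m.captures[3]))
end

parse_tool_version("Microsoft (R) Incremental Linker Version 14.00.24225.1")  # == v"14.0.24225"
parse_tool_version("no version information here")                            # === nothing
```

Returning `nothing` lets the caller report "no suitable compiler found" rather than failing with an unrelated `MethodError`.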
I am getting method undefined errors on read(xxx,String) commands in find_host_compiler. Are you guys testing this on 0.6.1?
ERROR: MethodError: no method matching read(::Cmd, ::Type{String})