
cudaapi.jl's People

Contributors

bors[bot], femtocleaner[bot], juliatagbot, kpamnany, maleadt, musm, qin-yu, una-dinosauria, vchuravy

cudaapi.jl's Issues

Improve CI

A couple of things we can do to improve reliability:

  1. Travis CI with CUDA

We can install CUDA in a sudo-enabled VM, so we should be able to test everything except driver detection.

ref travis-ci/apt-package-safelist#587 travis-ci/travis-ci#5911

  2. AppVeyor CI with CUDA

AppVeyor has Visual Studio installed and offers multiple build images (2013, 2015, 2017), so it should be well suited to monitoring the Visual Studio discovery logic.

As with Travis CI, we can install the toolkit and run the tests there.

Ref https://insight.io/github.com/caffe2/caffe2/blob/beed9061489d91c46b2e1eb9347c1505ff0e7f92/scripts/appveyor/install_cuda.bat https://github.com/willyd/appveyor-cuda-test/blob/master/build.cmd

Failure to find host compiler on Fedora

On Fedora 27 with the negativo17 driver, find_host_compiler() fails when it is given a toolkit version. The system compiler is GCC 7.3.1, but the repo offers a compiler named cuda-gcc for compatibility with the CUDA toolkit.

julia> using CUDAapi

julia> find_host_compiler()
("/usr/bin/gcc", v"7.3.1")

julia> find_host_compiler(v"9.1")
ERROR: Could not find a suitable GCC
Stacktrace:
 [1] find_host_compiler(::VersionNumber) at /home/xxxxxx/.julia/v0.6/CUDAapi/src/discovery.jl:361

CUDA discovery failing with 1.3.0-rc1

No problems up to v1.2.0, but in case it hasn't been flagged yet:

Precompiling CUDAapi under Julia v1.3.0-rc1 fails with a syntax error during CUDA discovery:

[ Info: Precompiling CUDAapi [3895d2a7-ec45-59b8-82bb-cfc6a382f9b3]
ERROR: LoadError: LoadError: syntax: suffix not allowed after `var"
                    push!(msvc_paths, msvc_path)
                end
            end
        end
        ## look in PATH as well
        let msvc_path = Sys.which("`
Stacktrace:
 [1] top-level scope at /home/[email protected]/.julia/packages/CUDAapi/NcPWp/src/discovery.jl:504
 [2] include at ./boot.jl:328 [inlined]
 [3] include_relative(::Module, ::String) at ./loading.jl:1105
 [4] include at ./Base.jl:31 [inlined]
 [5] include(::String) at /home/[email protected]/.julia/packages/CUDAapi/NcPWp/src/CUDAapi.jl:1
 [6] top-level scope at /home/[email protected]/.julia/packages/CUDAapi/NcPWp/src/CUDAapi.jl:14
 [7] include at ./boot.jl:328 [inlined]
 [8] include_relative(::Module, ::String) at ./loading.jl:1105
 [9] include(::Module, ::String) at ./Base.jl:31
 [10] top-level scope at none:2
 [11] eval at ./boot.jl:330 [inlined]
 [12] eval(::Expr) at ./client.jl:433
 [13] top-level scope at ./none:3
in expression starting at /home/[email protected]/.julia/packages/CUDAapi/NcPWp/src/discovery.jl:504
in expression starting at /home/[email protected]/.julia/packages/CUDAapi/NcPWp/src/CUDAapi.jl:14
ERROR: Failed to precompile CUDAapi [3895d2a7-ec45-59b8-82bb-cfc6a382f9b3] to /home/[email protected]/.julia/compiled/v1.3/CUDAapi/c7oFM_T78xm.ji.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1274
 [3] _require(::Base.PkgId) at ./loading.jl:1024
 [4] require(::Base.PkgId) at ./loading.jl:922
 [5] require(::Module, ::Symbol) at ./loading.jl:917

ERROR: Could not find a suitable GCC

I tried to install Knet on an Ubuntu 18.04 server and ran into this issue, which I now think is caused by CUDAapi failing to find a suitable GCC:

julia> using CUDAapi

julia> tk = CUDAapi.find_toolkit()
1-element Array{String,1}:
 "/usr/local/cuda-9.0"

julia> tc = CUDAapi.find_toolchain(tk)
ERROR: Could not find a suitable GCC
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] macro expansion at ./logging.jl:313 [inlined]
 [3] find_host_compiler(::VersionNumber) at /root/.julia/packages/CUDAapi/mUc5V/src/discovery.jl:359
 [4] find_toolchain(::Array{String,1}, ::VersionNumber) at /root/.julia/packages/CUDAapi/mUc5V/src/discovery.jl:494
 [5] find_toolchain(::Array{String,1}) at /root/.julia/packages/CUDAapi/mUc5V/src/discovery.jl:487
 [6] top-level scope at none:0

julia> CUDAapi.find_host_compiler()
("/usr/bin/gcc", v"7.3.0")

(v0.7) pkg> status
    Status `~/.julia/environments/v0.7/Project.toml`
  [3895d2a7] CUDAapi v0.5.0+ #master (https://github.com/JuliaGPU/CUDAapi.jl.git)

julia> versioninfo()
Julia Version 0.7.0
Commit a4cb80f3ed (2018-08-08 06:46 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, haswell)

In the shell I get

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176

and

gcc --version
gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Float64

Could there be a function telling whether the GPU supports Float64 computation?
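
A minimal sketch of such a check, assuming the device's compute capability is already available as a VersionNumber (for example from CUDAdrv). The name supports_float64 is hypothetical and not part of CUDAapi. It relies on the fact that NVIDIA GPUs support double precision from compute capability 1.3 onwards.

```julia
# Hypothetical helper, not part of CUDAapi: double-precision arithmetic
# is available on NVIDIA GPUs from compute capability 1.3 onwards.
supports_float64(capability::VersionNumber) = capability >= v"1.3"

supports_float64(v"1.2")  # false
supports_float64(v"7.5")  # true
```

Note that this only says whether Float64 is supported at all; it says nothing about how fast it is on a given device.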

Could not find nvToolsExt (libnvToolsExt.dylib.1.0 or libnvToolsExt.dylib.1) in /Users/imac/.julia/artifacts/b502baf54095dff4a69fd6aba8667124583f6929/lib

I noticed an odd library name when installing CUDA on an iMac.

I do have nvToolsExt, but under a different name:

-r-xr-xr-x  1 imac  staff      42704 10 10 14:47 libnvToolsExt.1.dylib
lrwxr-xr-x  1 imac  staff         21 10 10 14:47 libnvToolsExt.dylib -> libnvToolsExt.1.dylib

The file on disk is libnvToolsExt.1.dylib, which differs from the libnvToolsExt.dylib.1.0 that is being searched for.
I don't know why.
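
The mismatch looks like a naming convention difference: macOS puts the version before the extension (libfoo.1.dylib), while Linux puts it after (libfoo.so.1). A rough sketch of generating candidate names per platform follows; versioned_library_names is a hypothetical helper, not something CUDAapi provides.

```julia
# Hypothetical helper: build candidate file names for a versioned shared library.
# `os` must be :macos or :linux.
function versioned_library_names(name::AbstractString, versions; os::Symbol)
    os in (:macos, :linux) || throw(ArgumentError("unsupported os: $os"))
    names = String[]
    for v in versions
        if os === :macos
            push!(names, "lib$(name).$(v).dylib")  # version precedes the extension
        else
            push!(names, "lib$(name).so.$(v)")     # version follows the extension
        end
    end
    # unversioned name last, as a fallback
    push!(names, os === :macos ? "lib$(name).dylib" : "lib$(name).so")
    return names
end

versioned_library_names("nvToolsExt", ["1"]; os=:macos)
```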

ENV:
macOS High Sierra
10.13.6

CUDA:
/usr/local/cuda/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:14:47_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243

compatibility error

Testing with CUDA 9.0.176 and cl.exe 19.0.24210, I get a compiler incompatibility error, although these versions are compatible.

find_host_compiler was removed, but is still exported

Therefore, it still appears in the tab-completion list:

julia> using CUDAapi

julia> find_<tab><tab>

find_cuda_binary     find_host_compiler    find_libdevice        find_toolkit
find_cuda_library    find_libcudadevrt     find_toolchain        find_toolkit_version
julia> find_host_compiler
ERROR: UndefVarError: find_host_compiler not defined

Support non-english installation?

I have a Chinese version of Visual Studio installed, and CUDAapi.jl cannot recognize it:

julia> using CUDAapi
INFO: Recompiling stale cache file C:\Users\ylxdzsw\.julia\lib\v0.6\CUDAapi.ji for module CUDAapi.

julia> CUDAapi.find_host_compiler()
ERROR: MethodError: no method matching getindex(::Void, ::Int64)
Stacktrace:
 [1] find_host_compiler(::Void) at C:\Users\ylxdzsw\.julia\v0.6\CUDAapi\src\discovery.jl:409
 [2] find_host_compiler() at C:\Users\ylxdzsw\.julia\v0.6\CUDAapi\src\discovery.jl:322

I think the problem is in the regex match, where it searches for "Version":

# find MSVC versions
msvc_list = Dict{VersionNumber,String}()
for path in msvc_paths
    tmpfile = tempname() # TODO: do this with a pipe
    if !success(pipeline(`$path`, stdout=DevNull, stderr=tmpfile))
        warn("Could not execute $path")
        continue
    end
    ver_str = match(r"Version\s+(\d+(\.\d+)?(\.\d+)?)"i, read(tmpfile, String))[1]
    ver = VersionNumber(ver_str)
    msvc_list[ver] = path
end

but my cl.exe outputs something funnier:

$ cl.exe
用于 x64 的 Microsoft (R) C/C++ 优化编译器 19.14.26431 版
版权所有(C) Microsoft Corporation。保留所有权利。

用法: cl [ 选项... ] 文件名... [ /link 链接选项... ]

I would suggest using r"(\d+(\.\d+)(\.\d+)?)"i instead. I'm not sure how robust this will be.
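
A sketch of what a locale-independent parse could look like. parse_msvc_version is a hypothetical name, and the sketch assumes the first dotted number in the banner is the compiler version. It also returns nothing when there is no match instead of indexing into the failed match, which is what produced the MethodError above.

```julia
# Hypothetical helper: extract the compiler version from the cl.exe banner
# without relying on the (localized) word "Version".
# Assumption: the first dotted number in the banner is the compiler version.
function parse_msvc_version(banner::AbstractString)
    m = match(r"([0-9]+\.[0-9]+(?:\.[0-9]+)?)", banner)
    m === nothing && return nothing   # no version found; caller can skip this compiler
    return VersionNumber(m.captures[1])
end

parse_msvc_version("用于 x64 的 Microsoft (R) C/C++ 优化编译器 19.14.26431 版")
```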

Finding 8.0 files instead of 10.1 ones

I was getting some weird test errors in CuArrays. I think something in this package was causing the problem, because my ext.jl contained

const libcufft = "/usr/local/cuda-10.1/lib64/libcufft.so"
const libcublas = "/usr/local/cuda-8.0/targets/x86_64-linux/lib/libcublas.so"
const configured = true
const libcusparse = "/usr/local/cuda-10.1/lib64/libcusparse.so"
const libcusolver = "/usr/local/cuda-10.1/lib64/libcusolver.so"
const libcurand = "/usr/local/cuda-10.1/lib64/libcurand.so"
const libcudnn = "/usr/local/cuda-8.0/cudnn/lib64/libcudnn.so"

So it was using a mix of my 10.1 and 8.0 installs. Eventually I figured out that libcudnn and libcublas had moved to /usr/lib/x86_64-linux-gnu/libcublas.so and /usr/lib/x86_64-linux-gnu/libcudnn.so. After I added symlinks, ext.jl pointed strictly at /usr/local/cuda-10.1 and the CuArrays tests passed. So I think something here should change so that /usr/lib gets checked for cublas and cudnn first, or something like that. If not, it would be useful to have adding symlinks listed somewhere as a debugging step.
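
As a debugging aid, a mix like this can be spotted by checking which resolved libraries live outside the toolkit you expect. This is only a sketch: libraries_outside is a hypothetical helper, and it assumes Unix-style paths.

```julia
# Hypothetical check: list libraries that were resolved outside the expected toolkit root.
# Assumes Unix-style paths with "/" as separator.
function libraries_outside(root::AbstractString, libs::AbstractDict)
    prefix = endswith(root, "/") ? root : root * "/"
    return sort(String[name for (name, path) in libs if !startswith(path, prefix)])
end

libs = Dict(
    "cufft"  => "/usr/local/cuda-10.1/lib64/libcufft.so",
    "cublas" => "/usr/local/cuda-8.0/targets/x86_64-linux/lib/libcublas.so",
    "cudnn"  => "/usr/local/cuda-8.0/cudnn/lib64/libcudnn.so",
)
libraries_outside("/usr/local/cuda-10.1", libs)
```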

CUDAapi cannot find CUDA binary location.

I got this error when trying to build CUDAnative.

  Building LLVM ──────→ `~/.julia/packages/LLVM/FAUY/deps/build.log`
  Building CUDAdrv ───→ `~/.julia/packages/CUDAdrv/GyXD/deps/build.log`
  Building CUDAnative → `~/.julia/packages/CUDAnative/mXUk/deps/build.log`
┌ Error: Error building `CUDAnative`:
│ ┌ Debug: Looking for CUDA toolkit via environment variables
│ │   CUDA_HOME = "/sw/software/cuda/9.1/centos7.3_binary/"
│ └ @ CUDAapi CUDAapi.jl:15
│ ┌ Debug: Request to look for binary nvcc
│ │   locations =
│ │    1-element Array{String,1}:
│ │     "/sw/software/cuda/9.1/centos7.3_binary/"
│ └ @ CUDAapi CUDAapi.jl:15
│ ┌ Debug: Looking for binary nvcc
│ │   locations =
│ │    25-element Array{String,1}:
│ │     "/sw/software/cuda/9.1/centos7.3_binary/"
│ │     "/sw/software/cuda/9.1/centos7.3_binary/bin"
│ │     "/sw/software/cuda/9.1/centos7.3_binary/bin"
│ │     "/usr/local/bin"
│ │     ⋮
│ │     "/d/home/xiaoqihu/cuda/bin"
│ │     "/d/home/xiaoqihu/bin"
│ │     "/d/home/xiaoqihu/cuda/bin"
│ └ @ CUDAapi CUDAapi.jl:15
│ ┌ Debug: Found binary nvcc at /sw/software/cuda/9.1/centos7.3_binary
│ └ @ CUDAapi discovery.jl:126
│ ERROR: LoadError: could not spawn `/sw/software/cuda/9.1/centos7.3_binary/nvcc --version`: no such file or directory (ENOENT)
│ Stacktrace:
│  [1] _jl_spawn(::String, ::Array{String,1}, ::Cmd, ::Tuple{Base.DevNullStream,Base.PipeEndpoint,RawFD}) at ./process.jl:370
│  [2] (::getfield(Base, Symbol("##495#496")){Cmd})(::Tuple{Base.DevNullStream,Base.PipeEndpoint,RawFD}) at ./process.jl:512
│  [3] setup_stdio(::getfield(Base, Symbol("##495#496")){Cmd}, ::Tuple{Base.DevNullStream,Pipe,IOStream}) at ./process.jl:493
│  [4] #_spawn#494(::Nothing, ::Function, ::Cmd, ::Tuple{Base.DevNullStream,Pipe,IOStream}) at ./process.jl:511
│  [5] _spawn(::Cmd, ::Tuple{Base.DevNullStream,Pipe,IOStream}) at ./process.jl:507
│  [6] #open#504(::Bool, ::Bool, ::Function, ::Cmd, ::Base.DevNullStream) at ./process.jl:601
│  [7] open at ./process.jl:591 [inlined]
│  [8] open(::Cmd, ::String, ::Base.DevNullStream) at ./process.jl:572
│  [9] read(::Cmd) at ./process.jl:646
│  [10] read(::Cmd, ::Type{String}) at ./process.jl:652
│  [11] find_toolkit_version(::Array{String,1}) at /d/home/xiaoqihu/.julia/packages/CUDAapi/g08Z/src/discovery.jl:259
│  [12] main() at /d/home/xiaoqihu/.julia/packages/CUDAnative/mXUk/deps/build.jl:114
│  [13] top-level scope at none:0
│  [14] include at ./boot.jl:317 [inlined]
│  [15] include_relative(::Module, ::String) at ./loading.jl:1075
│  [16] include(::Module, ::String) at ./sysimg.jl:29
│  [17] include(::String) at ./client.jl:393
│  [18] top-level scope at none:0
│ in expression starting at /d/home/xiaoqihu/.julia/packages/CUDAnative/mXUk/deps/build.jl:156
└ @ Pkg.Operations Operations.jl:973

Here is my version info. I am using 0.7-beta, but I get the same error with 0.6.2:

julia> versioninfo()
Julia Version 0.7.0-beta.0
Commit f41b1ecaec (2018-06-24 01:32 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
Environment:
  JULIA_DEBUG = CUDAapi

find_cuda_library("cudnn",tk) returns nothing

I'm having problems reinstalling Knet after a fresh install of Julia 1.4, so I rechecked the installation instructions. I checked my CUDA libraries as usual, but find_cuda_library doesn't return anything.

tk=find_toolkit()
1-element Array{String,1}:
 "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v10.1"

find_cuda_library("cudnn",tk)

I double-checked my Path under Environment Variables as well as the toolkit folders. The DLLs are there: cudnn64_7.dll, cudart64_101.dll and cudart32_101.dll, among others.
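
The DLL names above embed the word size and a version suffix. Whether find_cuda_library tries such names is not something this report shows, so the following is only a sketch of how candidate names could be built; windows_library_names is a hypothetical helper.

```julia
# Hypothetical helper: candidate DLL names for a CUDA library on Windows,
# following the pattern seen in cudnn64_7.dll and cudart64_101.dll.
function windows_library_names(name::AbstractString, versions; word_size::Int=Sys.WORD_SIZE)
    names = String[]
    for v in versions
        push!(names, "$(name)$(word_size)_$(v).dll")
    end
    # fall back to names without a version suffix
    push!(names, "$(name)$(word_size).dll", "$(name).dll")
    return names
end

windows_library_names("cudnn", [7]; word_size=64)
```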

Find clang on OSX

The following call in discovery.jl is no longer supported:
L438: clang_path = find_binary("clang")

It should be replaced with
clang_path = find_binary(["clang"])

Unable to find cudnn

As discussed here, CUDAapi was unable to find libcudnn.so. Attached is the output of the build log for CuArrays with JULIA_DEBUG=CUDAapi in the environment.

cat ~/.julia/packages/CuArrays/f4Eke/deps/build.log 
┌ Debug: Request to look for binary nvcc
│   locations = 0-element Array{String,1}
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Looking for binary nvcc
│   locations =
│    10-element Array{String,1}:
│     "/home/jacobr/code/julia-1.0.3/bin"
│     "/home/jacobr/code/cmake/bin"
│     "/home/jacobr/miniconda3/bin"
│     "/home/jacobr/code/cmake/bin"
│     ⋮
│     "/usr/local/sbin"
│     "/usr/sbin"
│     "/home/jacobr/bin"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Request to look for library cudart
│   locations = 0-element Array{String,1}
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Looking for library libcudart
│   locations = 0-element Array{String,1}
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Found library libcudart at /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcudart.so
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/discovery.jl:82
┌ Debug: Looking for CUDA toolkit via CUDA runtime library
│   path = "/usr/local/cuda-9.0/targets/x86_64-linux/lib/libcudart.so"
│   dir = "/usr/local/cuda-9.0/targets/x86_64-linux"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Looking for CUDA toolkit via default installation directories
│   dirs =
│    1-element Array{Any,1}:
│     "/usr/local/cuda-9.0"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Found CUDA toolkit at /usr/local/cuda-9.0/targets/x86_64-linux, /usr/local/cuda-9.0
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/discovery.jl:260
┌ Debug: Request to look for library cublas
│   locations =
│    2-element Array{String,1}:
│     "/usr/local/cuda-9.0/targets/x86_64-linux"
│     "/usr/local/cuda-9.0"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Looking for library libcublas
│   locations =
│    6-element Array{String,1}:
│     "/usr/local/cuda-9.0/targets/x86_64-linux"
│     "/usr/local/cuda-9.0/targets/x86_64-linux/lib"
│     "/usr/local/cuda-9.0/targets/x86_64-linux/lib64"
│     "/usr/local/cuda-9.0"
│     "/usr/local/cuda-9.0/lib"
│     "/usr/local/cuda-9.0/lib64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Found library /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcublas at /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcublas.so
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/discovery.jl:82
┌ Debug: Request to look for library cusolver
│   locations =
│    2-element Array{String,1}:
│     "/usr/local/cuda-9.0/targets/x86_64-linux"
│     "/usr/local/cuda-9.0"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Looking for library libcusolver
│   locations =
│    6-element Array{String,1}:
│     "/usr/local/cuda-9.0/targets/x86_64-linux"
│     "/usr/local/cuda-9.0/targets/x86_64-linux/lib"
│     "/usr/local/cuda-9.0/targets/x86_64-linux/lib64"
│     "/usr/local/cuda-9.0"
│     "/usr/local/cuda-9.0/lib"
│     "/usr/local/cuda-9.0/lib64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Found library /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcusolver at /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcusolver.so
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/discovery.jl:82
┌ Debug: Request to look for library cufft
│   locations =
│    2-element Array{String,1}:
│     "/usr/local/cuda-9.0/targets/x86_64-linux"
│     "/usr/local/cuda-9.0"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Looking for library libcufft
│   locations =
│    6-element Array{String,1}:
│     "/usr/local/cuda-9.0/targets/x86_64-linux"
│     "/usr/local/cuda-9.0/targets/x86_64-linux/lib"
│     "/usr/local/cuda-9.0/targets/x86_64-linux/lib64"
│     "/usr/local/cuda-9.0"
│     "/usr/local/cuda-9.0/lib"
│     "/usr/local/cuda-9.0/lib64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Found library /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcufft at /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcufft.so
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/discovery.jl:82
┌ Debug: Request to look for library curand
│   locations =
│    2-element Array{String,1}:
│     "/usr/local/cuda-9.0/targets/x86_64-linux"
│     "/usr/local/cuda-9.0"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Looking for library libcurand
│   locations =
│    6-element Array{String,1}:
│     "/usr/local/cuda-9.0/targets/x86_64-linux"
│     "/usr/local/cuda-9.0/targets/x86_64-linux/lib"
│     "/usr/local/cuda-9.0/targets/x86_64-linux/lib64"
│     "/usr/local/cuda-9.0"
│     "/usr/local/cuda-9.0/lib"
│     "/usr/local/cuda-9.0/lib64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Found library /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcurand at /usr/local/cuda-9.0/targets/x86_64-linux/lib/libcurand.so
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/discovery.jl:82
┌ Debug: Request to look for library cudnn
│   locations =
│    2-element Array{String,1}:
│     "/usr/local/cuda-9.0/targets/x86_64-linux"
│     "/usr/local/cuda-9.0"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Debug: Looking for library libcudnn
│   locations =
│    6-element Array{String,1}:
│     "/usr/local/cuda-9.0/targets/x86_64-linux"
│     "/usr/local/cuda-9.0/targets/x86_64-linux/lib"
│     "/usr/local/cuda-9.0/targets/x86_64-linux/lib64"
│     "/usr/local/cuda-9.0"
│     "/usr/local/cuda-9.0/lib"
│     "/usr/local/cuda-9.0/lib64"
└ @ CUDAapi ~/.julia/packages/CUDAapi/ITC5q/src/CUDAapi.jl:8
┌ Warning: could not find cudnn, its functionality will be unavailable
└ @ Main ~/.julia/packages/CuArrays/f4Eke/deps/build.jl:29

I tried to add the location of the libcudnn.so to the LD_LIBRARY_PATH (as per this issue) (see code block below).

04:40:22 ~$ env | grep -i LD_LIBRARY_PATH
LD_LIBRARY_PATH=/usr/lib64
04:40:28 ~$ ls /usr/lib64 | grep -i cudnn
libcudnn.so.7
libcudnn.so.7.2.1

But this results in the same build output as when LD_LIBRARY_PATH is not set.

Please let me know if you need additional information or for me to test anything, I'll be happy to help.
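
Two things stand out in the output above: /usr/lib64 is not among the searched locations, and it contains only versioned names (libcudnn.so.7, libcudnn.so.7.2.1) with no plain libcudnn.so. Whether either of these is the actual cause is an assumption. As a sketch, a search that also accepts versioned names could look like this; find_versioned_library is a hypothetical helper and uses Linux naming only.

```julia
# Hypothetical helper: look for lib<name>.so or any versioned variant
# (lib<name>.so.7, lib<name>.so.7.2.1, ...) in the given directories.
function find_versioned_library(name::AbstractString, dirs)
    base = "lib$(name).so"
    for dir in dirs
        isdir(dir) || continue
        for file in sort(readdir(dir))
            if file == base || startswith(file, base * ".")
                return joinpath(dir, file)
            end
        end
    end
    return nothing   # not found in any directory
end
```

For the listing above, find_versioned_library("cudnn", ["/usr/lib64"]) would pick up libcudnn.so.7.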

Sort paths from CUDAapi.find_toolkit() in the descending order?

I had the same problem as #85:

julia> using CUDAapi

julia> CUDAapi.find_toolkit()
6-element Array{String,1}:
 "/usr/local/cuda-10.0/targets/x86_64-linux"
 "/usr/local/cuda-8.0"
 "/usr/local/cuda-9.0"
 "/usr/local/cuda-9.1"
 "/usr/local/cuda-10.0"
 "/usr/local/cuda-10.1"

julia> ENV["CUDA_HOME"] = "/usr/local/cuda-10.1"
"/usr/local/cuda-10.1"

julia> CUDAapi.find_toolkit()
1-element Array{String,1}:
 "/usr/local/cuda-10.1"

julia> using CUDAnative

julia> using CuArrays

Although I could work around this by setting CUDA_HOME, it would be nice if CUDAapi.find_toolkit were a bit smarter and returned the versions supported by CUDAnative first.
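
A sketch of the descending sort from the title, assuming the version can be read from a cuda-X.Y component in the path. Both function names are hypothetical. Note that this only puts the newest toolkit first; it does not know which versions CUDAnative supports.

```julia
# Hypothetical helper: extract the version from a path like /usr/local/cuda-10.1.
# Paths without a recognizable version get v"0" and therefore sort last.
function toolkit_path_version(path::AbstractString)
    m = match(r"cuda-([0-9]+\.[0-9]+)", path)
    return m === nothing ? v"0" : VersionNumber(m.captures[1])
end

# Newest toolkit first.
sort_toolkits(paths) = sort(paths; by=toolkit_path_version, rev=true)

paths = [
    "/usr/local/cuda-10.0/targets/x86_64-linux",
    "/usr/local/cuda-8.0",
    "/usr/local/cuda-9.0",
    "/usr/local/cuda-9.1",
    "/usr/local/cuda-10.0",
    "/usr/local/cuda-10.1",
]
sort_toolkits(paths)
```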

more windows fixes

Tested this on:

  • CUDA 9.0.176, cl.exe 19.00.24210 (VS 2015) ok
  • CUDA 9.1.85, cl.exe 19.00.24210 (VS 2015) ok
  • CUDA 9.1.85, cl.exe 19.12.25831 (VS 2017) could not test, CUDA 9.1 supports up to cl.exe 19.11
  • vswhere does not work with lightweight build tools, must install complete VS (or not rely on vswhere)

Not sure about the interface change. CUDAdrv etc will all be broken. Since this package is called CUDAapi, what is the point of adding cuda to function names?

discovery.jl fixes that I needed to get it working:

  • L29: needs: "$(name)$(word_size)",name]) -- otherwise find_cuda_driver does not work, which lives in Windows/System32/nvcuda.dll.
  • L34 needs: all_names = sort(unique(all_names), rev=true) -- otherwise when multiple versions are present, we don't get the latest.
  • L310 needs: (otherwise I get methoderror from get())
 program_files = get(ENV, Sys.WORD_SIZE == 64 ? "ProgramFiles(x86)" : "ProgramFiles",
                            Sys.WORD_SIZE == 64 ? "C:\\Program Files (x86)" : "C:\\Program Files")
  • L328 needs: isempty(msvc_paths) && error("No Visual Studio installation found"), I think this one is just a typo.

Latest Amazon AWS machine image: ami-263e1643, Knet171216win at Ohio (us-east-2) running on a p2.xlarge instance.

Wrong toolkit directory filtering

We currently collect a set of candidate toolkit directories and filter them based on whether each directory exists. This isn't valid: on some systems, discovered toolkit directories don't actually contain the necessary tools.

We should filter better: check for the existence of files and tools, e.g. nvcc, libdevice and libcudart. However, how do we deal with systems that spread files all over the place? IIRC, Debian places tools like nvcc in /usr/bin, but libdevice is still in /usr/share. Maybe we should return a list of directories that contain relevant files, but then we might pick up multiple, conflicting toolkits.
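
A minimal sketch of the stricter filter, assuming the conventional bin/nvcc layout inside a toolkit directory. toolkits_with_nvcc is a hypothetical name, and the sketch deliberately ignores the Debian-style split layout mentioned above.

```julia
# Hypothetical filter: keep only candidate directories that contain bin/nvcc.
# Assumption: the toolkit uses the conventional layout with a bin/ subdirectory.
function toolkits_with_nvcc(dirs)
    nvcc = Sys.iswindows() ? "nvcc.exe" : "nvcc"
    return filter(dir -> isfile(joinpath(dir, "bin", nvcc)), dirs)
end
```

The same pattern could be extended with checks for libdevice and libcudart, at the cost of rejecting systems that spread those files across directories.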

cc @cfoket

Discovery fails if CUDA env vars are set to empty variable

If I have CUDA_PATH in my environment but set as an empty variable, CUDA discovery will fail.

$ env | grep CUDA
CUDA_PATH=

In this case, find_toolkit() will return an empty list. This will then fail in find_toolkit_version() when building the CUDAnative package:

┌ Error: Error building `CUDAnative`:
│ ERROR: LoadError: CUDA toolkit at  doesn't contain nvcc

The problem is that the code that checks if these env vars exist will return immediately even if the env vars are empty:

https://github.com/JuliaGPU/CUDAapi.jl/blob/v1.0.1/src/discovery.jl#L230-L241

So this code block should check if the environment variables have a valid value, and if not, pass on to the remainder of the function to check the nvcc path and the default system paths.
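
Something along these lines, as a sketch rather than a patch against the linked code. toolkit_dirs_from_env is a hypothetical name, and the variable list only contains the two names mentioned in these issues.

```julia
# Hypothetical helper: collect toolkit directories from environment variables,
# treating unset and empty variables the same way.
function toolkit_dirs_from_env(env=ENV; names=["CUDA_PATH", "CUDA_HOME"])
    dirs = String[]
    for name in names
        value = get(env, name, "")
        isempty(value) || push!(dirs, value)   # skip unset and empty variables
    end
    return unique(dirs)
end

toolkit_dirs_from_env(Dict("CUDA_PATH" => ""))   # empty, so discovery can fall through
```

When the result is empty, the caller can carry on with the nvcc lookup and the default system paths instead of returning early.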

Resolving symlink to ccache breaks version string parsing

On my Fedora 26 and 27 systems, find_host_compiler fails. Having ccache installed seems to be related.
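
The find_binary output in the session below suggests that the gcc symlink is being resolved to the ccache binary. ccache decides which compiler to wrap from the name it was invoked as, so the resolved path no longer behaves like gcc. A sketch of one way around that, with a hypothetical compiler_path helper:

```julia
# Hypothetical helper: resolve symlinks, except when the target is ccache.
# ccache picks the compiler to wrap from the name it is invoked as,
# so the unresolved path has to be kept in that case.
function compiler_path(found::AbstractString)
    resolved = realpath(found)
    return basename(resolved) == "ccache" ? String(found) : resolved
end
```

Matching on the basename is a heuristic; other compiler wrappers would need the same treatment.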

$ ./julia 
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.6.4 (2018-07-09 19:09 UTC)
 _/ |\__'_|_|_|\__'_|  |  
|__/                   |  x86_64-redhat-linux

julia> using CUDAapi

julia> Pkg.status("CUDAapi")
 - CUDAapi                       0.4.3

julia> find_host_compiler()
WARNING: Could not parse GCC version info ("ccache version 3.3.6"), skipping this compiler.
ERROR: Could not find a suitable GCC
Stacktrace:
 [1] find_host_compiler(::Void) at /home/rick/.julia/v0.6/CUDAapi/src/discovery.jl:361
 [2] find_host_compiler() at /home/rick/.julia/v0.6/CUDAapi/src/discovery.jl:322

julia> find_binary(["gcc"])
"/usr/lib64/ccache/../../bin/ccache"

julia> find_binary(["gcc"], locations=["/bin"])
"/bin/gcc"

Support CUDA 9.1 on Windows

Maybe there's something wrong with how CI (hackishly) installs CUDA, but it looks like CUDA 9.1 isn't detected properly.

Tests fail to find GCC on v0.6.2, CUDA 9.1.85, driver 390.30

It can't find GCC for some reason, even though it is in the path.

julia> Pkg.test("CUDAapi")
INFO: Testing CUDAapi
ERROR: LoadError: Could not find a suitable GCC
Stacktrace:
 [1] find_host_compiler(::VersionNumber) at /home/dss/.julia/v0.6/CUDAapi/src/discovery.jl:365
 [2] include_from_node1(::String) at ./loading.jl:576
 [3] include(::String) at ./sysimg.jl:14
 [4] process_options(::Base.JLOptions) at ./client.jl:305
 [5] _start() at ./client.jl:371
while loading /home/dss/.julia/v0.6/CUDAapi/test/runtests.jl, in expression starting on line 32
==========================================================[ ERROR: CUDAapi ]==========================================================

failed process: Process(`/home/dss/Downloads/julia-d386e40c17/bin/julia -Cx86-64 -J/home/dss/Downloads/julia-d386e40c17/lib/julia/sys.so --compile=yes --depwarn=yes --check-bounds=yes --code-coverage=none --color=yes --compilecache=yes /home/dss/.julia/v0.6/CUDAapi/test/runtests.jl`, ProcessExited(1)) [1]

======================================================================================================================================
ERROR: CUDAapi had test errors

shell> nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85

shell> nvidia-smi | grep Version
| NVIDIA-SMI 390.30                 Driver Version: 390.30                    |

shell> gcc --version
gcc (Ubuntu 7.2.0-8ubuntu3) 7.2.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

CUDA driver/toolkit compatibility

CUDAnative currently warns if the CUDA version as reported by the driver does not match what we get from NVCC. This is often a false positive, but these incompatibilities do apparently matter (until today, I hadn't run into issues with only CUDAnative). For example, cuBLAS fails to initialize if the driver is too old for the current toolkit:

$ nvidia-smi | grep Version
| NVIDIA-SMI 384.111                Driver Version: 384.111                   |
$ julia -e "using CUDAdrv; @show CUDAdrv.version()"                                                                                                                                                                      
CUDAdrv.version() = v"9.0.0"
$ julia -e 'println(ccall((:cublasCreate_v2, "/opt/cuda-9.0/lib64/libcublas.so"), Cuint, (Ptr{Ptr{Void}},), Ref{Ptr{Void}}()))'
0
$ julia -e 'println(ccall((:cublasCreate_v2, "/opt/cuda-9.1/lib64/libcublas.so"), Cuint, (Ptr{Ptr{Void}},), Ref{Ptr{Void}}()))'
1

As far as I know, these requirements aren't well documented. This post has a list:

CUDA 9.1: 387.xx
CUDA 9.0: 384.xx
CUDA 8.0: 375.xx (GA2)
CUDA 8.0: 367.4x
CUDA 7.5: 352.xx
CUDA 7.0: 346.xx
CUDA 6.5: 340.xx
CUDA 6.0: 331.xx
CUDA 5.5: 319.xx
CUDA 5.0: 304.xx
CUDA 4.2: 295.41
CUDA 4.1: 285.05.33
CUDA 4.0: 270.41.19
CUDA 3.2: 260.19.26
CUDA 3.1: 256.40
CUDA 3.0: 195.36.15

Maybe we can just trust what the driver reports. Instead of warning when the versions don't match, should we check whether the driver's version is equal to or higher than the toolkit version as reported by nvcc?
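
A sketch of that check, comparing only major.minor. driver_supports_toolkit is a hypothetical name; the two versions would come from the driver (as CUDAdrv.version() reports above) and from nvcc.

```julia
# Hypothetical check: the driver is considered sufficient when the CUDA version
# it reports is at least the toolkit version, compared on major.minor only.
function driver_supports_toolkit(driver::VersionNumber, toolkit::VersionNumber)
    return (driver.major, driver.minor) >= (toolkit.major, toolkit.minor)
end

driver_supports_toolkit(v"9.0.0", v"9.0.176")  # true
driver_supports_toolkit(v"9.0.0", v"9.1.85")   # false
```

This matches the cuBLAS behaviour shown above, where the 9.0 driver initializes the 9.0 library but not the 9.1 one.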

Cannot find cuDNN that was installed via anaconda (without sudo)

I'm using a machine without sudo access, so I've installed cuDNN via anaconda (which doesn't require sudo). CUDAdrv and CuArrays detect the GPU and run; however, CUDAapi doesn't detect cuDNN, which prevents Knet from using the GPU.

> using CUDAapi
> @show find_cuda_library("cudnn")
find_cuda_library("cudnn") = nothing
> using Knet; include(Knet.dir("test/gpu.jl"))
...
cannot find cudnn

could not load library: "C:\Users\USER\.julia\artifacts\~~~\bin\cudnn64_7.dll"

The CUDA toolkit and cuDNN are installed, and the environment variables are set.

However, initializing Flux or testing CUDAdrv, CUDAnative and CuArrays still fails because cudnn64_7.dll cannot be loaded:

using CUDAdrv, CUDAnative, CuArrays
Pkg.test(["CUDAdrv", "CUDAnative", "CuArrays"])

[ Info: Testing using device GeForce GTX 1650 SUPER (compute capability 7.5.0, 3.233 GiB available memory) on CUDA driver 11.0.0 and toolkit 10.2.89
multi dim, sliced setindex: Error During Test at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\test\testsuite\indexing.jl:51
  Got exception outside of a @test
  could not load library "C:\Users\USER\.julia\artifacts\d0bdf3cb548b47e0ea9808e88a130b8e4a924257\bin\cudnn64_7.dll"
  The specified module could not be found.

  Stacktrace:
   [1] #dlopen#3(::Bool, ::typeof(Libdl.dlopen), ::String, ::UInt32) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\Libdl\src\Libdl.jl:109
   [2] dlopen at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\Libdl\src\Libdl.jl:109 [inlined] (repeats 2 times)
   [3] use_artifact_cudnn(::VersionNumber) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\bindeps.jl:187
   [4] use_artifact_cuda() at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\bindeps.jl:131
   [5] __configure_dependencies__(::Bool) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\bindeps.jl:235
   [6] __configure__(::Bool) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\CuArrays.jl:117
   [7] (::CuArrays.var"#1#2"{Bool})() at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\CuArrays.jl:45
   [8] lock(::CuArrays.var"#1#2"{Bool}, ::ReentrantLock) at .\lock.jl:151
   [9] _functional(::Bool) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\CuArrays.jl:43
   [10] functional at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\CuArrays.jl:36 [inlined]
   [11] macro expansion at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\CuArrays.jl:63 [inlined]
   [12] libcurand at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\bindeps.jl:37 [inlined]
   [13] (::CuArrays.CURAND.var"#13122#cache_fptr!#11")() at C:\Users\USER\.julia\packages\CUDAapi\XuSHC\src\call.jl:31
   [14] curandCreateGenerator(::Base.RefValue{Ptr{Nothing}}, ::CuArrays.CURAND.curandRngType) at C:\Users\USER\.julia\packages\CUDAapi\XuSHC\src\call.jl:39
   [15] CuArrays.CURAND.RNG(::CuArrays.CURAND.curandRngType) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\rand\random.jl:23
   [16] RNG at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\rand\random.jl:22 [inlined]
   [17] #123 at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\rand\CURAND.jl:34 [inlined]
   [18] get!(::CuArrays.CURAND.var"#123#124", ::IdDict{Any,Any}, ::Any) at .\abstractdict.jl:661
   [19] generator() at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\rand\CURAND.jl:33
   [20] rand!(::CuArray{Float32,4,Nothing}) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\rand\random.jl:166
   [21] macro expansion at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\test\testsuite\indexing.jl:54 [inlined]
   [22] macro expansion at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\Test\src\Test.jl:1107 [inlined]
   [23] macro expansion at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\test\testsuite\indexing.jl:52 [inlined]
   [24] macro expansion at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\src\host\indexing.jl:63 [inlined]
   [25] macro expansion at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\test\testsuite\indexing.jl:41 [inlined]
   [26] macro expansion at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\Test\src\Test.jl:1107 [inlined]
   [27] test_indexing(::Type) at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\test\testsuite\indexing.jl:3
   [28] test(::Type{CuArray}) at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\test\testsuite.jl:51
   [29] top-level scope at C:\Users\USER\.julia\packages\CuArrays\e8PLr\test\runtests.jl:49
   [30] top-level scope at 
.
.
.
.
.
.

copyto! for triangular: Error During Test at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\test\testsuite\linalg.jl:14
  Got exception outside of a @test
  could not load library "C:\Users\USER\.julia\artifacts\d0bdf3cb548b47e0ea9808e88a130b8e4a924257\bin\cudnn64_7.dll"
  The specified module could not be found.
  
  Stacktrace:
   [1] #dlopen#3(::Bool, ::typeof(Libdl.dlopen), ::String, ::UInt32) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\Libdl\src\Libdl.jl:109
   [2] dlopen at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\Libdl\src\Libdl.jl:109 [inlined] (repeats 2 times)
   [3] use_artifact_cudnn(::VersionNumber) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\bindeps.jl:187
   [4] use_artifact_cuda() at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\bindeps.jl:131
   [5] __configure_dependencies__(::Bool) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\bindeps.jl:235
   [6] __configure__(::Bool) at C:\Users\USER\.julia\packages\CuArrays\e8PLr\src\CuArrays.jl:117
   [7] (::CuArrays.var"#1#2"{Bool})() at 
Float32 gemm C := adjoint(A) * transpose(B) * a + C * b: Error During Test at C:\Users\USER\.julia\packages\GPUArrays\QDGmr\test\testsuite\linalg.jl:83
  Test threw exception
  Expression: compare(mul!, AT, C, f(A), g(B), Ref(T(4)), Ref(T(5)))
  could not load library "C:\Users\USER\.julia\artifacts\d0bdf3cb548b47e0ea9808e88a130b8e4a924257\bin\cudnn64_7.dll"
  The specified module could not be found.

My Pkg.status():

  [fbb218c0] BSON v0.2.5
  [336ed68f] CSV v0.6.0
  [c5f51814] CUDAdrv v6.2.2
  [be33ccc6] CUDAnative v3.0.4
  [3a865a2d] CuArrays v2.0.1
  [1b08a953] Dash v0.1.0 #master (https://github.com/plotly/Dash.jl.git)
  [a93c6f00] DataFrames v0.20.2
  [1313f7d8] DataFramesMeta v0.5.0
  [587475ba] Flux v0.10.4
  [38e38edf] GLM v1.3.9
  [cd3eb016] HTTP v0.8.13
  [09f84164] HypothesisTests v0.9.2
  [7073ff75] IJulia v1.21.1
  [add582a8] MLJ v0.10.3
  [91a5bcdd] Plots v0.29.9

My versioninfo():

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: AMD Ryzen 5 3600X 6-Core Processor             
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, znver1)

Fix hardcoded CUDA versions to folder name autodetection?

On my machine, CUDAapi could not identify the CUDA libraries because the new CUDA toolkit version 10.1 is absent from this list:

const cuda_versions = Dict(
    "toolkit"   => [v"1.0", v"1.1",
                    v"2.0", v"2.1", v"2.2",
                    v"3.0", v"3.1", v"3.2",
                    v"4.0", v"4.1", v"4.2",
                    v"5.0", v"5.5",
                    v"6.0", v"6.5",
                    v"7.0", v"7.5",
                    v"8.0",
                    v"9.0", v"9.1", v"9.2",
                    v"10.0"],
    "cudnn"     => [v"1.0",
                    v"2.0",
                    v"3.0",
                    v"4.0",
                    v"5.0", v"5.1",
                    v"6.0",
                    v"7.0", v"7.1", v"7.3", v"7.4"]
)

Maybe it would be simpler to autodetect library versions, e.g. from the folder name or from a direct library call, instead of manually adding every new version to this list?
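
A minimal sketch of the folder-name idea, assuming install directories carry the version in their name (for example `v10.1` or `cuda-10.1`). `version_from_dirname` is a hypothetical helper, not part of CUDAapi, and the example paths are illustrative only. It is written for Julia 1.x.

```julia
# Hypothetical helper (not part of CUDAapi): guess a toolkit version from the
# last component of an install path, e.g. ".../CUDA/v10.1" or "/usr/local/cuda-10.1".
function version_from_dirname(path::AbstractString)
    # Split on both separators so Windows-style paths work on any host OS.
    parts = split(path, r"[/\\]"; keepempty=false)
    isempty(parts) && return nothing
    # Look for a "major.minor" pattern in the final directory name only.
    m = match(r"(\d+)\.(\d+)", last(parts))
    m === nothing && return nothing   # no version in the name: let the caller fall back
    return VersionNumber(parse(Int, m.captures[1]), parse(Int, m.captures[2]))
end

# Illustrative paths; real install locations may differ.
println(version_from_dirname("/usr/local/cuda-10.1"))
println(version_from_dirname("C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v10.1"))
println(version_from_dirname("/usr/local/cuda"))
```

Returning `nothing` instead of throwing leaves room for a second strategy, such as querying the library directly, when the directory name carries no version.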

macOS CI for CUDA 10+

🍺  /usr/local/Cellar/gnu-tar/1.32: 15 files, 1.7MB, built in 1 minute 57 seconds
+sudo gtar -x --skip-old-files -f CUDAMacOSXInstaller/CUDAMacOSXInstaller.app/Contents/Resources/payload/cuda_mac_installer_tk.tar.gz -C /
gtar: CUDAMacOSXInstaller/CUDAMacOSXInstaller.app/Contents/Resources/payload/cuda_mac_installer_tk.tar.gz: Cannot open: No such file or directory
gtar: Error is not recoverable: exiting now

Cannot parse MSVC version string

I'm using CUDAapi on JuliaPro 0.6.4.1, and it does not seem to work correctly.

julia> Pkg.test("CUDAapi")
INFO: Testing CUDAapi
ERROR: LoadError: MethodError: no method matching getindex(::Void, ::Int64)
Stacktrace:
 [1] find_host_compiler(::Void) at E:\Develop\IDEs\JuliaPro\pkgs-0.6.4.1\v0.6\CUDAapi\src\discovery.jl:407
 [2] find_host_compiler() at E:\Develop\IDEs\JuliaPro\pkgs-0.6.4.1\v0.6\CUDAapi\src\discovery.jl:321
 [3] include_from_node1(::String) at .\loading.jl:576
 [4] include(::String) at .\sysimg.jl:14
 [5] process_options(::Base.JLOptions) at .\client.jl:305
 [6] _start() at .\client.jl:371
while loading E:\Develop\IDEs\JuliaPro\pkgs-0.6.4.1\v0.6\CUDAapi\test\runtests.jl, in expression starting on line 32
=========================================[ ERROR: CUDAapi ]=========================================

failed process: Process(`'E:\Develop\IDEs\JuliaPro\Julia-0.6.4\bin\julia.exe' -Cx86-64 '-JE:\Develop\IDEs\JuliaPro\Julia-0.6.4\lib\julia\sys.dll' --compile=yes --depwarn=yes --check-bounds=yes --code-coverage=none --color=yes --compilecache=yes 'E:\Develop\IDEs\JuliaPro\pkgs-0.6.4.1\v0.6\CUDAapi\test\runtests.jl'`, ProcessExited(1)) [1]

====================================================================================================
ERROR: CUDAapi had test errors

BTW, I run the vcvars64.bat provided by Microsoft directly on the CMD command line.

julia> ENV["PATH"]
E:\\Develop\\IDEs\\JuliaPro\\Julia-0.6.4\\bin;C:\\Program Files (x86)\\MSBuild\\14.0\\bin\\amd64;C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\BIN\\amd64; ……

shell> cl test.cpp
……
test.cpp
Microsoft (R) Incremental Linker Version 14.00.24225.1
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:test.exe
test.obj

shell> test

julia>

What should I do? @maleadt @musm
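
For context, `getindex(::Void, ::Int64)` is the error you get when a regex `match` returns nothing and the result is indexed anyway. Below is a minimal sketch of a guarded parse, assuming the version appears after the word "Version" in the tool banner. `parse_banner_version` is a hypothetical helper, not CUDAapi's actual code, and it uses Julia 1.x syntax (`nothing` rather than `Void`). The sample banner is the linker line quoted above.

```julia
# Hypothetical helper (not CUDAapi's implementation): extract a version from a
# tool banner and return `nothing` instead of failing when the pattern is absent.
function parse_banner_version(banner::AbstractString)
    m = match(r"Version\s+(\d+(?:\.\d+)*)", banner)
    m === nothing && return nothing          # guard: no match, so nothing to index
    parts = parse.(Int, split(m.captures[1], '.'))
    # VersionNumber takes major, minor, patch; pad short versions with zeros
    # and ignore any components beyond the third.
    while length(parts) < 3
        push!(parts, 0)
    end
    return VersionNumber(parts[1], parts[2], parts[3])
end

banner = "Microsoft (R) Incremental Linker Version 14.00.24225.1"
println(parse_banner_version(banner))
println(parse_banner_version("banner without the expected keyword"))
```

With a guard like this, an unexpected banner format surfaces as a clear "could not parse" condition that the caller can report, rather than a `MethodError` deep inside discovery.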

read(xxx,String) method error

I am getting undefined-method errors from the read(xxx, String) calls in find_host_compiler. Are you testing this on 0.6.1?

ERROR: MethodError: no method matching read(::Cmd, ::Type{String})
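
One possible workaround, sketched under the assumption that the older `readstring(cmd)` spelling is available on Julia 0.6 while `read(cmd, String)` is the spelling on 0.7 and later. `read_output` is a hypothetical wrapper, not something CUDAapi defines, and the `echo` command is only a stand-in for the real compiler invocation.

```julia
# Hypothetical compatibility wrapper (not part of CUDAapi): capture a command's
# stdout as a String on both old and new Julia versions.
@static if VERSION < v"0.7"
    read_output(cmd::Cmd) = readstring(cmd)     # assumed Julia 0.6 spelling
else
    read_output(cmd::Cmd) = read(cmd, String)   # Julia 0.7+ spelling
end

# `echo` stands in for a compiler call such as a version query.
println(strip(read_output(`echo hello`)))
```

Because `@static` picks the branch when the code is parsed, the unused spelling is never looked up on the running Julia version.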
