benchmark_spmv_using_csr5's People

Contributors

devilinchina, janecker, weifengliu-ssslab

benchmark_spmv_using_csr5's Issues

Check... NO PASS! #Error on GPU k80c and Cuda 8.0

Hi,
I am trying to run CSR5 on a K80 machine, using CUDA 8.0 with the compiler flag compute_60. It appears to run, but it does not pass the check. Could you please have a look at this?

./spmv ../../../dataset/bcsstk33.mtx

PRECISION = 32-bit Single Precision

--------------../../../dataset/bcsstk33.mtx--------------
( 8738, 8738 ) nnz = 591904
cpu sequential time = 1.32787 ms. Bandwidth = 5.40169 GB/s. GFlops = 0.891508 GFlops.

Device [0] Tesla K80, @ 823.5MHz.
omega = 32, sigma = 32.
CSR->CSR5 malloc time = 0.566336 ms.
CSR->CSR5 tile_ptr time = 0.024704 ms.
CSR->CSR5 tile_desc time = 0.043552 ms.
CSR->CSR5 transpose time = 0.019552 ms.
CSR->CSR5 time = 0.851136 ms.
CSR5-based SpMV time = 0.00701181 ms. Bandwidth = 1022.95 GB/s. GFlops = 168.831 GFlops.
Check... NO PASS! #Error = 8738 out of 8738 entries.

function "__shfl_down(double, unsigned int, int)" has already been defined

Here is one more issue when building CSR5 with CUDA:

nvcc -O3  -w -m64 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_52,code=compute_52 main.cu -o spmv -I/usr/local/cuda-6.5/include -I/home/oberhuber/NVIDIA_CUDA-10.1_Samples/common/inc -L/usr/local/cuda-6.5/lib64 -lcudart -D VALUE_TYPE=float -D NUM_RUN=1000
detail/cuda/utils_cuda.h(71): error: function "__shfl_down(double, unsigned int, int)" has already been defined

detail/cuda/utils_cuda.h(80): error: function "__shfl_up(double, unsigned int, int)" has already been defined

detail/cuda/utils_cuda.h(89): error: function "__shfl_xor(double, int, int)" has already been defined

detail/cuda/utils_cuda.h(340): error: function "atomicAdd(double *, double)" has already been defined

4 errors detected in the compilation of "/tmp/tmpxft_0036955f_00000000-4_main.cpp4.ii".

Tomas O.

No instance of overloaded function "atomicAdd"

Dear developers,

I am having problems building the CUDA CSR5 benchmark in the CSR5_cuda subfolder. There seems to be a problem with atomicAdd for double on architecture 5.2.

nvcc -O3  -w -m64 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_52,code=compute_52 main.cu -o spmv -I/usr/local/cuda-6.5/include -I/home/oberhuber/NVIDIA_CUDA-10.1_Samples/common/inc -L/usr/local/cuda-6.5/lib64 -lcudart -D VALUE_TYPE=double -D NUM_RUN=1000
detail/cuda/csr5_spmv_cuda.h(352): error: no instance of overloaded function "atomicAdd" matches the argument list
            argument types are: (double *, double)
          detected during:
            instantiation of "void spmv_csr5_calibrate_kernel(const uiT *, const vT *, vT *, iT) [with iT=int, uiT=unsigned int, vT=double]" 

Best regards, Tomas.
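
For context: CUDA provides a native atomicAdd(double *, double) only on devices of compute capability 6.0 and higher, so a build for compute_52 needs an emulation. The usual workaround is a compare-and-swap loop over the 64-bit bit pattern, and the CUDA C++ Programming Guide shows one built on atomicCAS. The sketch below is not the CSR5 code and not device code. It is a host-only C++ illustration of the same compare-and-swap technique using std::atomic, and the names `to_bits`, `from_bits` and `cas_atomic_add` are hypothetical.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Reinterpret a double as its raw 64-bit pattern and back.
inline std::uint64_t to_bits(double v) {
    std::uint64_t b;
    std::memcpy(&b, &v, sizeof b);
    return b;
}

inline double from_bits(std::uint64_t b) {
    double v;
    std::memcpy(&v, &b, sizeof v);
    return v;
}

// Atomically adds val to the double stored (as bits) in cell and returns
// the previous value, mirroring the return convention of atomicAdd.
inline double cas_atomic_add(std::atomic<std::uint64_t>& cell, double val) {
    std::uint64_t old_bits = cell.load();
    for (;;) {
        const double old_val = from_bits(old_bits);
        const std::uint64_t new_bits = to_bits(old_val + val);
        // On failure, compare_exchange_weak reloads old_bits with the
        // current contents, so the sum is recomputed on the next pass.
        if (cell.compare_exchange_weak(old_bits, new_bits)) {
            return old_val;
        }
    }
}
```

A device version would keep the same loop but operate on an `unsigned long long` address with atomicCAS, and would need to be guarded so that it is only compiled where the built-in overload is absent.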

Result not pass

Hi, thanks for your awesome work. I am very interested in the CSR5 format.

However, when I tested the code (AVX2 version) in this repository, the program reported that the result is incorrect. I tested the code with webbase-1M.mtx and some other matrices downloaded from the SuiteSparse Matrix Collection. This is the output:

------------------------------------------------------
PRECISION = 64-bit Double Precision
------------------------------------------------------
--------------../webbase-1M/webbase-1M.mtx--------------
 ( 1000005, 1000005 ) nnz = 3105536
cpu sequential time = 10.758 ms. Bandwidth = 6.8889 GB/s. GFlops = 0.577344 GFlops.

omega = 4, sigma = 16. #partition = 48524
CSR->CSR5 malloc time = 4.077000 ms
CSR->CSR5 tile_ptr time = 29.580000 ms
CSR->CSR5 tile_desc time = 9.116000 ms
CSR->CSR5 transpose time = 5.595000 ms
omega = 4, sigma = 16. #partition = 48524
CSR->CSR5 malloc time = 0.144000 ms
CSR->CSR5 tile_ptr time = 1.335000 ms
CSR->CSR5 tile_desc time = 4.970000 ms
CSR->CSR5 transpose time = 3.740000 ms
omega = 4, sigma = 16. #partition = 48524
CSR->CSR5 malloc time = 0.126000 ms
CSR->CSR5 tile_ptr time = 1.243000 ms
CSR->CSR5 tile_desc time = 4.799000 ms
CSR->CSR5 transpose time = 3.132000 ms
omega = 4, sigma = 16. #partition = 48524
CSR->CSR5 malloc time = 0.122000 ms
CSR->CSR5 tile_ptr time = 1.229000 ms
CSR->CSR5 tile_desc time = 4.324000 ms
CSR->CSR5 transpose time = 3.812000 ms
omega = 4, sigma = 16. #partition = 48524
CSR->CSR5 malloc time = 0.109000 ms
CSR->CSR5 tile_ptr time = 1.020000 ms
CSR->CSR5 tile_desc time = 4.126000 ms
CSR->CSR5 transpose time = 2.930000 ms
omega = 4, sigma = 16. #partition = 48524
CSR->CSR5 malloc time = 0.107000 ms
CSR->CSR5 tile_ptr time = 0.926000 ms
CSR->CSR5 tile_desc time = 3.673000 ms
CSR->CSR5 transpose time = 3.280000 ms
CSR->CSR5 time = 8.053 ms.
CSR5-based SpMV time = 1.08296 ms. Bandwidth = 68.4337 GB/s. GFlops = 5.73529 GFlops.
Check... NO PASS! #Error = 3701 out of 1000005 entries.
------------------------------------------------------

The platform where I ran the code is a CentOS 7 server with an Intel Xeon E5-2680 v4, and the program was built with icc (ICC) 19.1.2.254 20200623 (shipped with Intel System Studio 2020 Update 2). I am wondering whether this is caused by the compiler.

Hope to get some hints.
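
One way to narrow this down, offered as a sketch rather than a diagnosis: a vectorized SpMV accumulates partial sums in a different order than the sequential CSR loop, so in floating point the two results can differ in the last bits even when both are correct. How the benchmark's own check compares entries is not shown in the log, so the function below is an independent, hypothetical helper (`count_mismatches` and its tolerance parameters are not part of CSR5). Re-counting the 3701 reported errors with a relative tolerance would help separate rounding differences from genuinely wrong entries.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Counts entries of y that differ from the reference y_ref by more than
// abs_tol + rel_tol * max(|y_ref[i]|, |y[i]|). Assumes both vectors have
// the same length. NaN entries are counted as mismatches.
inline std::size_t count_mismatches(const std::vector<double>& y_ref,
                                    const std::vector<double>& y,
                                    double rel_tol,
                                    double abs_tol) {
    std::size_t errors = 0;
    for (std::size_t i = 0; i < y_ref.size(); ++i) {
        const double diff = std::fabs(y_ref[i] - y[i]);
        const double scale = std::fmax(std::fabs(y_ref[i]), std::fabs(y[i]));
        if (!(diff <= abs_tol + rel_tol * scale)) {
            ++errors;
        }
    }
    return errors;
}
```

If the count drops to zero with a small tolerance such as `rel_tol = 1e-10`, the mismatches are most likely due to summation order (possibly combined with compiler floating-point settings). If it stays in the thousands, the entries are actually wrong.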

AVX version

Hi,
I'm quite interested in testing your CSR5 format, but unfortunately I do not have an AVX2 CPU.
Do you have an AVX version of csr5_spmv_avx2.h that you could send me?

Best regards,
Andreas
