cusp-library's Issues

#include <cusp/blas.h> breaks gcc compilation

What steps will reproduce the problem?
1. Consider a simple test case cusp_test.cpp:

#include <cusp/blas.h>

int main()
{
  return 0;
}

2. compile with

g++ -I /path/to/cusp -I /path/to/thrust -I /usr/local/cuda/include cusp_test.cpp

What is the expected output? What do you see instead?

compiler error:
In file included from /path/to/cusp/cusp/blas.h:199,
                 from cusp_test.cpp:1:
/path/to/cusp/cusp/detail/blas.inl: In function 'typename thrust::iterator_value<Iterator>::type cusp::blas::nrm2(InputIterator, InputIterator)':
/path/to/cusp/cusp/detail/blas.inl:395: error: 'sqrt' is not a member of 'std'

What version of the product are you using? On what operating system?

Tested on:

thrust 1150:9f5c19852f16
cusp 259:9fdf9bde9f6d

Ubuntu 8.04 2.6.24-24-server #1 SMP Wed Apr 15 15:41:09 UTC 2009 x86_64 GNU/Linux
Ubuntu 9.10 2.6.31-22-generic #61-Ubuntu SMP Wed Jul 28 02:02:56 UTC 2010 x86_64 GNU/Linux

g++-4.4 (Ubuntu 4.4.1-4ubuntu9) 4.4.1
g++-4.3 (Ubuntu 4.3.4-5ubuntu1) 4.3.4
g++-4.2 (GCC) 4.2.4 (Ubuntu 4.2.4-5ubuntu1)
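
The diagnostic points at a call to std::sqrt in a translation unit where <cmath> has not been included. A minimal sketch of that pattern and the usual remedy follows, assuming the header simply lacks the include; `norm2` is a hypothetical stand-in, not the Cusp implementation.

```cpp
#include <cmath>    // declares std::sqrt; without this include the call below may fail to compile
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2-norm routine (not Cusp's nrm2).
template <typename T>
T norm2(const std::vector<T>& v)
{
    T sum = T(0);
    for (std::size_t i = 0; i < v.size(); ++i)
        sum += v[i] * v[i];
    return std::sqrt(sum);
}
```

With nvcc the CUDA headers tend to pull in the math declarations indirectly, which may be why the problem only shows up with plain g++.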

Original issue reported on code.google.com by [email protected] on 12 Aug 2010 at 12:21

enable polymorphism in monitors and linear_operators

Based on the comments by Florian in this thread [1] on cusp-users it seems we 
should allow polymorphic usage of Cusp monitors and linear_operators.

[1] http://groups.google.com/group/cusp-users/browse_thread/thread/73dbba85ee83efb9

Original issue reported on code.google.com by wnbell on 26 Jul 2010 at 10:12

ell_to_csr conversion rounds ValueType to IndexType

Here's a quick patch:

Index: cusp/detail/host/conversion.h
===================================================================
--- cusp/detail/host/conversion.h   (revision 76)
+++ cusp/detail/host/conversion.h   (working copy)
@@ -328,7 +328,7 @@
         for(IndexType n = 0; n < num_entries_per_row; n++)
         {
             const IndexType j = src.column_indices(i,n);
-            const IndexType v = src.values(i,n);
+            const ValueType v = src.values(i,n);
             if(j != invalid_index)
             {
                 dst.column_indices[num_entries] = j;
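
To make the effect of the one-line change concrete, here is a small sketch of the bug pattern, using hypothetical helper names rather than Cusp code. Storing a floating-point value in an integer-typed temporary discards the fractional part (truncation toward zero in C++).

```cpp
// Hypothetical illustration of the bug: the value passes through an integer temporary.
template <typename IndexType, typename ValueType>
ValueType copy_through_index_type(ValueType x)
{
    const IndexType v = static_cast<IndexType>(x);  // fractional part is lost here
    return static_cast<ValueType>(v);
}

// The patched pattern: the temporary has the value's own type.
template <typename ValueType>
ValueType copy_through_value_type(ValueType x)
{
    const ValueType v = x;
    return v;
}
```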

Original issue reported on code.google.com by egastal on 15 Nov 2009 at 4:50

create cusp::permutation_matrix

A permutation_matrix would store an array of integer indices encoding a 
permutation of [0,num_rows). Specializing transpose(permutation_matrix,...) 
and multiply(permutation_matrix, ...) for matrix-vector and matrix-matrix 
multiplication would have a noticeable performance advantage.

Questions:
Can permutation_matrix be implemented more concisely as a coo_matrix_view?
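
A rough host-side sketch of the idea, under the assumption that perm[i] holds the column of the single nonzero in row i. The type and function bodies are hypothetical and only illustrate why the specializations are cheap: multiply becomes a gather and transpose becomes an inverse permutation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: perm[i] is the column of the single 1 in row i.
struct permutation_matrix_sketch
{
    std::vector<std::size_t> perm;
};

// y = P * x reduces to a gather: y[i] = x[perm[i]].
template <typename T>
std::vector<T> multiply(const permutation_matrix_sketch& P, const std::vector<T>& x)
{
    std::vector<T> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        y[i] = x[P.perm[i]];
    return y;
}

// The transpose of a permutation matrix is the inverse permutation.
inline permutation_matrix_sketch transpose(const permutation_matrix_sketch& P)
{
    permutation_matrix_sketch Pt;
    Pt.perm.resize(P.perm.size());
    for (std::size_t i = 0; i < P.perm.size(); ++i)
        Pt.perm[P.perm[i]] = i;
    return Pt;
}
```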

Original issue reported on code.google.com by wnbell on 11 Oct 2010 at 4:31

Complex unit test fails with ambiguous operator+

The current fix for 'volatile complex' does not appear to work on MSVC.

On WinXP, MSVC 2005, nvcc 3.2.
$ cd testing; scons
.
.
.
complex.cu(333) : error C2593: 'operator +' is ambiguous
        c:/documents and settings/mgarland/my documents/work/cusp-library\cusp/complex.h(547): could be 'cusp::complex<float> cusp::operator +<float>(volatile cusp::complex<float> &,volatile cusp::complex<float> &)' [found using argument-dependent lookup]
        c:/documents and settings/mgarland/my documents/work/cusp-library\cusp/complex.h(533): or       'cusp::complex<float> cusp::operator +<float>(const cusp::complex<float> &,const cusp::complex<float> &)' [found using argument-dependent lookup]
        while trying to match the argument list '(cusp::complex<float>, cusp::complex<float>)'
        complex.cu(456) : see reference to function template instantiation 'cusp::complex<float> test_complex_compilation_entry<ValueType>(void)' being compiled
        with
        [
            ValueType=float
        ]
        complex.cu(469) : see reference to function template instantiation 'void TestComplex<float>(void)' being compiled
complex.cu(333) : error C2593: 'operator +' is ambiguous
        c:/documents and settings/mgarland/my documents/work/cusp-library\cusp/complex.h(547): could be 'cusp::complex<double> cusp::operator +<double>(volatile cusp::complex<double> &,volatile cusp::complex<double> &)' [found using argument-dependent lookup]
        c:/documents and settings/mgarland/my documents/work/cusp-library\cusp/complex.h(533): or       'cusp::complex<double> cusp::operator +<double>(const cusp::complex<double> &,const cusp::complex<double> &)' [found using argument-dependent lookup]
        while trying to match the argument list '(cusp::complex<double>, cusp::complex<double>)'
        complex.cu(456) : see reference to function template instantiation 'cusp::complex<double> test_complex_compilation_entry<ValueType>(void)' being compiled
        with
        [
            ValueType=double
        ]
        complex.cu(469) : see reference to function template instantiation 'void TestComplex<double>(void)' being compiled
scons: *** [complex.obj] Error 2
scons: building terminated because of errors.

Original issue reported on code.google.com by [email protected] on 16 Nov 2010 at 4:41

make array2d support non-trivial stride

Specifically, array2d and array2d_view should not assume that the stride is 
equal to the minor dimension.
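
As an illustration of what a non-trivial stride means for indexing, here is a hypothetical row-major view with an explicit pitch (not the Cusp array2d; all names are assumptions). The only change from the dense case is that the row offset uses the pitch rather than the number of columns.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical row-major 2D view with an explicit pitch.
template <typename T>
struct strided_view_2d
{
    T*          data;
    std::size_t num_rows;
    std::size_t num_cols;
    std::size_t pitch;   // elements between the starts of consecutive rows, pitch >= num_cols

    T& operator()(std::size_t i, std::size_t j) const
    {
        return data[i * pitch + j];   // not i * num_cols + j
    }
};
```

A view like this can address a sub-block of a larger array, or rows padded for alignment, without copying.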

Original issue reported on code.google.com by wnbell on 11 Oct 2010 at 3:09

device::spmm_coo requires a huge amount of memory

What steps will reproduce the problem?
1. Apply the smoothed_aggregation preconditioner to a big enough matrix.

What is the expected output? What do you see instead?
The solver fails with std::bad_alloc because the GPU memory is exhausted by 
cusp::detail::device::spmm_coo.


What version of the product are you using? On what operating system?
cusp v0.1.1
thrust v1.3.0
nvidia cuda 3.0
windows xp professional x64 edition

Please provide any additional information below.
I think the problem is general, but if you need it, I can upload my particular 
matrix. It has only 8M nonzeros and fails on a GTX285 card, which has 1GB of RAM.

I wonder if the spmm algorithm could be optimized or avoided in 
smoothed_aggregation by keeping the original matrices, let's say A and B, and 
computing y=A*(B*x) instead of y=(A*B)*x?
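
The suggestion above can be sketched on the host as follows. The CSR container and function names are hypothetical, not Cusp types; the point is only that applying the factors one after another needs a single temporary vector and never forms A*B.

```cpp
#include <cstddef>
#include <vector>

// Minimal CSR container for the sketch (hypothetical, not cusp::csr_matrix).
struct csr_sketch
{
    std::size_t              num_rows;
    std::vector<std::size_t> row_offsets;     // size num_rows + 1
    std::vector<std::size_t> column_indices;  // size nnz
    std::vector<double>      values;          // size nnz
};

// y = A * x
inline std::vector<double> spmv(const csr_sketch& A, const std::vector<double>& x)
{
    std::vector<double> y(A.num_rows, 0.0);
    for (std::size_t i = 0; i < A.num_rows; ++i)
        for (std::size_t k = A.row_offsets[i]; k < A.row_offsets[i + 1]; ++k)
            y[i] += A.values[k] * x[A.column_indices[k]];
    return y;
}

// y = A * (B * x): two SpMVs and one temporary vector, no sparse product.
inline std::vector<double> apply_product(const csr_sketch& A,
                                         const csr_sketch& B,
                                         const std::vector<double>& x)
{
    return spmv(A, spmv(B, x));
}
```

The trade-off is that every application then costs two SpMVs, so whether this helps depends on how often the product is applied compared with how expensive it is to form.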

Thanks!

Original issue reported on code.google.com by [email protected] on 11 Oct 2010 at 9:22

Triangular Backsubstitution

Is there an easy way to implement an efficient parallel sparse triangular
matrix backsubstitution step using cusp/thrust in the library? This would
enable the utilization of more elaborate preconditioners like SSOR or
custom incomplete Cholesky / LU to be easily integrated into the krylov pcg
solver.
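
For reference, a sequential back substitution for an upper-triangular CSR matrix might look like the sketch below. It is a plain host-side baseline with hypothetical names, not a parallel algorithm and not part of Cusp, and it assumes every row stores a nonzero diagonal entry.

```cpp
#include <cstddef>
#include <vector>

// Solve U x = b for upper-triangular U stored in CSR.
// Assumes each row contains its (nonzero) diagonal entry.
inline std::vector<double> back_substitute(std::size_t n,
                                           const std::vector<std::size_t>& row_offsets,
                                           const std::vector<std::size_t>& column_indices,
                                           const std::vector<double>& values,
                                           const std::vector<double>& b)
{
    std::vector<double> x(n, 0.0);
    for (std::size_t i = n; i-- > 0; )          // rows from last to first
    {
        double sum  = b[i];
        double diag = 0.0;
        for (std::size_t k = row_offsets[i]; k < row_offsets[i + 1]; ++k)
        {
            const std::size_t j = column_indices[k];
            if (j == i)     diag = values[k];
            else if (j > i) sum -= values[k] * x[j];   // x[j] is already known
        }
        x[i] = sum / diag;
    }
    return x;
}
```

Row i depends on every later row it references, which is what makes the solve hard to parallelize; parallel variants commonly group mutually independent rows into levels and process one level at a time.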

Original issue reported on code.google.com by [email protected] on 19 Feb 2010 at 3:47

bug in testing/complex.cu ?

Hello,

At line 368 in testing/complex.cu, you test whether the architecture supports 
double precision, i.e. whether its compute capability is at least 1.3, but the 
test seems to give the wrong answer for a 2.0 device, for example.

Thanks a lot for your great job!
Luc.
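
The line in question is not quoted here, so the following is only a guess at the failure mode: comparing the major and minor numbers separately (for instance major >= 1 && minor >= 3) rejects 2.0. A sketch of a comparison that treats the pair as a version number, with a hypothetical function name:

```cpp
// True when compute capability major.minor is at least 1.3,
// the first capability with double-precision support.
inline bool supports_double(int major, int minor)
{
    return major > 1 || (major == 1 && minor >= 3);
}
```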


Original issue reported on code.google.com by [email protected] on 19 Oct 2010 at 11:44

exploit sorted rows in COO to CSR conversion

Currently the host code assumes the row entries are unsorted and first
computes a histogram, etc.
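
One way to exploit sorted rows, sketched with hypothetical names: when the COO row indices are already sorted, each CSR row offset is the position of the first entry whose row index is at least that row, which a binary search finds directly without a histogram pass.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Build CSR row offsets from COO row indices that are already sorted.
// row_offsets[i] is the position of the first entry with row index >= i.
inline std::vector<std::size_t>
sorted_rows_to_offsets(const std::vector<std::size_t>& row_indices, std::size_t num_rows)
{
    std::vector<std::size_t> row_offsets(num_rows + 1);
    for (std::size_t i = 0; i <= num_rows; ++i)
    {
        row_offsets[i] = static_cast<std::size_t>(
            std::lower_bound(row_indices.begin(), row_indices.end(), i) - row_indices.begin());
    }
    return row_offsets;
}
```

The column indices and values can then be copied unchanged. Each search is independent of the others, so the loop also parallelizes naturally.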

Original issue reported on code.google.com by wnbell on 26 Mar 2010 at 10:06

convergence_monitor should handle multiple calls to .finished() per iteration

In light of issue #27 it seems we'll need to make convergence_monitor [1] 
handle multiple calls to .finished() per iteration.  Possible remedies are

1) Record all residual norms and use iteration_count() to compute the 
convergence rates.
2) Record only one residual norm per iteration
3) Record both the iteration number and norm for all calls to .finished()

For #2 we could keep the first, last or minimum norm by default and support 
general BinaryOperators so any behavior can be handled.

My initial preference is for #2.  The problem with #1 is that the 
immediate_rate() computation is no longer straightforward.  The last option 
keeps the most information but is probably not what most people want.  Option 
#2 also has the nice property that (for reasonable BinaryOperators) any 
redundant calls to finished() won't affect the results.

So, I like #2, but I haven't given much thought to how monitors will work with 
multiple RHS or handle methods like GMRES.

Thoughts?

[1] http://code.google.com/p/cusp-library/source/browse/cusp/monitor.h#256
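
A sketch of what option 2 could look like, with every name hypothetical and not the Cusp convergence_monitor. One norm is stored per iteration, and repeated calls for the same iteration are folded in with a user-supplied BinaryOperator.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch of option 2: one residual norm per iteration.
template <typename Real, typename BinaryOperator>
class per_iteration_history
{
public:
    explicit per_iteration_history(BinaryOperator op) : op_(op) {}

    // Intended to be called from finished(); may run several times per iteration.
    void record(std::size_t iteration, Real norm)
    {
        if (iteration >= norms_.size())
            norms_.resize(iteration + 1, norm);                  // first call for this iteration
        else
            norms_[iteration] = op_(norms_[iteration], norm);    // repeated call: combine
    }

    const std::vector<Real>& norms() const { return norms_; }

private:
    BinaryOperator    op_;
    std::vector<Real> norms_;
};

// Example policies for the first, last or minimum norm.
struct keep_first   { template <typename T> T operator()(const T& a, const T&)   const { return a; } };
struct keep_last    { template <typename T> T operator()(const T&,   const T& b) const { return b; } };
struct keep_minimum { template <typename T> T operator()(const T& a, const T& b) const { return b < a ? b : a; } };
```

With keep_first or keep_minimum, a redundant call that repeats an earlier norm leaves the stored value unchanged, which matches the property described above.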

Original issue reported on code.google.com by wnbell on 14 Nov 2010 at 7:29

ell_matrix.h is missing cusp::minimum_space

Example MatrixFormats/ell.cu doesn't currently compile:

./cusp/ell_matrix.h(206): error: namespace "cusp" has no member "minimum_space"

Adding #include <cusp/memory.h> to cusp/ell_matrix.h seems to fix the problem, 
but I don't know if there's something deeper going on.

Original issue reported on code.google.com by [email protected] on 11 Nov 2010 at 1:32

add support for matrix addition and subtraction

The most general form might allow users to specify whether to filter out 
explicit zeros (defaulting to true).

Original issue reported on code.google.com by wnbell on 26 Jul 2010 at 11:18

silence compiler warnings

In particular
  warning: comparison between signed and unsigned integer expressions

It may be best to use size_t for all the matrix dimensions.
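
For illustration, the warning typically comes from comparing a signed loop index with an unsigned size, and goes away once both sides are unsigned. The function below is a hypothetical example, not Cusp code.

```cpp
#include <cstddef>
#include <vector>

// This form triggers the warning with GCC's -Wsign-compare,
// because 'i' is signed and v.size() is unsigned:
//   for (int i = 0; i < v.size(); ++i) ...

// Using std::size_t for the dimension and the index keeps both sides unsigned.
inline double sum_entries(const std::vector<double>& v)
{
    const std::size_t n = v.size();
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        total += v[i];
    return total;
}
```

One caveat with unsigned dimensions is that expressions such as n - 1 wrap around when n is zero, so reverse loops and differences need extra care.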

Original issue reported on code.google.com by wnbell on 9 Nov 2010 at 5:42

support for complex values

following the implementation of std::complex<T>

Specializations for float and double should use float2 and double2 for 
coalescing.

Original issue reported on code.google.com by wnbell on 31 Jul 2010 at 6:12

modulate vector size in CSR (vector)

We should choose THREADS_PER_VECTOR=2,4,8,16,32 based on num_entries/num_rows

http://code.google.com/p/cusp-library/source/browse/trunk/cusp/detail/device/spmv/csr_vector.h
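
One possible heuristic, offered as an assumption rather than the rule Cusp uses: pick the smallest power of two in {2,4,8,16,32} that covers the average number of entries per row. The function name is hypothetical.

```cpp
#include <cstddef>

// Smallest power of two in [2, 32] that is >= the average row length.
inline unsigned int choose_threads_per_vector(std::size_t num_entries, std::size_t num_rows)
{
    if (num_rows == 0)
        return 2;
    const std::size_t avg = num_entries / num_rows;   // integer average entries per row
    unsigned int threads = 2;
    while (threads < 32 && threads < avg)
        threads *= 2;
    return threads;
}
```

Since THREADS_PER_VECTOR is a compile-time constant, the returned value would presumably select among pre-instantiated kernels rather than being passed to the kernel directly.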

Original issue reported on code.google.com by wnbell on 3 Apr 2010 at 6:39

Consider data alignment in CSR (vector) kernel

The CSR (vector) kernel does not consider alignment when accessing the
matrix arrays.  Matrices with large row sizes are likely to see a
significant benefit.

Original issue reported on code.google.com by wnbell on 10 Dec 2009 at 6:34

bicgstab can't solve Ix = b

Cusp's bicgstab can't solve the system Ix = b.  Instead of converging in one 
step, it yields NaNs.

For example, take "examples/Solvers/bicgstab.cu"

    http://code.google.com/p/cusp-library/source/browse/examples/Solvers/bicgstab.cu

Change line 32,

-    cusp::krylov::bicgstab(A, x, b, monitor, M);
+    cusp::krylov::bicgstab(M, x, b, monitor, M);

The solver fails to converge.

Original issue reported on code.google.com by [email protected] on 16 Aug 2010 at 12:33

OpenMP support

Provide an OpenMP backend like Thrust's

Original issue reported on code.google.com by wnbell on 6 Aug 2010 at 9:56

MatrixMarket file reader fails on OSX

Appears to be a problem with std::getline()

http://lists.apple.com/archives/cocoa-dev/2009/Sep/msg01199.html
http://stackoverflow.com/questions/2234557/c-using-getline-prints-pointer-being-freed-was-not-allocated-in-xcode

Reported in cusp-user thread:
http://groups.google.com/group/cusp-users/browse_thread/thread/6aa484f775da2070?hl=en

Original issue reported on code.google.com by wnbell on 8 May 2010 at 10:08

TestComplexDouble fails for arch=sm_10

$ cd testing
$ scons        # arch=sm_10 is the default
$ ./tester     # Tesla C2050, CUDA 3.1, Linux & Windows

================================================================
FAILURE: TestComplexDouble
[complex.cu:452] values are not approximately equal: 1.41894 5.38788e-315 [type='double']
================================================================

This test should probably just be disabled when arch=sm_10, since sm_13 is 
required for double support.  The code appears to be trying to do that, but it 
is unsuccessful.

Original issue reported on code.google.com by [email protected] on 26 Oct 2010 at 4:15

Missing arch.h

cusp/detail/device/arch.h is missing after revision 57fd568deacf which requires 
it.

Original issue reported on code.google.com by [email protected] on 4 Oct 2010 at 6:55

Compile errors

I get the following errors when compiling PETSc with the latest Thrust and Cusp:

knepley@Matthew-Knepleys-MacBook-Air:/PETSc3/petsc/petsc-dev/src/mat/impls/aij/seq/seqcuda$ make PETSC_DIR=/PETSc3/petsc/petsc-dev PETSC_ARCH=darwin-cuda
make PETSC_DIR=/PETSc3/petsc/petsc-dev PETSC_ARCH=darwin-cuda
nvcc -m64  -c --compiler-options="-PIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3   -I/PETSc3/petsc/petsc-dev/darwin-cuda/include -I/PETSc3/petsc/petsc-dev/include -I/usr/local/cuda/include -I/usr/local/include -I/PETSc3/multicore/cusp/ -I/PETSc3/multicore/thrust/ -I/PETSc3/petsc/petsc-dev/include/mpiuni      -D__INSDIR__=src/mat/impls/aij/seq/seqcuda/" aijcuda.cu
/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(37): error: expected an identifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(45): error: expected an identifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(51): error: expected an identifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(54): error: expected an identifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(54): error: expected an identifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(58): error: expected an identifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(63): error: expected an identifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(63): error: expected an identifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(75): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(75): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(75): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(76): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(76): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(76): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(77): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(77): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(77): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(78): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(78): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(78): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(79): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(79): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(79): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(80): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(80): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(80): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(81): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(81): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(81): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(82): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(82): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(82): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(83): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(83): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(83): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(84): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(84): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(84): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(85): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(85): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(85): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(86): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(86): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/vector_types.h(86): error: expected either a definition or a tag name

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(124): error: "DecodeDigits" is not a function or static data member

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(125): error: expected a ";"

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(130): warning: this pragma must immediately precede a statement

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(184): warning: this pragma must immediately precede a statement

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(197): warning: parsing restarts here after previous syntax error

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(198): error: expected a declaration

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(199): error: expected a declaration

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(200): error: expected a declaration

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(401): error: expected an identifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(401): error: expected a type specifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(455): error: expected an identifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(455): error: expected a type specifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(501): error: expected an identifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(501): error: expected a type specifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(518): error: expected an identifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(518): error: expected a type specifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(549): error: "SwapAndScatterSm13" is not a function or static data member

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(550): error: expected a ";"

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(565): warning: this pragma must immediately precede a statement

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(568): warning: this pragma must immediately precede a statement

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(577): warning: this pragma must immediately precede a statement

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(579): warning: parsing restarts here after previous syntax error

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(580): error: identifier "K" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(580): error: identifier "UNGUARDED_IO" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(580): error: identifier "PASSES_PER_CYCLE" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(580): error: identifier "SETS_PER_PASS" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(580): error: identifier "PostprocessFunctor" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(580): error: expected a declaration

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(583): error: expected a declaration

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(600): warning: this pragma must immediately precede a statement

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(605): warning: parsing restarts here after previous syntax error

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(666): error: "SwapAndScatterPairs" is not a function or static data member

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(667): error: expected a ";"

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(814): warning: parsing restarts here after previous syntax error

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(815): error: expected a declaration

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(822): error: expected a declaration

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(847): warning: this pragma must immediately precede a statement

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(863): warning: this pragma must immediately precede a statement

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(866): warning: this pragma must immediately precede a statement

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(875): warning: parsing restarts here after previous syntax error

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(878): error: identifier "WarpScan" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(878): error: identifier "RADIX_DIGITS" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(878): error: expected an identifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(879): error: identifier "digit_scan" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(880): error: identifier "inclusive_total" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(881): error: expected a type specifier

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(884): error: this operator is not allowed in an integral constant expression

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(884): error: identifier "my_carry" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(887): error: expected a declaration

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(889): warning: this pragma must immediately precede a statement

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(894): warning: parsing restarts here after previous syntax error

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(896): error: explicit type is missing ("int" assumed)

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(896): error: cannot overload functions distinguished by return type alone

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(902): error: UpdateRanks is not a template

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(902): error: identifier "RADIX_DIGITS" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(902): error: identifier "PASSES_PER_CYCLE" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(902): error: identifier "SETS_PER_PASS" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(902): error: expected a ")"

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(902): error: expected a ";"

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(911): error: SwapAndScatterSm10 is not a template

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(911): error: identifier "K" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(911): error: identifier "V" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(911): error: identifier "UNGUARDED_IO" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(911): error: identifier "PostprocessFunctor" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(912): error: expected a ")"

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(913): error: variable "ranks" has already been defined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(920): error: expected a ";"

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(936): error: explicit type is missing ("int" assumed)

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(936): error: cannot overload functions distinguished by return type alone

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(938): error: expected a declaration

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(978): warning: parsing restarts here after previous syntax error

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(980): error: identifier "LOG_RAKING_THREADS_PER_PASS" is undefined

/PETSc3/multicore/thrust/thrust/detail/device/cuda/detail/b40c/radixsort_scanscatter_kernel.h(980): error: identifier "LOG_SCAN_LANES_PER_PASS" is undefined


Original issue reported on code.google.com by [email protected] on 4 Oct 2010 at 8:27

testing/complex.cu fails to compile with MSVC

$ cd testing
$ scons
.
.
complex.cu(451): error: identifier "is_same_type" is undefined
          detected during instantiation of "void TestComplex<ValueType>() [with ValueType=float]"
(460): here

complex.cu(451): error: type name is not allowed
          detected during instantiation of "void TestComplex<ValueType>() [with ValueType=float]"
(460): here

complex.cu(451): error: type name is not allowed
          detected during instantiation of "void TestComplex<ValueType>() [with ValueType=float]"
(460): here

complex.cu(451): error: the global scope has no "result"
          detected during instantiation of "void TestComplex<ValueType>() [with ValueType=float]"
(460): here

complex.cu(451): error: identifier "is_same_type" is undefined
          detected during instantiation of "void TestComplex<ValueType>() [with ValueType=double]"
(460): here

complex.cu(451): error: type name is not allowed
          detected during instantiation of "void TestComplex<ValueType>() [with ValueType=double]"
(460): here

complex.cu(451): error: type name is not allowed
          detected during instantiation of "void TestComplex<ValueType>() [with ValueType=double]"
(460): here

complex.cu(451): error: the global scope has no "result"
          detected during instantiation of "void TestComplex<ValueType>() [with ValueType=double]"
(460): here

Original issue reported on code.google.com by [email protected] on 20 Oct 2010 at 4:22

blas functions should check input sizes

Sizes of arrays passed to cusp::blas:: functions should be checked for
consistency.  An exception should be raised if sizes are inconsistent.
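
A minimal sketch of such a check, using a hypothetical axpy-like function on host vectors. The exception type here is only a placeholder; the library may prefer its own exception class.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// y <- alpha * x + y, rejecting inputs whose sizes differ.
template <typename T>
void axpy_checked(const std::vector<T>& x, std::vector<T>& y, T alpha)
{
    if (x.size() != y.size())
        throw std::invalid_argument("axpy: x and y must have the same size");

    for (std::size_t i = 0; i < x.size(); ++i)
        y[i] += alpha * x[i];
}
```

The comparison is a constant-time test on the host, so it should add no measurable cost to the kernel launch that follows.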

Original issue reported on code.google.com by wnbell on 28 Oct 2009 at 5:50

ensure coo_matrix has sorted indices

Our spmv kernels require coo_matrix row indices to be in sorted order. 
Whenever a coo_matrix is created (e.g. read from disk or converted from
another format) it should satisfy this expectation.
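
A host-side sketch of how the expectation could be checked and enforced, with a hypothetical container rather than cusp::coo_matrix. The entries are reordered through a permutation so that the row indices, column indices and values stay aligned.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical COO container for the sketch.
struct coo_sketch
{
    std::vector<std::size_t> row_indices;
    std::vector<std::size_t> column_indices;
    std::vector<double>      values;
};

inline bool rows_are_sorted(const coo_sketch& A)
{
    return std::is_sorted(A.row_indices.begin(), A.row_indices.end());
}

// Reorder all three arrays by row index; entries within a row keep their order.
inline void sort_by_row(coo_sketch& A)
{
    const std::size_t n = A.row_indices.size();
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t(0));
    std::stable_sort(order.begin(), order.end(),
                     [&A](std::size_t a, std::size_t b)
                     { return A.row_indices[a] < A.row_indices[b]; });

    coo_sketch B;
    B.row_indices.resize(n);
    B.column_indices.resize(n);
    B.values.resize(n);
    for (std::size_t k = 0; k < n; ++k)
    {
        B.row_indices[k]    = A.row_indices[order[k]];
        B.column_indices[k] = A.column_indices[order[k]];
        B.values[k]         = A.values[order[k]];
    }
    A = B;
}
```

A reader or converter could call rows_are_sorted first and only pay for the sort when the input is actually out of order.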

Original issue reported on code.google.com by wnbell on 10 Oct 2009 at 8:37

design and implement cusp::graph::induced_subgraph

Proposed function signature:

template <typename Matrix1,
          typename Array1,
          typename Matrix2>
size_t induced_subgraph(const Matrix1& G,
                        const Array1& stencil,
                              Matrix2& S);

Assumptions:
 G is an NxN matrix
 stencil is an array with length N whose value_type is convertible to bool
 S can be .resize()-ed to hold the induced subgraph

Semantics:
 Returns K, where K is the number of true values in stencil
 S will be a KxK matrix
 The valid vertices of G will be mapped to [0,K) in order
 For each edge (i,j) in G where (stencil[i] && stencil[j]) is true the edge (M[i],M[j]) will exist in S, where M is the mapping above


Possible extensions:

template <typename Matrix1,
          typename Array1,
          typename Matrix2,
          typename Array2>
size_t induced_subgraph(const Matrix1& G,
                        const Array1& stencil,
                              Matrix2& S,
                              Array2& map);

Here output array map stores the mapping from S vertices to G vertices.  Note 
this is not the mapping mentioned in the Semantics section above.
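
The semantics above, including the extended signature, can be sketched on the host as follows. The CooGraph struct is a hypothetical stand-in for a Cusp matrix type, and the serial loops are only meant to pin down the intended behavior, not to suggest an implementation strategy.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side COO graph (stand-in for a Cusp matrix type).
struct CooGraph {
    std::size_t num_rows = 0, num_cols = 0;
    std::vector<int>   row_indices, column_indices;
    std::vector<float> values;
};

// Returns K, the number of true values in stencil.
// S becomes the KxK induced subgraph; map[k] is the G vertex of S vertex k.
template <typename Array1, typename Array2>
std::size_t induced_subgraph(const CooGraph& G, const Array1& stencil,
                             CooGraph& S, Array2& map)
{
    const std::size_t N = G.num_rows;

    // M maps valid G vertices to [0,K) in order; -1 marks removed vertices.
    std::vector<int> M(N, -1);
    map.clear();
    for (std::size_t i = 0; i < N; ++i) {
        if (stencil[i]) {
            M[i] = static_cast<int>(map.size());
            map.push_back(static_cast<typename Array2::value_type>(i));
        }
    }
    const std::size_t K = map.size();

    S = CooGraph();
    S.num_rows = S.num_cols = K;
    for (std::size_t n = 0; n < G.row_indices.size(); ++n) {
        const int i = G.row_indices[n];
        const int j = G.column_indices[n];
        if (stencil[i] && stencil[j]) {   // keep edges with both endpoints valid
            S.row_indices.push_back(M[i]);
            S.column_indices.push_back(M[j]);
            S.values.push_back(G.values[n]);
        }
    }
    return K;
}
```

In this sketch M is the G-to-S mapping from the Semantics section, while the output array map is its inverse (S-to-G).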

Original issue reported on code.google.com by wnbell on 29 Oct 2010 at 5:27

smoothed_aggregation.inl has hard-coded single-precision

Problem:
Using the smoothed_aggregation preconditioner with double-precision 
vectors and matrices does not work because some scalar arguments in _solve 
are hard-coded as float (e.g., 1.0f), which ruins accuracy.  This can be 
solved by casting to the templated "ValueType", as has been done elsewhere.
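
The effect can be illustrated with a small sketch. The constant 4/3 and the function names below are made up for illustration and are not taken from smoothed_aggregation.inl; the point is only that a value computed from float literals carries float rounding error even when it is later used as a double.

```cpp
// Hard-coded single precision: the quotient is rounded to float
// before it is converted to ValueType.
template <typename ValueType>
ValueType damping_hardcoded()
{
    const float omega = 4.0f / 3.0f;   // illustrative constant, float precision
    return omega;
}

// Generic version: the arithmetic is carried out in ValueType.
template <typename ValueType>
ValueType damping_generic()
{
    return ValueType(4) / ValueType(3);
}
```

With ValueType = double the two functions return different values, while with ValueType = float they agree.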


What version of the product are you using? On what operating system?
The latest mercurial pull on both Mac OS X and Ubuntu

Original issue reported on code.google.com by [email protected] on 11 Nov 2010 at 1:53

dia spmv fails for matrices with many diagonals

For example

nathan@droog:~/NV/cusp/performance/spmv$ ./spmv ~/NV/research/datasets/matrices/cell_paper/matrix_market/dense2.mtx

Computing SpMV with 'float' values.

There are 2 devices supporting CUDA

Device 0: "Tesla C1060"
  Major revision number:                         1
  Minor revision number:                         3
  Total amount of global memory:                 4294705152 bytes

Device 1: "GeForce 8800 GT"
  Major revision number:                         1
  Minor revision number:                         1
  Total amount of global memory:                 536150016 bytes

Running on Device 0

Read matrix (/home/nathan/NV/research/datasets/matrices/cell_paper/matrix_market/dense2.mtx) with shape (2000,2000) and 4000000 entries

    coo_flat            :   1.3561 ms (  5.90 GFLOP/s  47.2 GB/s) [L2 error 0.000000]
    coo_flat_tex        :   1.3534 ms (  5.91 GFLOP/s  47.3 GB/s) [L2 error 0.000000]
    coo_flat_k          :   1.3874 ms (  5.77 GFLOP/s  46.1 GB/s) [L2 error 0.000000]
    coo_flat_k_tex      :   1.2837 ms (  6.23 GFLOP/s  49.9 GB/s) [L2 error 0.000000]
    csr_scalar          :  10.8679 ms (  0.74 GFLOP/s   4.4 GB/s) [L2 error 0.000000]
    csr_scalar_tex      :  10.4112 ms (  0.77 GFLOP/s   4.6 GB/s) [L2 error 0.000000]
    csr_vector          :   0.7620 ms ( 10.50 GFLOP/s  63.0 GB/s) [L2 error 0.000000]
    csr_vector_tex      :   0.5633 ms ( 14.20 GFLOP/s  85.3 GB/s) [L2 error 0.000000]
    dia                 :   2.3368 ms (  3.42 GFLOP/s  13.7 GB/s) [L2 error 1.000000]
    dia_tex             :   2.2275 ms (  3.59 GFLOP/s  14.4 GB/s) [L2 error 1.000000]
    ell                 :   1.9842 ms (  4.03 GFLOP/s  24.2 GB/s) [L2 error 0.000000]
    ell_tex             :   1.9803 ms (  4.04 GFLOP/s  24.2 GB/s) [L2 error 0.000000]
    hyb                 :   1.2869 ms (  6.22 GFLOP/s  49.8 GB/s) [L2 error 0.000000]
    hyb_tex             :   1.2705 ms (  6.30 GFLOP/s  50.4 GB/s) [L2 error 0.000000]

Original issue reported on code.google.com by wnbell on 4 Mar 2010 at 5:43

build-env.py problem in macosx

build-env.py copies LD_LIBRARY_PATH in posix systems, but macosx uses 
DYLD_LIBRARY_PATH.
The attached patch checks for this and changes the assignment.

Original issue reported on code.google.com by [email protected] on 1 Oct 2010 at 6:22

Attachments:

Matrix Transpose

Sparse matrices are quite cheap to transpose, and it can be done in place.
It would be great if the library could provide such a function. 

In the DIA format specifically, however, inverting only the indices of the
diagonals changes how the elements must be accessed. Thus, the matrix
should "know" whether it is transposed, making this an "intrusive"
change.
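
A rough sketch of what "knowing" about the transpose could look like for a matrix-vector product is given below. The DiaMatrix struct and the transposed flag are assumptions for illustration and do not reflect the layout of cusp::dia_matrix; the stored values are never moved, only the roles of the row and column indices are swapped.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side DIA storage:
// values[n][i] holds A(i, i + offsets[n]); out-of-range slots are padding.
struct DiaMatrix {
    int num_rows;
    int num_cols;
    std::vector<int> offsets;
    std::vector<std::vector<double> > values;
};

// Computes y = A * x, or y = A^T * x when transposed is true,
// reading the same stored diagonals in both cases.
std::vector<double> multiply(const DiaMatrix& A,
                             const std::vector<double>& x,
                             bool transposed)
{
    std::vector<double> y(transposed ? A.num_cols : A.num_rows, 0.0);
    for (std::size_t n = 0; n < A.offsets.size(); ++n) {
        const int k = A.offsets[n];
        for (int i = 0; i < A.num_rows; ++i) {
            const int j = i + k;                  // column of the stored entry
            if (j < 0 || j >= A.num_cols) continue;
            const double a = A.values[n][i];      // A(i, j)
            if (transposed)
                y[j] += a * x[i];                 // A^T(j, i) = A(i, j)
            else
                y[i] += a * x[j];
        }
    }
    return y;
}
```

The transposed branch writes to y[j] instead of y[i], which is the access change the flag would have to trigger in any kernel that reads the matrix.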

I wrote transposed access before your major restructuring and it worked
without loss of speed, but I'd rather see this in the "main" branch...

Original issue reported on code.google.com by [email protected] on 11 Dec 2009 at 1:15

support cusp::complex in cusp::multiply()

In particular, the sparse matrix-vector multiplication routines need to support 
cusp::complex [1].

[1] 
http://groups.google.com/group/cusp-users/browse_thread/thread/6cabe7d6078ff33c

Original issue reported on code.google.com by wnbell on 6 Nov 2010 at 9:58

use CUDA FP intrinsics to prohibit FMAD truncation

   (Appendix G.2)  Addition and multiplication are often combined
   into a single multiply-add instruction (FMAD), which truncates 
   (i.e. without rounding) the intermediate mantissa of the multiplication;
   Source:
http://developer.download.nvidia.com/compute/cuda/3_0/toolkit/docs/NVIDIA_CUDA_ProgrammingGuide.pdf

Original issue reported on code.google.com by wnbell on 17 May 2010 at 10:46

support general elementwise binary operators

Provide a general transform_elementwise function which implements arbitrary 
binary transformations between sparse matrices.
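
As a sketch of the intended behavior only, the function below combines two sparse matrices held in ordered maps keyed by (row, column). The SparseMap type and this signature are assumptions for illustration; a real implementation would operate on Cusp matrix formats. The sketch treats missing entries as zero and assumes op(0, 0) == 0, so positions absent from both inputs stay implicit.

```cpp
#include <functional>
#include <map>
#include <utility>

typedef std::pair<int, int> Coord;            // (row, column)
typedef std::map<Coord, double> SparseMap;    // hypothetical sparse storage

// Applies op to every position stored in A or B, using 0 for missing entries.
template <typename BinaryFunction>
SparseMap transform_elementwise(const SparseMap& A, const SparseMap& B,
                                BinaryFunction op)
{
    SparseMap C;
    for (SparseMap::const_iterator a = A.begin(); a != A.end(); ++a) {
        SparseMap::const_iterator b = B.find(a->first);
        const double bv = (b == B.end()) ? 0.0 : b->second;
        C[a->first] = op(a->second, bv);
    }
    for (SparseMap::const_iterator b = B.begin(); b != B.end(); ++b) {
        if (A.find(b->first) == A.end())
            C[b->first] = op(0.0, b->second);  // entry present only in B
    }
    return C;
}
```

Passing std::plus<double>() gives matrix addition and std::multiplies<double>() gives the elementwise product, although the latter may store explicit zeros in this sketch.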

Original issue reported on code.google.com by wnbell on 3 Aug 2010 at 8:28

CSR matrix multiply on device

What steps will reproduce the problem?
1. Compile and run test.cu (in attachments)

What is the expected output? What do you see instead?
Sparse matrix multiply is working on the host (for CSR), but not on the device.

What version of the product are you using? On what operating system?
0.1, Fedora 10, kernel : 2.6.27.5-117.fc10.i686.PAE

Please provide any additional information below.
I may be doing something wrong, but I could only get to this point with the
documentation available. You can contact me @ pyalama (at) g (dot) clemson
(dot) edu

Original issue reported on code.google.com by [email protected] on 11 May 2010 at 5:56

Attachments:

Conjugate Gradient Solver

Hi,

I am using the Conjugate Gradient Solver cg.cu to solve A*x=b.
I changed it a little so that it can also initialize the vector b by reading 
a matrix market file. It works fine with the matrix A in the given 
example. However, when I changed the matrix to my own sparse matrix, the 
iteration always moves further away from the solution (fails to converge). 
On the other hand, I can get the solution when I use Matlab to solve it. 
Thus, I am wondering whether the Conjugate Gradient method places any 
specific requirements on the matrix?

What version of the product are you using? On what operating system?

I am using CUDA3.0, cusp0.1 and thrust1.2. The operating system is Windows7.

Thank you

Original issue reported on code.google.com by [email protected] on 8 Jul 2010 at 7:59

add cusp::print()

A cusp::print function could either replace cusp::print_matrix or just call 
print_matrix using default arguments.

If cusp::print_matrix survives it needs to offer printing options, such as 
dense vs. sparse output.

cusp::print should work for at least all the cusp containers and views.

Possible function signatures:
  template <typename Matrix>
  cusp::print(const Matrix& A);

  template <typename Matrix, typename Stream>
  cusp::print(const Matrix& A, Stream& ostream);
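
A possible shape for the two overloads is sketched below on a hypothetical host-side COO struct. The TinyCoo type, the void return type, and the output layout are assumptions for illustration; they are not the format used by cusp::print_matrix.

```cpp
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container standing in for a Cusp matrix.
struct TinyCoo {
    std::size_t num_rows, num_cols;
    std::vector<int>    row_indices, column_indices;
    std::vector<double> values;
};

// Prints a header line followed by one "row column value" line per entry.
template <typename Matrix, typename Stream>
void print(const Matrix& A, Stream& s)
{
    s << "sparse matrix <" << A.num_rows << ", " << A.num_cols << "> with "
      << A.values.size() << " entries\n";
    for (std::size_t n = 0; n < A.values.size(); ++n)
        s << " " << A.row_indices[n] << " " << A.column_indices[n]
          << " " << A.values[n] << "\n";
}

// Convenience overload: forwards to the stream version with std::cout.
template <typename Matrix>
void print(const Matrix& A)
{
    print(A, std::cout);
}
```

Taking the stream as a template parameter lets the same function write to std::cout, a file stream, or a std::ostringstream in tests.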

Original issue reported on code.google.com by wnbell on 22 Oct 2010 at 2:53

document cusp::graph::maximal_independent_set

Reported by Jordan on cusp-users

http://groups.google.com/group/cusp-users/browse_thread/thread/f92f73ee66439de7

Original issue reported on code.google.com by wnbell on 6 Aug 2010 at 5:03

Thrust version

Hi,

I get the following issues when I try to compile:

1. Compiling version.cu reports the wrong Thrust version: v1.1 instead of v1.2. I 
downloaded Thrust from http://thrust.googlecode.com/files/thrust-v1.2.zip

2. Compiling any example produces the following errors, which I think 
are related to the wrong Thrust version:

:nvcc cg.cu -o cg -I /home/fdelpin/GPU/
/usr/local/cuda/bin/../include/thrust/sorting/detail/device/cuda/stable_merge_sort.inl(94): error: expression must have a constant value

/home/fdelpin/GPU/thrust/iterator/permutation_iterator.h(152): error: namespace 
"thrust::detail" has no member "enable_if_convertible"

/home/fdelpin/GPU/thrust/iterator/permutation_iterator.h(152): error: expected 
a ")"

/home/fdelpin/GPU/thrust/iterator/detail/permutation_iterator.inl(31): error: 
namespace "thrust::detail::device" has no member "dereference_result"

/home/fdelpin/GPU/thrust/iterator/detail/permutation_iterator.inl(31): error: 
expected an identifier

/home/fdelpin/GPU/thrust/iterator/detail/permutation_iterator.inl(46): error: a 
template argument list is not allowed in a declaration of a primary template

/home/fdelpin/GPU/thrust/iterator/detail/permutation_iterator.inl(54): error: 
dereference_result is not a template

/home/fdelpin/GPU/thrust/iterator/detail/permutation_iterator.inl(63): error: 
dereference_result is not a template

8 errors detected in the compilation of 
"/tmp/tmpxft_000011ea_00000000-4_cg.cpp1.ii".

I'm using CUSP V0.1 on linux SuSe 11

Thanks!

Original issue reported on code.google.com by [email protected] on 14 Jun 2010 at 9:32
