supranational / sppark Goto Github PK
View Code? Open in Web Editor NEWZero-knowledge template library
License: Apache License 2.0
Zero-knowledge template library
License: Apache License 2.0
error occurred: Command "nvcc" "-ccbin=C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX64\x64\cl.exe" "-Xcompiler" "-nologo" "-Xcompiler" "-MD" "-G" "-Xcompiler" "-Z7" "-Xcompiler" "-Brepro" "-I" "D:\hash\aleo\sppark-0.1.5" "-Xcompiler" "-W4" "-o" "D:\hash\aleo\sppark-0.1.5\rust\target\debug\build\sppark-9d1ea8dff2ab2e4d\out\src/lib.o" "-c" "src/lib.cpp" with args "nvcc" did not execute successfully (status code exit code: 2).
windows10 ,nvcc=12.3,gcc = 13.2, rustc = 1.75.0 gpu= rtx4060,
// Maximum domain size supported. Can be adjusted at will, but with the
// target field in mind. Most fields handle up to 2^32 elements, BLS12-377
// can handle up to 2^47, alt_bn128 - 2^28...
I noted that there are some comments about NTT implementation in curve alt_bn128, but it's currently undone. Is there anything special (parameter) to cope with compared with BLS_377 and BLS_381? Will this part of implementation be published soon?
# include <blst_t.hpp>
in sppark/ff/bls12-381.hpp
fails as this file is nowhere defined.
The following warnings were emitted during compilation:
warning: In file included from /root/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/sppark-0.1.6/sppark/util/gpu_t.cuh:13,
warning: from src/lib.cpp:2:
warning: /root/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/sppark-0.1.6/sppark/util/gpu_t.cuh: In member function ‘void event_t::wait(cudaStream_t)’:
warning: /root/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/sppark-0.1.6/sppark/util/gpu_t.cuh:45:50: error: too few arguments to function ‘cudaError_t cudaStreamWaitEvent(cudaStream_t, cudaEvent_t, unsigned int)’
warning: { CUDA_OK(cudaStreamWaitEvent(stream, event)); }
warning: ^
warning: /root/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/sppark-0.1.6/sppark/util/exception.cuh:13:24: note: in definition of macro ‘CUDA_OK’
warning: cudaError_t code = expr;
warning: ^~~~
warning: In file included from /usr/include/channel_descriptor.h:61,
warning: from /usr/include/cuda_runtime.h:95,
warning: from src/lib.cpp:1:
warning: /usr/include/cuda_runtime_api.h:2240:58: note: declared here
warning: extern host cudart_builtin cudaError_t CUDARTAPI cudaStreamWaitEvent(cudaStream_t stream, cudaEvent_t event, unsigned int flags);
warning: ^~~~~~~~~~~~~~~~~~~
warning: In file included from /root/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/sppark-0.1.6/sppark/util/gpu_t.cuh:13,
warning: from src/lib.cpp:2:
warning: /root/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/sppark-0.1.6/sppark/util/gpu_t.cuh: In member function ‘void* stream_t::Dmalloc(size_t) const’:
warning: /root/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/sppark-0.1.6/sppark/util/gpu_t.cuh:72:17: error: ‘cudaMallocAsync’ was not declared in this scope
warning: CUDA_OK(cudaMallocAsync(&d_ptr, sz, stream));
warning: ^~~~~~~~~~~~~~~
warning: /root/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/sppark-0.1.6/sppark/util/exception.cuh:13:24: note: in definition of macro ‘CUDA_OK’
warning: cudaError_t code = expr;
warning: ^~~~
warning: /root/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/sppark-0.1.6/sppark/util/gpu_t.cuh:72:17: note: suggested alternative: ‘cudaMallocArray’
warning: CUDA_OK(cudaMallocAsync(&d_ptr, sz, stream));
warning: ^~~~~~~~~~~~~~~
warning: /root/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/sppark-0.1.6/sppark/util/exception.cuh:13:24: note: in definition of macro ‘CUDA_OK’
warning: cudaError_t code = expr;
warning: ^~~~
warning: /root/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/sppark-0.1.6/sppark/util/gpu_t.cuh: In member function ‘void stream_t::Dfree(void*) const’:
warning: /root/.cargo/registry/src/rsproxy.cn-8f6827c7555bfaf8/sppark-0.1.6/sppark/util/gpu_t.cuh:76:17: error: ‘cudaFreeAsync’ was not declared in this scope
warning: { CUDA_OK(cudaFreeAsync(d_ptr, stream)); }
For instance, the type vec384
misses a definition
Line 82 in fffd734
I am trying to use Ray (https://github.com/ray-project/ray) to wrap the RISC Zero prover, which internally uses sppark
. An error that I encountered is as follows:
(async_join_segment_receipts pid=2928) *** SIGSEGV received at time=1706179489 on cpu 0 *** [repeated 3x across cluster]
(async_join_segment_receipts pid=2928) PC: @ 0x7f79a0fd870c (unknown) select_gpu() [repeated 3x across cluster]
(async_join_segment_receipts pid=2928) @ 0x7f7edbcb2090 (unknown) (unknown) [repeated 2x across cluster]
(async_join_segment_receipts pid=2928) @ 0x7f79a0fd57d0 160 sppark_init [repeated 3x across cluster]
(async_join_segment_receipts pid=2928) @ 0x7f79a09a6529 (unknown) risc0_zkp::hal::cuda::CudaHal$LT$CH$GT$::new::hfb16181478125328 [repeated 4x across cluster]
(async_join_segment_receipts pid=2928) [2024-01-25 02:44:49,148 E 2928 2928] logging.cc:361: *** SIGSEGV received at time=1706179489 on cpu 0 *** [repeated 3x across cluster]
(async_join_segment_receipts pid=2928) [2024-01-25 02:44:49,148 E 2928 2928] logging.cc:361: PC: @ 0x7f79a0fd870c (unknown) select_gpu() [repeated 3x across cluster]
(async_join_segment_receipts pid=2928) [2024-01-25 02:44:49,148 E 2928 2928] logging.cc:361: @ 0x7f7edbcb2090 (unknown) (unknown) [repeated 2x across cluster]
(async_join_segment_receipts pid=2928) [2024-01-25 02:44:49,148 E 2928 2928] logging.cc:361: @ 0x7f79a0fd57d0 160 sppark_init [repeated 3x across cluster]
(async_join_segment_receipts pid=2928) [2024-01-25 02:44:49,148 E 2928 2928] logging.cc:361: @ 0x7f79a09a6529 (unknown) risc0_zkp::hal::cuda::CudaHal$LT$CH$GT$::new::hfb16181478125328 [repeated 4x across cluster]
(async_join_segment_receipts pid=2928) Fatal Python error: Segmentation fault [repeated 3x across cluster]
(async_join_segment_receipts pid=2928) [repeated 3x across cluster]
(async_join_segment_receipts pid=2928) Stack (most recent call first): [repeated 3x across cluster]
(async_join_segment_receipts pid=2928) File "ray_test.py", line 29 in async_join_segment_receipts [repeated 3x across cluster]
(async_join_segment_receipts pid=2928) File "/home/ubuntu/.pyenv/versions/pyo3/lib/python3.8/site-packages/ray/_private/worker.py", line 847 in main_loop [repeated 3x across cluster]
(async_join_segment_receipts pid=2928) File "/home/ubuntu/.pyenv/versions/pyo3/lib/python3.8/site-packages/ray/_private/workers/default_worker.py", line 282 in <module> [repeated 3x across cluster]
packages/ray/_private/workers/default_worker.py", line 282 in <module> [repess cluster]
The expectation is that Ray forks processes. And therefore, if a parent process initializes gpus
already, the child would use the ones that are already initialized by the parent. However, NVIDIA doesn't work when this happens---the contexts and many other data structures are somewhat specific to the processes.
I am still checking if this is the reason of the error happening.
Revealed in 06564db, though the error might have been there before (perhaps the macro was unused).
huitseeker@tencendur➜tmp/sppark/rust(main)» cargo +1.60.0 check [16:42:30]
Compiling libc v0.2.147
Compiling either v1.8.1
Compiling cc v1.0.79
Compiling which v4.4.0
Compiling sppark v0.1.3 (/home/huitseeker/tmp/sppark/rust)
error[E0308]: mismatched types
--> src/lib.rs:26:59
|
26 | let c_str = unsafe { std::ffi::CStr::from_ptr(str.as_ptr()) };
| ^^^^^^^^^^^^ expected `u8`, found `i8`
|
= note: expected raw pointer `*const u8`
found raw pointer `*mut i8`
For more information about this error, try `rustc --explain E0308`.
error: could not compile `sppark` due to previous error
huitseeker@tencendur➜tmp/sppark/rust(main)» cargo +1.68.2 check [16:42:49]
Compiling sppark v0.1.3 (/home/huitseeker/tmp/sppark/rust)
error[E0308]: mismatched types
--> src/lib.rs:26:59
|
26 | let c_str = unsafe { std::ffi::CStr::from_ptr(str.as_ptr()) };
| ------------------------ ^^^^^^^^^^^^ expected `u8`, found `i8`
| |
| arguments to this function are incorrect
|
= note: expected raw pointer `*const u8`
found raw pointer `*mut i8`
note: associated function defined here
--> /home/huitseeker/.rustup/toolchains/1.68.2-aarch64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ffi/c_str.rs:256:25
|
256 | pub const unsafe fn from_ptr<'a>(ptr: *const c_char) -> &'a CStr {
| ^^^^^^^^
For more information about this error, try `rustc --explain E0308`.
error: could not compile `sppark` due to previous error
huitseeker@tencendur➜tmp/sppark/rust(main)» cargo +1.70.0 check [16:42:58]
Compiling sppark v0.1.3 (/home/huitseeker/tmp/sppark/rust)
error[E0308]: mismatched types
--> src/lib.rs:26:59
|
26 | let c_str = unsafe { std::ffi::CStr::from_ptr(str.as_ptr()) };
| ------------------------ ^^^^^^^^^^^^ expected `*const u8`, found `*mut i8`
| |
| arguments to this function are incorrect
|
= note: expected raw pointer `*const u8`
found raw pointer `*mut i8`
note: associated function defined here
--> /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/ffi/c_str.rs:260:25
For more information about this error, try `rustc --explain E0308`.
error: could not compile `sppark` (lib) due to previous error
I noticed the sppark code on accelerating msm on GPU, it's great. but I met a problem in that the case msm_correctness failed after changing the generated points length.
the command is : cargo test --features=bls12_377
the code is:
fn msm_correctness() {
let test_npow = std::env::var("TEST_NPOW").unwrap_or("15".to_string());
let npoints_npow = i32::from_str(&test_npow).unwrap();
let (points, scalars) =
**//util::generate_points_scalars::<G1Affine>(1usize << npoints_npow);
util::generate_points_scalars::<G1Affine>(237);**
let msm_result = multi_scalar_mult_arkworks(points.as_slice(), unsafe {
std::mem::transmute::<&[_], &[BigInteger256]>(scalars.as_slice())
})
.into_affine();
let arkworks_result =
VariableBaseMSM::multi_scalar_mul(points.as_slice(), unsafe {
std::mem::transmute::<&[_], &[BigInteger256]>(scalars.as_slice())
})
.into_affine();
assert_eq!(msm_result, arkworks_result);
}
Currently the cuda errors impl From<Error> for String
. If instead it would implement Display
(see https://rust-lang.github.io/api-guidelines/interoperability.html?highlight=display#error-types-are-meaningful-and-well-behaved-c-good-err for more information), you could e.g. also call err.to_string()
or even use it like println!("error was : {}", err)
.
If I should provide a pull request, let me know.
This repository contains a proof of concept for using the MSM functionality from Rust. It's in a directory called blst-cuda
, although the crate is called blst-msm
. That poc only works with arkworks types, not with blst(rs) types. All this is confusing. Hence I propose:
blst-cuda
to arkworks-msm
blst-msm
to arkworks-msm
blst-msm
that is within a directory of the same name. That crate would work with blst types.This issue was triggered by #4.
When i observed the implementation of Montgomery multiplication over finite fields, I'm very confused about
static inline void mad_n_redc(uint32_t even[], uint32_t odd[],
const uint32_t a[], uint32_t bi, bool first=false)
{
if (first) {
mul_n(odd, a+1, bi);
mul_n(even, a, bi);
} else {
asm("add.cc.u32 %0, %0, %1;" : "+r"(even[0]) : "r"(odd[1]));
madc_n_rshift(odd, a+1, bi);
cmad_n(even, a, bi);
asm("addc.u32 %0, %0, 0;" : "+r"(odd[n-1]));
}
uint32_t mi = even[0] * M0;
cmad_n(odd, MOD+1, mi);
cmad_n(even, MOD, mi);
asm("addc.u32 %0, %0, 0;" : "+r"(odd[n-1]));
}
third to last row
cmad_n(odd, MOD+1, mi);
As you commented on the last line in the function cmad_n “”// return carry flag“”
But obviously the carry in the first “cmad_n” function will not be received by the second “cmad_n” function
Why is the overflow not considered?
If you think overflow must not happen here, please tell me why as well,THANKs
the rust code below is quite easy to understand. i.e reduce x_hi_hi, then x_hi_lo.
fn reduce128(x: u128) -> GoldilocksField {
let (x_lo, x_hi) = split(x); // This is a no-op
let x_hi_hi = x_hi >> 32;
let x_hi_lo = x_hi & EPSILON;
let (mut t0, borrow) = x_lo.overflowing_sub(x_hi_hi);
if borrow {
branch_hint(); // A borrow is exceedingly rare. It is faster to branch.
t0 -= EPSILON; // Cannot underflow.
}
let t1 = x_hi_lo * EPSILON;
let t2 = unsafe { add_no_canonicalize_trashing_input(t0, t1) };
GoldilocksField(t2)
}
however the below cuda inline assembly code of 128 reduction is not easy to understand; possible to provide some docs? thanks
1 inline void reduce(uint32_t temp[4])
2 {
3 uint32_t carry;
4 # if __CUDA_ARCH__ >= 700
5 asm("sub.cc.u32 %0, %0, %3; subc.cc.u32 %1, %1, %4; subc.u32 %2, 0, 0;"
6 : "+r"(temp[0]), "+r"(temp[1]), "=r"(carry)
7 : "r"(temp[2]), "r"(temp[3]));
8 asm("add.cc.u32 %0, %0, %2; addc.u32 %1, %1, %3;"
9 : "+r"(temp[1]), "+r"(carry)
10 : "r"(temp[2]), "r"(temp[3]));
11
12 asm("mad.lo.cc.u32 %0, %3, %4, %0; madc.hi.cc.u32 %1, %3, %4, %1; addc.u32 %2, 0, 0;"
13 : "+r"(temp[0]), "+r"(temp[1]), "=r"(temp[2])
14 : "r"(carry), "r"(gl64_device::W));
15 asm("mad.lo.cc.u32 %0, %2, %3, %0; madc.hi.u32 %1, %2, %3, %1;"
16 : "+r"(temp[0]), "+r"(temp[1])
17 : "r"(temp[2]), "r"(gl64_device::W));
18 # else
19 uint32_t b0, b1;
20 asm("add.cc.u32 %0, %2, %3; addc.u32 %1, 0, 0;"
21 : "=r"(b0), "=r"(b1)
22 : "r"(temp[2]), "r"(temp[3]));
23 asm("sub.cc.u32 %0, %0, %3; subc.cc.u32 %1, %1, %4; subc.u32 %2, 0, 0;"
24 : "+r"(temp[0]), "+r"(temp[1]), "=r"(carry)
25 : "r"(b0), "r"(b1));
26 asm("add.cc.u32 %0, %0, %2; addc.u32 %1, %1, %3;"
27 : "+r"(temp[0]), "+r"(temp[1])
28 : "r"(-carry), "r"(carry));
29 asm("add.cc.u32 %0, %0, %1; addc.u32 %1, 0, 0;"
30 : "+r"(temp[1]), "+r"(temp[2]));
31 # if __CUDA_ARCH__ >= 700
32 asm("mad.lo.cc.u32 %0, %2, %3, %0; madc.hi.u32 %1, %2, %3, %1;"
33 : "+r"(temp[0]), "+r"(temp[1])
34 : "r"(temp[2]), "r"(gl64_device::W));
35 # else
36 asm("add.cc.u32 %0, %0, %2; addc.u32 %1, %1, 0;"
37 : "+r"(temp[0]), "+r"(temp[1])
38 : "r"(-temp[2]));
39 # endif
40 # endif
41 asm("mov.b64 %0, {%1, %2};" : "=l"(val) : "r"(temp[0]), "r"(temp[1]));
42 }
Hi!
I have noticed that scalars
argument in the pippenger funciton has a type of const scalar_t*
Lines 44 to 45 in d967007
but the elements of scalars
will be convert to scalar_T
and write back inplace in the pippenger function.
Lines 158 to 176 in d967007
After the pippenger function executed, scalars are not the same as before. The type definition of scalars
may be a little misleading.
May I ask your guys if sppark plans to support halo2curves (github)[https://github.com/privacy-scaling-explorations/halo2curves]
There is a race condition between lines 267 and 278 in the msm/sort.cuh file. If a warp with a larger warpid executes line 278 before the warp with warpid=0 executes line 267, the calculation result will be wrong. In fact, on the ampere architecture, due to the scheduling strategy of the warp scheduler, it is impossible for warp (id>0) to execute line 278 before warp(id=0) executing line 278 and therefore the test always passes. However, there is indeed a logical error.
cargo check
I upgraded nvcc to 12.3 according to the hints in supranational/pasta-msm@8ccdc45 and got a compilation error on a simple cargo check
:
The warnings are several instances of the following warning :
warning: [email protected]: /usr/include/c++/13.2.1/bits/std_function.h: In instantiation of ‘std::function<_Res(_ArgTypes ...)>::_Requires<std::function<_Res(_ArgTypes ...)>::_Callable<_Functor>, std::function<_Res(_ArgTypes ...)>&> std::function<_Res(_ArgTypes ...)>::operator=(_Functor&&) [with _Functor = std::function<void()>&; _Res = void; _ArgTypes = {}; _R
equires<_Callable<_Functor>, std::function<_Res(_ArgTypes ...)>&> = std::function<void()>&; typename std::enable_if<(! std::is_same<typename std::remove_cv<typename std::remove_reference<_Tuple>::type>::type, std::function<_Res(_ArgTypes ...)> >::value), std::decay<_Func> >::type::type = std::function<void()>; typename std::enable_if<(! std::is_same<typename s
td::remove_cv<typename std::remove_reference<_Tuple>::type>::type, std::function<_Res(_ArgTypes ...)> >::value), std::decay<_Func> >::type = std::decay<std::function<void()>&>; typename std::remove_cv<typename std::remove_reference<_Tuple>::type>::type = std::function<void()>&; typename std::remove_reference<_Tuple>::type = std::function<void()>&]’:
warning: [email protected]: /home/huitseeker/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sppark-0.1.5/sppark/util/thread_pool_t.hpp:164:24: required from here
warning: [email protected]: /usr/include/c++/13.2.1/bits/std_function.h:534:44: error: call of overloaded ‘forward<std::function<void()>&>(std::function<void()>&)’ is ambiguous
warning: [email protected]: 534 | function(std::forward<_Functor>(__f)).swap(*this);
warning: [email protected]: | ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
warning: [email protected]: /usr/include/c++/13.2.1/bits/move.h:70:1: note: candidate: ‘constexpr _Tp&& std::forward(typename remove_reference<_Functor>::type&) [with _Tp = function<void()>&; typename remove_reference<_Functor>::type = function<void()>&]’
warning: [email protected]: 70 | forward(typename std::remove_reference<_Tp>::type& __t) noexcept
warning: [email protected]: | ^ ~~~
warning: [email protected]: /usr/include/c++/13.2.1/bits/move.h:82:1: note: candidate: ‘constexpr _Tp&& std::forward(typename remove_reference<_Functor>::type&&) [with _Tp = function<void()>&; typename remove_reference<_Functor>::type = function<void()>&]’
warning: [email protected]: 82 | forward(typename std::remove_reference<_Tp>::type&& __t) noexcept
warning: [email protected]: | ^ ~~~
warning: [email protected]: /usr/include/c++/13.2.1/bits/vector.tcc: In instantiation of ‘std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::emplace_back(_Args&& ...) [with _Args = {std::thread&}; _Tp = std::thread; _Alloc = std::allocator<std::thread>; reference = std::thread&]’:
warning: [email protected]: /usr/include/c++/13.2.1/bits/stl_vector.h:1296:15: required from ‘void std::vector<_Tp, _Alloc>::push_back(value_type&&) [with _Tp = std::thread; _Alloc = std::allocator<std::thread>; value_type = std::thread]’
warning: [email protected]: /home/huitseeker/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sppark-0.1.5/sppark/util/thread_pool_t.hpp:79:20: required from here
...
Full output below.
huitseeker@sleipnir➜huitseeker/tmp/pasta-msm(main)» nvcc --version [10:04:07]
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Sep__8_19:17:24_PDT_2023
Cuda compilation tools, release 12.3, V12.3.52
Build cuda_12.3.r12.3/compiler.33281558_0
huitseeker@sleipnir➜huitseeker/tmp/pasta-msm(main)» gcc --version [10:06:00]
gcc (GCC) 13.2.1 20230730
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
huitseeker@sleipnir➜huitseeker/tmp/pasta-msm(main)» uname -a [10:06:30]
Linux sleipnir 6.1.24-1-lts #1 SMP PREEMPT_DYNAMIC Thu, 13 Apr 2023 17:22:35 +0000 x86_64 GNU/Linux
any documentation on how to understand the M0 in the Montgomery Reduction? Or which Montgomery Reduction algorithm is implemented? As of my shallow understanding of normal Mongt Implementation. There is no M0 involved. Maybe you are using another trick or algorithm implementation here?
I get an error like
= note: /usr/bin/ld: /tmp/blstmsmlinkerissue/target/debug/deps/libblst_msm-f154847bd3acdcfa.rlib(blst_msm-f154847bd3acdcfa.blst_msm.14511a1f-cgu.6.rcgu.o): in function `blst_msm::multi_scalar_mult':
/home/vmx/.cargo/git/checkouts/sppark-766e0a6fee265121/f939f36/poc/blst-cuda/src/lib.rs:49: undefined reference to `mult_pippenger'
collect2: error: ld returned 1 exit status
= note: some `extern` functions couldn't be found; some native libraries may need to be installed or have their path specified
= note: use the `-l` flag to specify native libraries to link
= note: use the `cargo:rustc-link-lib` directive to specify the native libraries to link with Cargo (see https://doc.rust-lang.org/cargo/reference/build-scripts.html#cargorustc-link-libkindname)
when trying to use blst-msm
as a dependency. Steps to reproduce (you can copy&paste that into your terminal and execute it directly):
cargo init blstmsmlinkerissue
cd blstmsmlinkerissue
cargo add blst
cargo add blst-msm --git https://github.com/supranational/sppark --features bls12_381
cat > src/main.rs << EOF
use blst::{blst_scalar, blst_p1_affine, blst_fp};
fn main() {
let bases = vec![blst_scalar{ b: [0; 32] }];
let exponents = vec![blst_p1_affine{ x: blst_fp { l: [0; 6] }, y: blst_fp { l: [0; 6]} }];
let point = blst_msm::multi_scalar_mult(&exponents, &bases);
println!("point: {:?}", point);
}
EOF
cargo build
I'm on Debian.
Instead of building for alone, WebGPU provides the a wrapper which would be compatible with vulkan, metal, and more. This also allows the library to be called from the web supporting more web based GPU acceleration, opening more edge processing setups.
Is this something that would be worthwhile to build for sppark?
I am pretty new to this, I was trying to setup a repo which has sppark as a dependency I guess and I am getting the below error
error: failed to run custom build command for `sppark v0.1.2 ([https://github.com/supranational/sppark?rev=48f6ac6#48f6ac60)`](https://github.com/supranational/sppark?rev=48f6ac6#48f6ac60))
Caused by:
process didn't exit successfully: `/home/aditya/aleo-prover/target/release/build/sppark-329937d95720d5f1/build-script-build` (exit status: 1)
--- stdout
cargo:rerun-if-changed=/home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6/ec
cargo:rerun-if-changed=/home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6/ff
cargo:rerun-if-changed=/home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6/ntt
cargo:rerun-if-changed=/home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6/msm
cargo:rerun-if-changed=/home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6/util
cargo:ROOT=/home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6
TARGET = Some("x86_64-unknown-linux-gnu")
OPT_LEVEL = Some("3")
HOST = Some("x86_64-unknown-linux-gnu")
cargo:rerun-if-env-changed=CXX_x86_64-unknown-linux-gnu
CXX_x86_64-unknown-linux-gnu = None
cargo:rerun-if-env-changed=CXX_x86_64_unknown_linux_gnu
CXX_x86_64_unknown_linux_gnu = None
cargo:rerun-if-env-changed=HOST_CXX
HOST_CXX = None
cargo:rerun-if-env-changed=CXX
CXX = None
cargo:rerun-if-env-changed=NVCC_x86_64-unknown-linux-gnu
NVCC_x86_64-unknown-linux-gnu = None
cargo:rerun-if-env-changed=NVCC_x86_64_unknown_linux_gnu
NVCC_x86_64_unknown_linux_gnu = None
cargo:rerun-if-env-changed=HOST_NVCC
HOST_NVCC = None
cargo:rerun-if-env-changed=NVCC
NVCC = None
cargo:rerun-if-env-changed=CXXFLAGS_x86_64-unknown-linux-gnu
CXXFLAGS_x86_64-unknown-linux-gnu = None
cargo:rerun-if-env-changed=CXXFLAGS_x86_64_unknown_linux_gnu
CXXFLAGS_x86_64_unknown_linux_gnu = None
cargo:rerun-if-env-changed=HOST_CXXFLAGS
HOST_CXXFLAGS = None
cargo:rerun-if-env-changed=CXXFLAGS
CXXFLAGS = None
cargo:rerun-if-env-changed=CRATE_CC_NO_DEFAULTS
CRATE_CC_NO_DEFAULTS = None
DEBUG = Some("false")
CARGO_CFG_TARGET_FEATURE = Some("fxsr,sse,sse2")
running: "nvcc" "-ccbin=c++" "-Xcompiler" "-O3" "-Xcompiler" "-ffunction-sections" "-Xcompiler" "-fdata-sections" "-Xcompiler" "-fPIC" "-m64" "-I" "/home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6" "-Xcompiler" "-Wall" "-Xcompiler" "-Wextra" "-o" "/home/aditya/aleo-prover/target/release/build/sppark-ce7f4abd18d08224/out/src/lib.o" "-c" "--device-c" "src/lib.cpp"
cargo:warning=In file included from /home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6/util/gpu_t.cuh:9,
cargo:warning= from src/lib.cpp:2:
cargo:warning=/home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6/util/gpu_t.cuh: In member function ‘void* stream_t::Dmalloc(size_t) const’:
cargo:warning=/home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6/util/gpu_t.cuh:29:17: error: ‘cudaMallocAsync’ was not declared in this scope; did you mean ‘cudaMallocArray’?
cargo:warning= 29 | CUDA_OK(cudaMallocAsync(&d_ptr, sz, stream));
cargo:warning= | ^~~~~~~~~~~~~~~
cargo:warning=/home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6/util/exception.cuh:32:24: note: in definition of macro ‘CUDA_OK’
cargo:warning= 32 | cudaError_t code = expr; \
cargo:warning= | ^~~~
cargo:warning=/home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6/util/gpu_t.cuh: In member function ‘void stream_t::Dfree(void*) const’:
cargo:warning=/home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6/util/gpu_t.cuh:33:17: error: ‘cudaFreeAsync’ was not declared in this scope; did you mean ‘cudaFreeArray’?
cargo:warning= 33 | { CUDA_OK(cudaFreeAsync(d_ptr, stream)); }
cargo:warning= | ^~~~~~~~~~~~~
cargo:warning=/home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6/util/exception.cuh:32:24: note: in definition of macro ‘CUDA_OK’
cargo:warning= 32 | cudaError_t code = expr; \
cargo:warning= | ^~~~
cargo:warning=/home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6/util/gpu_t.cuh: In constructor ‘pin_t<T>::pin_t(const T*, size_t)’:
cargo:warning=/home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6/util/gpu_t.cuh:230:59: error: ‘cudaHostRegisterReadOnly’ was not declared in this scope; did you mean ‘cudaHostRegisterDefault’?
cargo:warning= 230 | cudaHostRegisterPortable|cudaHostRegisterReadOnly));
cargo:warning= | ^~~~~~~~~~~~~~~~~~~~~~~~
cargo:warning=/home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6/util/exception.cuh:32:24: note: in definition of macro ‘CUDA_OK’
cargo:warning= 32 | cudaError_t code = expr; \
cargo:warning= | ^~~~
exit status: 1
--- stderr
error occurred: Command "nvcc" "-ccbin=c++" "-Xcompiler" "-O3" "-Xcompiler" "-ffunction-sections" "-Xcompiler" "-fdata-sections" "-Xcompiler" "-fPIC" "-m64" "-I" "/home/aditya/.cargo/git/checkouts/sppark-766e0a6fee265121/48f6ac6" "-Xcompiler" "-Wall" "-Xcompiler" "-Wextra" "-o" "/home/aditya/aleo-prover/target/release/build/sppark-ce7f4abd18d08224/out/src/lib.o" "-c" "--device-c" "src/lib.cpp" with args "nvcc" did not execute successfully (status code exit status: 1).
warning: build failed, waiting for other jobs to finish...
for example, if GL64_PARTIALLY_REDUCED not defined.
let mut a: u64 = 0xffffffff_f0000fff;
let mut b: u64 = 0xffffffff_f0000fff;
the expected output should be 0x1e0001ffc. however, the actual result is 0xffffffffe0001ffe, after reduction, it is 0xe0001ffd, result is not matched
I am wondering how can I get the parameters of pasta_curve like
forward_roots_of_unity
,inverse_roots_of_unity
,domain_size_inverse
. Because I am based on the pasta curve.
This is more of a question than an issue, but I ask it here to prevent double efforts.
Are there plans to create a Rust binding for sppark, similar to the binding blst has? Or is the plan that Rust users would always create their own custom crate similar to the PoCs or pasta-msm
?
When I call mult_pippenger()
from Rust with <= 512 bases I get a segfault: signal: 11, SIGSEGV: invalid memory reference
. It would be cool if such limitations would be documented.
Hi,
I am trying to build ntt
part of sppark as independent C/C++ project for a project. However, the type declaration/definition for vec256 is no where to be found under sppark/
project. I even search into the codebase of [blst](https://github.com/supranational/blst/tree/master)
and still I couldn't find it either.
Building sppark from cargo is no problem at all. However, as I mentioned that I am trying to build the ntt portion independently as C/C++/Cuda project and the type declaration for vec256 is no where to be found which would enable me to be able to successfully build.
I would greatly appreciate your response
deps/sppark/ff/mont_t.cuh(729): error: expected a ">"
for (i = 0; i < n; i += 2) {
^
deps/sppark/ff/mont_t.cuh(735): error: expression must be a modifiable lvalue
: "+r"(odd[n+i-1]), "+r"(carry));
^
deps/sppark/ff/mont_t.cuh(735): error: an asm operand must have scalar type
: "+r"(odd[n+i-1]), "+r"(carry));
^
deps/sppark/ff/mont_t.cuh(742): error: expression must be a modifiable lvalue
: "+r"(even[n+i+1]), "+r"(carry));
^
deps/sppark/ff/mont_t.cuh(742): error: an asm operand must have scalar type
: "+r"(even[n+i+1]), "+r"(carry));
^
6 errors detected in the compilation of "pc2/cuda/pc2.cu".
deps/sppark/ff/mont_t.cuh(721): error: identifier "i" is undefined
even[i] = even[i+1] = 0;
^
deps/sppark/ff/mont_t.cuh(729): error: expected a ">"
for (i = 0; i < n; i += 2) {
^
deps/sppark/ff/mont_t.cuh(735): error: expression must be a modifiable lvalue
: "+r"(odd[n+i-1]), "+r"(carry));
^
deps/sppark/ff/mont_t.cuh(735): error: an asm operand must have scalar type
: "+r"(odd[n+i-1]), "+r"(carry));
^
deps/sppark/ff/mont_t.cuh(742): error: expression must be a modifiable lvalue
: "+r"(even[n+i+1]), "+r"(carry));
^
deps/sppark/ff/mont_t.cuh(742): error: an asm operand must have scalar type
: "+r"(even[n+i+1]), "+r"(carry));
^
6 errors detected in the compilation of "pc2/cuda/pc2.cu".
deps/sppark/ff/mont_t.cuh(721): error: identifier "i" is undefined
even[i] = even[i+1] = 0;
^
deps/sppark/ff/mont_t.cuh(729): error: expected a ">"
for (i = 0; i < n; i += 2) {
^
deps/sppark/ff/mont_t.cuh(735): error: expression must be a modifiable lvalue
: "+r"(odd[n+i-1]), "+r"(carry));
^
deps/sppark/ff/mont_t.cuh(735): error: an asm operand must have scalar type
: "+r"(odd[n+i-1]), "+r"(carry));
^
deps/sppark/ff/mont_t.cuh(742): error: expression must be a modifiable lvalue
: "+r"(even[n+i+1]), "+r"(carry));
^
deps/sppark/ff/mont_t.cuh(742): error: an asm operand must have scalar type
: "+r"(even[n+i+1]), "+r"(carry));
^
6 errors detected in the compilation of "pc2/cuda/pc2.cu".
deps/sppark/ff/mont_t.cuh(721): error: identifier "i" is undefined
even[i] = even[i+1] = 0;
^
deps/sppark/ff/mont_t.cuh(729): error: expected a ">"
for (i = 0; i < n; i += 2) {
^
deps/sppark/ff/mont_t.cuh(735): error: expression must be a modifiable lvalue
: "+r"(odd[n+i-1]), "+r"(carry));
^
deps/sppark/ff/mont_t.cuh(735): error: an asm operand must have scalar type
: "+r"(odd[n+i-1]), "+r"(carry));
^
deps/sppark/ff/mont_t.cuh(742): error: expression must be a modifiable lvalue
: "+r"(even[n+i+1]), "+r"(carry));
^
deps/sppark/ff/mont_t.cuh(742): error: an asm operand must have scalar type
: "+r"(even[n+i+1]), "+r"(carry));
^
6 errors detected in the compilation of "pc2/cuda/pc2.cu".
i am testing for gl64 NTT with log_n_size=17, under concurent environment.
i observed that the data copy from host to device ranges from 20us to 6ms. i think the underlying code does not utilise aync. the last line gpu.sync() will block CPU.
gpu.select();
size_t domain_size = (size_t)1 << lg_domain_size;
dev_ptr_t<fr_t> d_inout{domain_size, gpu};
gpu.HtoD(&d_inout[0], inout, domain_size);
NTT_internal(&d_inout[0], lg_domain_size, order, direction, type, gpu,
coset_ext_pow);
gpu.DtoH(inout, &d_inout[0], domain_size);
gpu.sync();
or, it would be better to provide a batch NTT function
I'm trying to combine MSM and NTT into a single kernel. I use the code from the POCs as an inspiration. The problem I'm currently hitting is, that if I include ec/jacobian_t.hpp
, then I cannot compile the NTT code anymore. See below for the errors I'm getting.
It's easy to reproduce, just add #include <ec/jacobian_t.hpp>
to the NTT POC.
Compiling ntt-cuda v0.1.0 (/home/vmx/src/pl/filecoin/upstream/sppark/poc/ntt-cuda)
The following warnings were emitted during compilation:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2144): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__hdiv") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2145): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__hdiv") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2259): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__internal_trig_reduction_kernel") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2260): error: calling a __device__ function("__float_as_uint") from a __host__ function("__internal_trig_reduction_kernel") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2261): error: calling a __device__ function("__fsub_rn") from a __host__ function("__internal_trig_reduction_kernel") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2262): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__internal_trig_reduction_kernel") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2263): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__internal_trig_reduction_kernel") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2297): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__internal_sin_cos_kernel") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2298): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__internal_sin_cos_kernel") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2299): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__internal_sin_cos_kernel") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2300): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__internal_sin_cos_kernel") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/driver_abi.h(60): error: calling a __device__ function("__trap") from a __host__ function("load_env_reg") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(69): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690814__barrier_syncE1?") from a __host__ function("sync_grids") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(80): error: calling a __device__ function("__threadfence") from a __host__ function("sync_grids") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(82): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069089atomicAddE1?1?") from a __host__ function("sync_grids") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(86): error: calling a __device__ function("__threadfence") from a __host__ function("sync_grids") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(98): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690814__barrier_syncE1?") from a __host__ function("sync_grids") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(107): error: calling a __device__ function("__threadfence_block") from a __host__ function("atom_or_acq_rel_cta") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(108): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069088atomicOrE1?1?") from a __host__ function("atom_or_acq_rel_cta") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(118): error: calling a __device__ function("__threadfence_block") from a __host__ function("red_or_release_cta") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(119): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069088atomicOrE1?1?") from a __host__ function("red_or_release_cta") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(128): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069089atomicAndE1?1?") from a __host__ function("red_and_relaxed_cta") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(138): error: calling a __device__ function("__threadfence_block") from a __host__ function("red_and_release_cta") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(139): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069089atomicAndE1?1?") from a __host__ function("red_and_release_cta") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(150): error: calling a __device__ function("__threadfence_block") from a __host__ function("ld_acquire_cta") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(175): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690810__syncwarpE1?") from a __host__ function("sync_warps") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(187): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690810__syncwarpE1?") from a __host__ function("sync_warps") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(198): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690810__syncwarpE1?") from a __host__ function("sync_warps_last_releases") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(204): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690811__shfl_syncE1?1?1?1?") from a __host__ function("sync_warps_last_releases") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(229): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690810__syncwarpE1?") from a __host__ function("sync_warps_arrive") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(255): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690810__syncwarpE1?") from a __host__ function("sync_warps_reset") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(87): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690814__barrier_syncE1?") from a __host__ function("sync") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(283): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690811__shfl_syncE1?1?1?1?") from a __host__ function("operator()") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(292): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690816__shfl_down_syncE1?1?1?1?") from a __host__ function("operator()") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(301): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690814__shfl_up_syncE1?1?1?1?") from a __host__ function("operator()") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(310): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690815__shfl_xor_syncE1?1?1?1?") from a __host__ function("operator()") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(418): error: calling a __device__ function("__clusterGridDimInClusters()") from a __host__ function("dim_clusters") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(427): error: calling a __device__ function("__clusterIdx()") from a __host__ function("cluster_index") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(453): error: calling a __device__ function("cudaCGGetIntrinsicHandle") from a __host__ function("get_intrinsic_handle") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(458): error: calling a __device__ function("cudaCGSynchronize") from a __host__ function("sync") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(464): error: calling a __device__ function("cudaCGGetSize") from a __host__ function("size") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(471): error: calling a __device__ function("cudaCGGetRank") from a __host__ function("thread_rank") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(478): error: calling a __device__ function("cudaCGGetRank") from a __host__ function("grid_rank") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(485): error: calling a __device__ function("cudaCGGetSize") from a __host__ function("num_grids") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(501): error: a static "__constant__" variable declaration is not allowed inside a host function body
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(521): error: calling a __device__ function("__clusterDimIsSpecified()") from a __host__ function("isReal") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(526): error: calling a __device__ function("__cluster_barrier_arrive()") from a __host__ function("barrier_arrive") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(531): error: calling a __device__ function("__cluster_barrier_wait()") from a __host__ function("barrier_wait") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(542): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690827__cluster_query_shared_rankE1?") from a __host__ function("query_shared_rank") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(553): error: calling a __device__ function("__clusterRelativeBlockIdx()") from a __host__ function("block_index") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(558): error: calling a __device__ function("__clusterRelativeBlockRank()") from a __host__ function("block_rank") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(568): error: calling a __device__ function("__clusterDim()") from a __host__ function("dim_blocks") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(573): error: calling a __device__ function("__clusterSizeInBlocks()") from a __host__ function("num_blocks") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(616): error: calling a __device__ function("__trap") from a __host__ function("abort") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(262): error: calling a __device__ function("__trap") from a __host__ function("sync") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(340): error: calling a __device__ function("__trap") from a __host__ function("sync") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(598): error: calling a __device__ function("__popc") from a __host__ function("_get_tiled_threads") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(749): error: calling a __device__ function("__popc") from a __host__ function("coalesced_group") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(772): error: calling a __device__ function("__popc") from a __host__ function("thread_rank") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(786): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690810__syncwarpE1?") from a __host__ function("sync") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(792): error: calling a __device__ function("__ffs") from a __host__ function("shfl") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(793): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069085__fnsE1?1?1?") from a __host__ function("shfl") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(806): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069085__fnsE1?1?1?") from a __host__ function("shfl_down") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(822): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069085__fnsE1?1?1?") from a __host__ function("shfl_up") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(862): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690813__ballot_syncE1?1?") from a __host__ function("any") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(865): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690813__ballot_syncE1?1?") from a __host__ function("all") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(869): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690813__ballot_syncE1?1?") from a __host__ function("ballot") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(871): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690813__ballot_syncE1?1?") from a __host__ function("ballot") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(903): error: calling a __device__ function("__activemask()") from a __host__ function("coalesced_threads") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(1410): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690810__any_syncE1?1?") from a __host__ function("operator()") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(1421): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690810__all_syncE1?1?") from a __host__ function("operator()") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/partitioning.h(66): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690813__ballot_syncE1?1?") from a __host__ function("_binary_partition") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/partitioning.h(85): error: calling a __device__ function("__ffs") from a __host__ function("_labeled_partition") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/partitioning.h(93): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690813__ballot_syncE1?1?") from a __host__ function("_labeled_partition") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/partitioning.h(94): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069085__fnsE1?1?1?") from a __host__ function("_labeled_partition") is not allowed
warning:
warning: /usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/partitioning.h(96): error: calling a __device__ function("__popc") from a __host__ function("_labeled_partition") is not allowed
warning:
warning: /home/vmx/src/pl/filecoin/upstream/sppark/ntt/kernels.cu(14): error: calling a __device__ function("__brev") from a __host__ function("bit_rev") is not allowed
warning:
warning: /home/vmx/src/pl/filecoin/upstream/sppark/ntt/kernels.cu(16): error: calling a __device__ function("__brevll") from a __host__ function("bit_rev") is not allowed
warning:
warning: /home/vmx/src/pl/filecoin/upstream/sppark/ntt/kernels.cu(32): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690815__shfl_xor_syncE1?1?1?1?") from a __host__ function("shfl_bfly") is not allowed
warning:
warning: 79 errors detected in the compilation of "cuda/ntt_api.cu".
error: failed to run custom build command for `ntt-cuda v0.1.0 (/home/vmx/src/pl/filecoin/upstream/sppark/poc/ntt-cuda)`
Caused by:
process didn't exit successfully: `/home/vmx/src/pl/filecoin/upstream/sppark/poc/ntt-cuda/target/debug/build/ntt-cuda-e41d1a0ded9a429a/build-script-build` (exit status: 1)
--- stdout
TARGET = Some("x86_64-unknown-linux-gnu")
OPT_LEVEL = Some("0")
HOST = Some("x86_64-unknown-linux-gnu")
cargo:rerun-if-env-changed=CXX_x86_64-unknown-linux-gnu
CXX_x86_64-unknown-linux-gnu = None
cargo:rerun-if-env-changed=CXX_x86_64_unknown_linux_gnu
CXX_x86_64_unknown_linux_gnu = None
cargo:rerun-if-env-changed=HOST_CXX
HOST_CXX = None
cargo:rerun-if-env-changed=CXX
CXX = None
cargo:rerun-if-env-changed=NVCC_x86_64-unknown-linux-gnu
NVCC_x86_64-unknown-linux-gnu = None
cargo:rerun-if-env-changed=NVCC_x86_64_unknown_linux_gnu
NVCC_x86_64_unknown_linux_gnu = None
cargo:rerun-if-env-changed=HOST_NVCC
HOST_NVCC = None
cargo:rerun-if-env-changed=NVCC
NVCC = None
cargo:rerun-if-env-changed=CXXFLAGS_x86_64-unknown-linux-gnu
CXXFLAGS_x86_64-unknown-linux-gnu = None
cargo:rerun-if-env-changed=CXXFLAGS_x86_64_unknown_linux_gnu
CXXFLAGS_x86_64_unknown_linux_gnu = None
cargo:rerun-if-env-changed=HOST_CXXFLAGS
HOST_CXXFLAGS = None
cargo:rerun-if-env-changed=CXXFLAGS
CXXFLAGS = None
cargo:rerun-if-env-changed=CRATE_CC_NO_DEFAULTS
CRATE_CC_NO_DEFAULTS = None
DEBUG = Some("true")
CARGO_CFG_TARGET_FEATURE = Some("fxsr,sse,sse2")
running: "nvcc" "-ccbin=c++" "-Xcompiler" "-O0" "-Xcompiler" "-ffunction-sections" "-Xcompiler" "-fdata-sections" "-Xcompiler" "-fPIC" "-G" "-Xcompiler" "-gdwarf-4" "-Xcompiler" "-fno-omit-frame-pointer" "-m64" "-I" "/home/vmx/.cargo/registry/src/github.com-1ecc6299db9ec823/blst-0.3.10/blst/src" "-I" "/home/vmx/src/pl/filecoin/upstream/sppark" "-Xcompiler" "-Wall" "-Xcompiler" "-Wextra" "-arch=sm_70" "-Xcompiler" "-Wno-unused-parameter" "-Xcompiler" "-Wno-subobject-linkage" "-DTAKE_RESPONSIBILITY_FOR_ERROR_MESSAGE" "-DFEATURE_BLS12_381" "-o" "/home/vmx/src/pl/filecoin/upstream/sppark/poc/ntt-cuda/target/debug/build/ntt-cuda-5f054fafddbbbff8/out/cuda/ntt_api.o" "-c" "cuda/ntt_api.cu"
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2144): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__hdiv") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2145): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__hdiv") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2259): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__internal_trig_reduction_kernel") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2260): error: calling a __device__ function("__float_as_uint") from a __host__ function("__internal_trig_reduction_kernel") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2261): error: calling a __device__ function("__fsub_rn") from a __host__ function("__internal_trig_reduction_kernel") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2262): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__internal_trig_reduction_kernel") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2263): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__internal_trig_reduction_kernel") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2297): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__internal_sin_cos_kernel") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2298): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__internal_sin_cos_kernel") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2299): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__internal_sin_cos_kernel") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp(2300): error: calling a __device__ function("__fmaf_rn") from a __host__ function("__internal_sin_cos_kernel") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/driver_abi.h(60): error: calling a __device__ function("__trap") from a __host__ function("load_env_reg") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(69): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690814__barrier_syncE1?") from a __host__ function("sync_grids") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(80): error: calling a __device__ function("__threadfence") from a __host__ function("sync_grids") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(82): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069089atomicAddE1?1?") from a __host__ function("sync_grids") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(86): error: calling a __device__ function("__threadfence") from a __host__ function("sync_grids") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(98): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690814__barrier_syncE1?") from a __host__ function("sync_grids") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(107): error: calling a __device__ function("__threadfence_block") from a __host__ function("atom_or_acq_rel_cta") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(108): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069088atomicOrE1?1?") from a __host__ function("atom_or_acq_rel_cta") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(118): error: calling a __device__ function("__threadfence_block") from a __host__ function("red_or_release_cta") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(119): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069088atomicOrE1?1?") from a __host__ function("red_or_release_cta") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(128): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069089atomicAndE1?1?") from a __host__ function("red_and_relaxed_cta") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(138): error: calling a __device__ function("__threadfence_block") from a __host__ function("red_and_release_cta") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(139): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069089atomicAndE1?1?") from a __host__ function("red_and_release_cta") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(150): error: calling a __device__ function("__threadfence_block") from a __host__ function("ld_acquire_cta") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(175): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690810__syncwarpE1?") from a __host__ function("sync_warps") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(187): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690810__syncwarpE1?") from a __host__ function("sync_warps") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(198): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690810__syncwarpE1?") from a __host__ function("sync_warps_last_releases") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(204): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690811__shfl_syncE1?1?1?1?") from a __host__ function("sync_warps_last_releases") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(229): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690810__syncwarpE1?") from a __host__ function("sync_warps_arrive") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/sync.h(255): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690810__syncwarpE1?") from a __host__ function("sync_warps_reset") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(87): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690814__barrier_syncE1?") from a __host__ function("sync") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(283): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690811__shfl_syncE1?1?1?1?") from a __host__ function("operator()") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(292): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690816__shfl_down_syncE1?1?1?1?") from a __host__ function("operator()") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(301): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690814__shfl_up_syncE1?1?1?1?") from a __host__ function("operator()") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(310): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690815__shfl_xor_syncE1?1?1?1?") from a __host__ function("operator()") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(418): error: calling a __device__ function("__clusterGridDimInClusters()") from a __host__ function("dim_clusters") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(427): error: calling a __device__ function("__clusterIdx()") from a __host__ function("cluster_index") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(453): error: calling a __device__ function("cudaCGGetIntrinsicHandle") from a __host__ function("get_intrinsic_handle") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(458): error: calling a __device__ function("cudaCGSynchronize") from a __host__ function("sync") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(464): error: calling a __device__ function("cudaCGGetSize") from a __host__ function("size") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(471): error: calling a __device__ function("cudaCGGetRank") from a __host__ function("thread_rank") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(478): error: calling a __device__ function("cudaCGGetRank") from a __host__ function("grid_rank") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(485): error: calling a __device__ function("cudaCGGetSize") from a __host__ function("num_grids") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(501): error: a static "__constant__" variable declaration is not allowed inside a host function body
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(521): error: calling a __device__ function("__clusterDimIsSpecified()") from a __host__ function("isReal") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(526): error: calling a __device__ function("__cluster_barrier_arrive()") from a __host__ function("barrier_arrive") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(531): error: calling a __device__ function("__cluster_barrier_wait()") from a __host__ function("barrier_wait") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(542): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690827__cluster_query_shared_rankE1?") from a __host__ function("query_shared_rank") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(553): error: calling a __device__ function("__clusterRelativeBlockIdx()") from a __host__ function("block_index") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(558): error: calling a __device__ function("__clusterRelativeBlockRank()") from a __host__ function("block_rank") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(568): error: calling a __device__ function("__clusterDim()") from a __host__ function("dim_blocks") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(573): error: calling a __device__ function("__clusterSizeInBlocks()") from a __host__ function("num_blocks") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/helpers.h(616): error: calling a __device__ function("__trap") from a __host__ function("abort") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(262): error: calling a __device__ function("__trap") from a __host__ function("sync") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(340): error: calling a __device__ function("__trap") from a __host__ function("sync") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(598): error: calling a __device__ function("__popc") from a __host__ function("_get_tiled_threads") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(749): error: calling a __device__ function("__popc") from a __host__ function("coalesced_group") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(772): error: calling a __device__ function("__popc") from a __host__ function("thread_rank") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(786): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690810__syncwarpE1?") from a __host__ function("sync") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(792): error: calling a __device__ function("__ffs") from a __host__ function("shfl") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(793): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069085__fnsE1?1?1?") from a __host__ function("shfl") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(806): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069085__fnsE1?1?1?") from a __host__ function("shfl_down") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(822): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069085__fnsE1?1?1?") from a __host__ function("shfl_up") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(862): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690813__ballot_syncE1?1?") from a __host__ function("any") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(865): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690813__ballot_syncE1?1?") from a __host__ function("all") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(869): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690813__ballot_syncE1?1?") from a __host__ function("ballot") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(871): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690813__ballot_syncE1?1?") from a __host__ function("ballot") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(903): error: calling a __device__ function("__activemask()") from a __host__ function("coalesced_threads") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(1410): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690810__any_syncE1?1?") from a __host__ function("operator()") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups.h(1421): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690810__all_syncE1?1?") from a __host__ function("operator()") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/partitioning.h(66): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690813__ballot_syncE1?1?") from a __host__ function("_binary_partition") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/partitioning.h(85): error: calling a __device__ function("__ffs") from a __host__ function("_labeled_partition") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/partitioning.h(93): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690813__ballot_syncE1?1?") from a __host__ function("_labeled_partition") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/partitioning.h(94): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc069085__fnsE1?1?1?") from a __host__ function("_labeled_partition") is not allowed
cargo:warning=
cargo:warning=/usr/local/cuda/bin/../targets/x86_64-linux/include/cooperative_groups/details/partitioning.h(96): error: calling a __device__ function("__popc") from a __host__ function("_labeled_partition") is not allowed
cargo:warning=
cargo:warning=/home/vmx/src/pl/filecoin/upstream/sppark/ntt/kernels.cu(14): error: calling a __device__ function("__brev") from a __host__ function("bit_rev") is not allowed
cargo:warning=
cargo:warning=/home/vmx/src/pl/filecoin/upstream/sppark/ntt/kernels.cu(16): error: calling a __device__ function("__brevll") from a __host__ function("bit_rev") is not allowed
cargo:warning=
cargo:warning=/home/vmx/src/pl/filecoin/upstream/sppark/ntt/kernels.cu(32): error: calling a __device__ function("_ZN41_INTERNAL_f2530303_10_ntt_api_cu_fcc0690815__shfl_xor_syncE1?1?1?1?") from a __host__ function("shfl_bfly") is not allowed
cargo:warning=
cargo:warning=79 errors detected in the compilation of "cuda/ntt_api.cu".
exit status: 2
--- stderr
error occurred: Command "nvcc" "-ccbin=c++" "-Xcompiler" "-O0" "-Xcompiler" "-ffunction-sections" "-Xcompiler" "-fdata-sections" "-Xcompiler" "-fPIC" "-G" "-Xcompiler" "-gdwarf-4" "-Xcompiler" "-fno-omit-frame-pointer" "-m64" "-I" "/home/vmx/.cargo/registry/src/github.com-1ecc6299db9ec823/blst-0.3.10/blst/src" "-I" "/home/vmx/src/pl/filecoin/upstream/sppark" "-Xcompiler" "-Wall" "-Xcompiler" "-Wextra" "-arch=sm_70" "-Xcompiler" "-Wno-unused-parameter" "-Xcompiler" "-Wno-subobject-linkage" "-DTAKE_RESPONSIBILITY_FOR_ERROR_MESSAGE" "-DFEATURE_BLS12_381" "-o" "/home/vmx/src/pl/filecoin/upstream/sppark/poc/ntt-cuda/target/debug/build/ntt-cuda-5f054fafddbbbff8/out/cuda/ntt_api.o" "-c" "cuda/ntt_api.cu" with args "nvcc" did not execute successfully (status code exit status: 2).
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.