
jcuda's Introduction

jcuda

JCuda - Java bindings for CUDA

Refer to jcuda-main for further information and build instructions.


jcuda's Issues

Where is Jcudpp?

There is documentation about this library on the jcuda website, but I cannot find its source code or a way to add it to my project. Has it been discontinued?

Thread safety of nio ByteBuffer

The JCuda Pointer class uses NIO ByteBuffer to wrap Java arrays. Unfortunately, NIO ByteBuffer is neither thread safe nor does it shield the memory from the garbage collector. This utility class belongs to the nio package and is really only usable in an IO context. The correct way to allocate pinned memory in Java outside the heap is discussed in "Memory allocation outside the heap". I'm afraid the only correct solution involves JNI: memory allocated that way can be both thread safe and unmanaged by the GC. Memory allocated in such a way can be filled by Java code and transferred to CUDA as in C programs, and if it is accessed via synchronized methods it is thread safe too. Unfortunately, I'm a very rusty C programmer and can't take this topic on for implementation.
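This is what allocating pinned (page-locked) host memory outside the Java heap already looks like with JCuda's runtime API - a minimal sketch, assuming that Pointer#getByteBuffer(byteOffset, byteLength) may be used to view such natively allocated memory (see the getByteBuffer issue further below); synchronizing concurrent access would still be up to the caller:

import java.nio.ByteOrder;
import java.nio.FloatBuffer;

import jcuda.Pointer;
import jcuda.Sizeof;
import jcuda.runtime.JCuda;

public class PinnedHostMemorySketch
{
    public static void main(String[] args)
    {
        JCuda.setExceptionsEnabled(true);

        int numFloats = 1024;
        Pointer hostPointer = new Pointer();

        // Allocate page-locked host memory, outside the Java heap and
        // therefore not moved or collected by the GC
        JCuda.cudaMallocHost(hostPointer, numFloats * Sizeof.FLOAT);

        // View it as a FloatBuffer in native byte order and fill it from Java
        FloatBuffer buffer = hostPointer
            .getByteBuffer(0, numFloats * Sizeof.FLOAT)
            .order(ByteOrder.nativeOrder())
            .asFloatBuffer();
        for (int i = 0; i < numFloats; i++)
        {
            buffer.put(i, i);
        }

        // The same hostPointer can now be passed to cudaMemcpy, like in C

        JCuda.cudaFreeHost(hostPointer);
    }
}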

jar missing

Dear 👍
I used the latest version, but an error was reported at runtime. The specific error is as follows:

java.lang.UnsatisfiedLinkError: Error while loading native library "JCudaDriver-0.9.0d-linux-x86_64"
Operating system name: Linux
Architecture : amd64
Architecture bit size: 64
---(start of nested stack traces)---
Stack trace from the attempt to load the library as a file:
java.lang.UnsatisfiedLinkError: no JCudaDriver-0.9.0d-linux-x86_64 in java.library.path
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
    at java.lang.Runtime.loadLibrary0(Runtime.java:870)
    at java.lang.System.loadLibrary(System.java:1122)
    at jcuda.LibUtils.loadLibrary(LibUtils.java:143)
    at jcuda.driver.JCudaDriver.<clinit>(JCudaDriver.java:296)
    at com.alibaba.datax.common.jcuda.JCUDAOperation.<clinit>(JCUDAOperation.java:30)
    at com.alibaba.datax.common.statistics.VMInfo.<init>(VMInfo.java:26)
    at com.alibaba.datax.common.statistics.VMInfo.getVmInfo(VMInfo.java:36)
    at com.alibaba.datax.core.Engine.entry(Engine.java:160)
    at com.alibaba.datax.core.Engine.main(Engine.java:204)
Stack trace from the attempt to load the library as a resource:
java.lang.UnsatisfiedLinkError: /tmp/libJCudaDriver-0.9.0d-linux-x86_64.so: libcuda.so.1: cannot open shared object file: No such file or directory
    at java.lang.ClassLoader$NativeLibrary.load(Native Method)
    at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
    at java.lang.Runtime.load0(Runtime.java:809)
    at java.lang.System.load(System.java:1086)
    at jcuda.LibUtils.loadLibraryResource(LibUtils.java:260)
    at jcuda.LibUtils.loadLibrary(LibUtils.java:158)
    at jcuda.driver.JCudaDriver.<clinit>(JCudaDriver.java:296)
    at com.alibaba.datax.common.jcuda.JCUDAOperation.<clinit>(JCUDAOperation.java:30)
    at com.alibaba.datax.common.statistics.VMInfo.<init>(VMInfo.java:26)
    at com.alibaba.datax.common.statistics.VMInfo.getVmInfo(VMInfo.java:36)
    at com.alibaba.datax.core.Engine.entry(Engine.java:160)
    at com.alibaba.datax.core.Engine.main(Engine.java:204)
---(end of nested stack traces)---

    at jcuda.LibUtils.loadLibrary(LibUtils.java:193)
    at jcuda.driver.JCudaDriver.<clinit>(JCudaDriver.java:296)
    at com.alibaba.datax.common.jcuda.JCUDAOperation.<clinit>(JCUDAOperation.java:30)
    at com.alibaba.datax.common.statistics.VMInfo.<init>(VMInfo.java:26)
    at com.alibaba.datax.common.statistics.VMInfo.getVmInfo(VMInfo.java:36)
    at com.alibaba.datax.core.Engine.entry(Engine.java:160)
    at com.alibaba.datax.core.Engine.main(Engine.java:204)
However, I didn't find the jar online.
I need help!

String representations for flags are not always created properly

In JCuda, the classes that correspond to an enum in C consist of a set of constants like CONSTANT = 123. They also have a stringFor method that receives a value and returns a string representation of the constant name.

However, there are some enum types in CUDA that are not really enumerations, but flags. The constant values are then usually 0x01, 0x02, 0x04 and so on. These values can be ORed together to obtain, for example, information about a set of supported features.

In some cases, this is already handled properly: the stringFor method creates a string containing all flags.

But in other cases, the stringFor method just expects a single constant value.

The "flag enums" should consistently return a string containing all flags.

This information cannot be derived directly from the source code. A heuristic that could help to automate this would be: Check whether for a certain enum type, the constants have a "pairwise disjoint" bit representation (i.e. (Cx & Cy) == 0). But this would still require a review, for special constants like ALL=0xFF or READ=1,WRITE=2,READ_WRITE=3.
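A rough sketch of that heuristic, checking via reflection whether the public static final int constants of a given class have pairwise disjoint bit patterns (the class to inspect is supplied by the caller; combined constants such as READ_WRITE would deliberately fail the check and still need a manual review):

import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class FlagHeuristic
{
    /**
     * Returns whether all public static final int constants of the given
     * class have pairwise disjoint bits, i.e. (Cx & Cy) == 0 for x != y.
     */
    public static boolean looksLikeFlags(Class<?> enumClass)
        throws IllegalAccessException
    {
        List<Integer> values = new ArrayList<Integer>();
        for (Field f : enumClass.getDeclaredFields())
        {
            int m = f.getModifiers();
            if (Modifier.isPublic(m) && Modifier.isStatic(m)
                && Modifier.isFinal(m) && f.getType() == int.class)
            {
                values.add(f.getInt(null));
            }
        }
        for (int i = 0; i < values.size(); i++)
        {
            for (int j = i + 1; j < values.size(); j++)
            {
                if ((values.get(i) & values.get(j)) != 0)
                {
                    return false;
                }
            }
        }
        return true;
    }
}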

nvrtc not found on linux

While generating the makefile with cmake on linux, I have the following error.

CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_nvrtc_LIBRARY
linked by target "JNvrtc" in directory /home/nicblanc/JCuda/jcuda/JNvrtcJNI

I linked nvrtc manually in the CMakeLists.txt file to bypass this issue.

Add support for stream callbacks

The stream callbacks (see CUDA documentation) are one of the main functionalities that are not yet supported in JCuda. Calling a function to set up a stream callback will cause an UnsupportedOperationException.

Although there seemed to be little demand for this functionality in the past, it is a possibly important feature for certain forms of synchronization. The fact that it was introduced in CUDA 5.0 in 2012 and is still not supported by JCuda is embarrassing.

The goal of this issue is to trigger the process of actually implementing them, track the progress and discuss possible related issues.

JCuda maven build fails due to Maven javadoc plugin

With the recent javadoc plugin, the JCuda Maven build fails due to a spurious issue (missing javadoc comments).
This can easily be solved by passing Xdoclint options to the maven-javadoc-plugin, for example:

-Xdoclint:none
-Xdoclint:-missing


@jcuda , please update pom.xml and merge with release branch

Can we rename the development branch to main?

Hi Marco,

Some of the projects refer to code from this repo.
Is it possible to change the branch name to something
that you prefer? (perhaps main!)

Thanks a lot and Regards,
Janardhan

JCuda JNvrtc-11.2.0-windows-x86_64.dll dependencies

I am on Win7 Ultimate x64 SP2, trying to run some JCuda examples (11.2.0) in NetBeans, but it seems some files need other dependency files, and there is no mention of what they are supposed to be.

I tried the Dependency Walker tool, which shows that the file JNvrtc-11.2.0-windows-x86_64.dll has just two DLL dependencies, namely nvrtc64_102_0.dll from CUDA Toolkit 10.2 and the system Kernel32.dll.

I have installed CUDA Toolkit 10.2 and added its path to the PATH environment variable (so the required CUDA files can be found); all other JCuda related/required JARs are also included in the project.

I also added the correct option to the VM options of the project so it can find those native DLLs (correct relative path, the files are loaded - tested, no problem here): -Djava.library.path="lib"

Yet it still complains that JNvrtc-11.2.0-windows-x86_64.dll is missing some other dependency file(s); see the error output from NetBeans below.

run:
Initializing
Exception in thread "main" java.lang.UnsatisfiedLinkError: Error while loading native library "JNvrtc-11.2.0-windows-x86_64"
Operating system name: Windows 7
Architecture         : amd64
Architecture bit size: 64
---(start of nested stack traces)---
Stack trace from the attempt to load the library as a file:
java.lang.UnsatisfiedLinkError: Z:\_JAVA_\TEST (JCuda examples)\lib\JNvrtc-11.2.0-windows-x86_64.dll: Can't find dependent libraries
    at java.lang.ClassLoader$NativeLibrary.load(Native Method)
    at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1857)
    at java.lang.Runtime.loadLibrary0(Runtime.java:870)
    at java.lang.System.loadLibrary(System.java:1122)
    at jcuda.LibUtils.loadLibrary(LibUtils.java:168)
    at jcuda.LibUtilsCuda.loadLibrary(LibUtilsCuda.java:68)
    at jcuda.nvrtc.JNvrtc.<clinit>(JNvrtc.java:60)
    at jcuda.driver.samples.JCudaDriverBasicGraphExample.initialize(JCudaDriverBasicGraphExample.java:226)
    at jcuda.driver.samples.JCudaDriverBasicGraphExample.main(JCudaDriverBasicGraphExample.java:92)
Stack trace from the attempt to load the library as a resource:
java.lang.UnsatisfiedLinkError: C:\Users\1\AppData\Local\Temp\JNvrtc-11.2.0-windows-x86_64.dll: Can't find dependent libraries
    at java.lang.ClassLoader$NativeLibrary.load(Native Method)
    at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
    at java.lang.Runtime.load0(Runtime.java:809)
    at java.lang.System.load(System.java:1086)
    at jcuda.LibUtils.loadLibraryResource(LibUtils.java:285)
    at jcuda.LibUtils.loadLibrary(LibUtils.java:183)
    at jcuda.LibUtilsCuda.loadLibrary(LibUtilsCuda.java:68)
    at jcuda.nvrtc.JNvrtc.<clinit>(JNvrtc.java:60)
    at jcuda.driver.samples.JCudaDriverBasicGraphExample.initialize(JCudaDriverBasicGraphExample.java:226)
    at jcuda.driver.samples.JCudaDriverBasicGraphExample.main(JCudaDriverBasicGraphExample.java:92)
---(end of nested stack traces)---

    at jcuda.LibUtils.loadLibrary(LibUtils.java:218)
    at jcuda.LibUtilsCuda.loadLibrary(LibUtilsCuda.java:68)
    at jcuda.nvrtc.JNvrtc.<clinit>(JNvrtc.java:60)
    at jcuda.driver.samples.JCudaDriverBasicGraphExample.initialize(JCudaDriverBasicGraphExample.java:226)
    at jcuda.driver.samples.JCudaDriverBasicGraphExample.main(JCudaDriverBasicGraphExample.java:92)
C:\Users\1\AppData\Local\NetBeans\Cache\12.4\executor-snippets\run.xml:111: The following error occurred while executing this line:
C:\Users\1\AppData\Local\NetBeans\Cache\12.4\executor-snippets\run.xml:94: Java returned: 1
BUILD FAILED (total time: 0 seconds)

Can anyone explain to me what those other required dependency files are that JNvrtc-11.2.0-windows-x86_64.dll needs?

Post with images can be found at StackOverflow

JCuda Pointer vs CudaMallocHost

It looks like the memory allocation mechanism implemented in JCuda contradicts the native CUDA host memory allocation mechanism. It will also contradict the unified memory architecture. Host memory could be allocated using cudaMallocHost, and access to the allocated memory could be provided via Pointer abstraction primitives:

Pointer constructor allocates memory
Pointer.readAs(offsetInTypeUnit, sizeInTypeUnit) reads data directly
Pointer.writeAs(offsetInTypeUnit, sizeInTypeUnit) writes data into the allocated memory directly
etc.
Every thread pool executor or function run can claim several memory regions for exclusive(?) use

Feature request: Buffer API from Jcuda Pointer

A JCuda Pointer to host-allocated memory is in most cases backed by a Java NIO buffer, but it strips out the Buffer API and only exposes a Pointer API, which is not directly usable anywhere else in Java. Is it possible to bring the extremely powerful and useful Buffer API back from a memory Pointer? It would enable freely writing and reading to/from memory, instead of statically allocating potentially huge arrays ahead of time.
What about sparse matrices, which can't be implemented as a Java array?
P.S. A Tensor Java object could be borrowed from the TensorFlow Java API - it is the most generic data structure in the world.

Cuda 12

Howdy,
I am on to Cuda 12, and more fun awaits.

From Stack overflow:
CUDA 12.0 [dropped support](https://forums.developer.nvidia.com/t/cuda-12-0-still-support-for-texture-reference-support-for-pascal-architecture-warp-synchronous-programming/237284/1) for legacy texture references. Therefore, any code that uses legacy texture references can no longer be properly compiled with CUDA 12.0 or beyond.

Legacy texture reference usage has been deprecated for some time now.

As indicated in the comments, by reverting to CUDA 11.x where legacy texture references are still supported (albeit deprecated) you won't run into this issue.

The other option may happen some day when OpenCV converts usage of legacy texture references to [texture object methods](https://developer.nvidia.com/blog/cuda-pro-tip-kepler-texture-objects-improve-performance-and-flexibility/). In that case, it may then be possible to use CUDA 12.0 or a newer CUDA toolkit to compile OpenCV/CUDA functionality.

There is no work around to somehow allow texture reference usage to be compiled properly with CUDA 12.0 and beyond.

Likewise, this limitation is not unique or specific to OpenCV. Any CUDA code that uses texture references can no longer be compiled properly with CUDA 12.0 and beyond. The options are to refactor that code with texture object usage instead, or revert to a previous CUDA toolkit that still has the deprecated support for texture reference usage.

Which is odd, because the specs still list maximum limits for texture references for the various GPUs, do they not?

At any rate, this NVIDIA post explains how to migrate to texture objects instead, which seems to be their preferred path:
https://developer.nvidia.com/blog/cuda-pro-tip-kepler-texture-objects-improve-performance-and-flexibility/

cuMemAlloc & cuMemcpyHtoD doesn't work

I'm trying to get a program for matrix multiplication, addition, and a sigmoid function to work.

I made a test where I set everything up and ran it successfully.
Then I implemented that into the actual program and it stopped working.
I looked at everything: both functions, cuMemAlloc and cuMemcpyHtoD, return 201, and the CUdeviceptr that they should define still has an empty pointer.

When I run the program it doesn't give an error, but the function:

float[] layer(float[] input, float[] weights, float[] baises, int[] wd) {

   	// Allocate the device input data, and copy the
   	// host input data to the device
   	CUdeviceptr inputInput = new CUdeviceptr();
   	cuMemAlloc(inputInput,input.length * Sizeof.FLOAT);
   	cuMemcpyHtoD(inputInput, Pointer.to(input), input.length * Sizeof.FLOAT);

   	CUdeviceptr inputWeights = new CUdeviceptr();
   	cuMemAlloc(inputWeights, weights.length * Sizeof.FLOAT);
   	cuMemcpyHtoD(inputWeights, Pointer.to(weights), weights.length * Sizeof.FLOAT);

   	CUdeviceptr inputBaises = new CUdeviceptr();
   	cuMemAlloc(inputBaises, baises.length * Sizeof.FLOAT);
   	cuMemcpyHtoD(inputBaises, Pointer.to(baises), baises.length * Sizeof.FLOAT);

   	// Allocate device output memory
   	CUdeviceptr deviceOutput = new CUdeviceptr();
   	cuMemAlloc(deviceOutput, baises.length * Sizeof.FLOAT);

   	// Set up the kernel parameters: A pointer to an array
   	// of pointers which point to the actual values.
   	Pointer kernelParameters = Pointer.to(
                               Pointer.to(new int[] { wd[0] }), 
                               Pointer.to(new int[] { wd[1] }),
   			Pointer.to(inputInput), 
                               Pointer.to(inputWeights), 
                               Pointer.to(inputBaises), 
                               Pointer.to(deviceOutput));

   	// Call the kernel function.
   	int blockSizeX = 1000;
   	int gridSizeX = (int) Math.ceil((double) baises.length / blockSizeX);
   	cuLaunchKernel(AIEngine.LayerFunction, 
                               gridSizeX, 1, 1, // Grid dimension
   			blockSizeX, 1, 1, // Block dimension
   			0, null, // Shared memory size and stream
   			kernelParameters, null // Kernel- and extra parameters
   	);

   	float hostOutput[] = new float[baises.length];
   	cuMemcpyDtoH(Pointer.to(hostOutput), deviceOutput, baises.length * Sizeof.FLOAT);

   	// Clean up.
   	cuMemFree(inputInput);
   	cuMemFree(inputWeights);
   	cuMemFree(inputBaises);
   	cuMemFree(deviceOutput);
   	return hostOutput;
   }

gives an empty array.

The kernel gets loaded in another class, but I already tried it in the same class and it didn't work.

thank you in advance
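For reference, a driver API result of 201 is CUDA_ERROR_INVALID_CONTEXT, which usually means that no CUDA context is current on the calling thread. Below is a minimal, hedged sketch of the setup that has to happen on the same thread before cuMemAlloc/cuMemcpyHtoD are called; the class and variable names are illustrative and not taken from the code above:

import jcuda.Pointer;
import jcuda.Sizeof;
import jcuda.driver.CUcontext;
import jcuda.driver.CUdevice;
import jcuda.driver.CUdeviceptr;
import jcuda.driver.JCudaDriver;

import static jcuda.driver.JCudaDriver.*;

public class DriverSetupSketch
{
    public static void main(String[] args)
    {
        // Turn error codes into exceptions, so a 201 is reported immediately
        JCudaDriver.setExceptionsEnabled(true);

        // The context must be created (or made current) on the thread that
        // later calls cuMemAlloc / cuMemcpyHtoD / cuLaunchKernel
        cuInit(0);
        CUdevice device = new CUdevice();
        cuDeviceGet(device, 0);
        CUcontext context = new CUcontext();
        cuCtxCreate(context, 0, device);

        // With a current context, the allocation and copy behave as expected
        float input[] = new float[1000];
        CUdeviceptr deviceInput = new CUdeviceptr();
        cuMemAlloc(deviceInput, input.length * Sizeof.FLOAT);
        cuMemcpyHtoD(deviceInput, Pointer.to(input), input.length * Sizeof.FLOAT);

        cuMemFree(deviceInput);
        cuCtxDestroy(context);
    }
}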

jcuda library absent

Hello,
I created a project (NetBeans) and added all the JCuda dependencies to my pom.xml file. That's really great, thank you for this development.
However, all the libraries are found (jcublas, jcufft ...) except "jcuda". Because of that, I can not run my program (JCudaRuntime library not found).
Did I miss something?
Thank you for your response
Ben

Allow "null" to be passed to cuCtxSetCurrent

Currently,

cuCtxSetCurrent(null);

throws a NullPointerException, although it should be possible according to the documentation. This is not critical, as the desired behavior can be emulated by calling

cuCtxSetCurrent(new CUcontext());

instead, but should be fixed with the next release.

EDIT: The same would actually apply to other functions as well. It could be reasonable to simply allow cudaFree(null) and related functions to accept null pointers.

Jetson Support (CUDA 8)

We are trying to build JCuda on a Jetson TX1 running CUDA 8. Do you have any advice?

We were following building instructions from here.

And we had to make the changes from this commit as well, since it hasn't made its way to the version-0.8.0 tag yet.

The first step of compilation (compiling the *.so) works, but when compiling the jars we get errors during testing.

The error we get is:

jcuda.CudaException: Could not create ptx file

I've attached the full building / error messages.
JcudaMvnErrorPTX.txt

Thanks for your help!

HIP support

Sorry if this is not the best place to discuss this, but here goes:

AMD has recently pivoted in its GPGPU strategy. They have a new open-source software stack, ROCm, for their Fiji (R9 Fury) GPUs and future products, and have seemingly abandoned OpenCL.

https://radeonopencompute.github.io/

ROCm exposes a number of alternative APIs. They have the low-level ROCR (C host API and assembly kernel language) (sample), the high-level HC (C++ host API and C++ kernel language), and the CUDA-emulating HIP API (C host API and C++ kernel language).

Here is a comparison table of syntax between the various APIs: https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/blob/master/docs/markdown/hip_terms.md

The HIP API is particularly relevant for this project. It's basically a platform-independent CUDA Driver API that is compatible with both AMD's stack and NVIDIA's stack. This is potentially a very exciting development.

HIP page: https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP
HIP blog: http://gpuopen.com/tag/hip/

According to AMD:

  • HIP is very thin and has little or no performance impact over coding directly in CUDA or hcc "HC" mode.
  • HIP allows coding in a single-source C++ programming language including features such as templates, C++11 lambdas, classes, namespaces, and more.
  • HIP allows developers to use the "best" development environment and tools on each target platform.
  • The "hipify" tool automatically converts source from CUDA to HIP.
  • Developers can specialize for the platform (CUDA or hcc) to tune for performance or handle tricky cases

It would be very cool if there was perhaps a JHIP library for Java that would allow me to write HIP code. To be honest, I have not tried HIP yet and can't comment on how well it works in practice. However, I wanted to put it on your radar.

Note, I am NOT affiliated with AMD. I just like competition :)

on JDK17, performance slowdown significantly after a few days

Setup environment: Linux, JCuda 10.2, CUDA 11.4, JDK 17

The typical processing routine of our service includes:

  1. host to device copy
  2. launch kernel
  3. device to host copy
  4. cuStreamAddCallback
  5. trigger a host function to deal with the results already copied into the host buffer

All of this happens in streams, concurrently (a minimal sketch of this pattern is shown below).
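A minimal sketch of this stream pipeline with the JCuda driver API; module/kernel setup and cleanup are omitted, and the exact shape of JCuda's CUstreamCallback interface (a single call(stream, status, userData) method) is an assumption here:

import jcuda.Pointer;
import jcuda.Sizeof;
import jcuda.driver.CUdeviceptr;
import jcuda.driver.CUfunction;
import jcuda.driver.CUstream;
import jcuda.driver.CUstreamCallback;

import static jcuda.driver.JCudaDriver.*;

public class StreamCallbackSketch
{
    // Assumes cuInit, context creation and kernel loading happened elsewhere
    static void processAsync(CUfunction kernel, final float hostInput[], final float hostOutput[])
    {
        CUstream stream = new CUstream();
        cuStreamCreate(stream, 0);

        CUdeviceptr deviceData = new CUdeviceptr();
        cuMemAlloc(deviceData, hostInput.length * Sizeof.FLOAT);

        // 1. host to device copy
        cuMemcpyHtoDAsync(deviceData, Pointer.to(hostInput),
            hostInput.length * Sizeof.FLOAT, stream);

        // 2. launch kernel
        Pointer kernelParameters = Pointer.to(
            Pointer.to(deviceData),
            Pointer.to(new int[] { hostInput.length }));
        cuLaunchKernel(kernel,
            (hostInput.length + 255) / 256, 1, 1, // grid dimension
            256, 1, 1,                            // block dimension
            0, stream,                            // shared memory size and stream
            kernelParameters, null);

        // 3. device to host copy
        cuMemcpyDtoHAsync(Pointer.to(hostOutput), deviceData,
            hostOutput.length * Sizeof.FLOAT, stream);

        // 4. + 5. callback that processes the results in the host buffer
        // (NOTE: the call(...) signature of CUstreamCallback is assumed)
        cuStreamAddCallback(stream, new CUstreamCallback()
        {
            @Override
            public void call(CUstream hStream, int status, Object userData)
            {
                // deal with hostOutput here (runs on a driver thread)
            }
        }, null, 0);
    }
}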

We have run this service with JDK 8 without any issue, but when trying JDK 17, things get quite weird. One or two days after the service restarts, performance becomes very slow, maybe 10x slower.

We have done some profiling and found that this stack eats up quite a lot of CPU cycles:
cuStreamAddCallback_NativeCallback(CUstream_st*, cudaError_enum, void*)-->jni_DetachCurrentThread-->ObjectSynchronizer::release_monitors_owned_by_thread

Asking for help with this performance issue, thanks.

Linux build failure (JCuda 9)

I'm on a freshly updated Arch Linux. I installed CUDA 9.1.85_387.26 (the latest as of today) and relinked gcc to use gcc-6 (the last version that CUDA supports).

Problem 1: JCuda cmake script doesn't find libnvrtc. It is similar to #3, with the twist that libnvrtc is present in cuda's lib directory, along with the rest of its libraries.

Solution? 1: I edited CMakeCache.txt and replaced CUDA_nvrtc_LIBRARY-NOTFOUND with the actual location of libnvrtc.so (/usr/local/cuda/lib64/libnvrtc.so). After that, it appears that nvrtc is available, since I am able to continue with the build.

Problem 2: After successfully building jcuda-parent, I got the following when I ran mvn clean install in jcuda-main:

[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Build Order:
[INFO] 
[INFO] JCuda
[INFO] jcuda-common
[INFO] jcuda-natives
[INFO] jcuda
[INFO] jcublas-natives
[INFO] jcublas
[INFO] jcufft-natives
[INFO] jcufft
[INFO] jcurand-natives
[INFO] jcurand
[INFO] jcusparse-natives
[INFO] jcusparse
[INFO] jcusolver-natives
[INFO] jcusolver
[INFO] jnvgraph-natives
[INFO] jnvgraph
[INFO] jcuda-main
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] Building JCuda 0.9.0
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ jcuda-parent ---
[INFO] 
[INFO] >>> maven-source-plugin:2.1.2:jar (attach-sources) > generate-sources @ jcuda-parent >>>
[INFO] 
[INFO] <<< maven-source-plugin:2.1.2:jar (attach-sources) < generate-sources @ jcuda-parent <<<
[INFO] 
[INFO] 
[INFO] --- maven-source-plugin:2.1.2:jar (attach-sources) @ jcuda-parent ---
[INFO] 
[INFO] --- maven-javadoc-plugin:2.10.1:jar (attach-javadocs) @ jcuda-parent ---
[INFO] Not executing Javadoc as the project is not a Java classpath-capable package
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ jcuda-parent ---
[INFO] Installing /home/dragan/workspace/java/jcuda/jcuda-parent/pom.xml to /home/dragan/.m2/repository/org/jcuda/jcuda-parent/0.9.0/jcuda-parent-0.9.0.pom
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] Building jcuda-common 0.9.0
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ jcuda-common ---
[INFO] Deleting /home/dragan/workspace/java/jcuda/jcuda-common/target
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ jcuda-common ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/dragan/workspace/java/jcuda/jcuda-common/src/main/resources
[INFO] 
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ jcuda-common ---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ jcuda-common ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/dragan/workspace/java/jcuda/jcuda-common/src/test/resources
[INFO] 
[INFO] --- maven-compiler-plugin:2.3.2:testCompile (default-testCompile) @ jcuda-common ---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-surefire-plugin:2.12.4:test (default-test) @ jcuda-common ---
[INFO] No tests to run.
[INFO] 
[INFO] --- maven-jar-plugin:3.0.2:jar (default-jar) @ jcuda-common ---
[INFO] Building jar: /home/dragan/workspace/java/jcuda/jcuda-common/target/jcuda-common-0.9.0.jar
[INFO] 
[INFO] --- maven-jar-plugin:3.0.2:jar (create-native-sources-jar) @ jcuda-common ---
[INFO] Building jar: /home/dragan/workspace/java/jcuda/jcuda-common/target/jcuda-common-0.9.0-sources.jar
[INFO] 
[INFO] --- maven-jar-plugin:3.0.2:jar (create-native-javadoc-jar) @ jcuda-common ---
[INFO] Building jar: /home/dragan/workspace/java/jcuda/jcuda-common/target/jcuda-common-0.9.0-javadoc.jar
[INFO] 
[INFO] >>> maven-source-plugin:2.1.2:jar (attach-sources) > generate-sources @ jcuda-common >>>
[INFO] 
[INFO] <<< maven-source-plugin:2.1.2:jar (attach-sources) < generate-sources @ jcuda-common <<<
[INFO] 
[INFO] 
[INFO] --- maven-source-plugin:2.1.2:jar (attach-sources) @ jcuda-common ---
[INFO] No sources in project. Archive not created.
[INFO] 
[INFO] --- maven-javadoc-plugin:2.10.1:jar (attach-javadocs) @ jcuda-common ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ jcuda-common ---
[INFO] Installing /home/dragan/workspace/java/jcuda/jcuda-common/target/jcuda-common-0.9.0.jar to /home/dragan/.m2/repository/org/jcuda/jcuda-common/0.9.0/jcuda-common-0.9.0.jar
[INFO] Installing /home/dragan/workspace/java/jcuda/jcuda-common/pom.xml to /home/dragan/.m2/repository/org/jcuda/jcuda-common/0.9.0/jcuda-common-0.9.0.pom
[INFO] Installing /home/dragan/workspace/java/jcuda/jcuda-common/target/jcuda-common-0.9.0-sources.jar to /home/dragan/.m2/repository/org/jcuda/jcuda-common/0.9.0/jcuda-common-0.9.0-sources.jar
[INFO] Installing /home/dragan/workspace/java/jcuda/jcuda-common/target/jcuda-common-0.9.0-javadoc.jar to /home/dragan/.m2/repository/org/jcuda/jcuda-common/0.9.0/jcuda-common-0.9.0-javadoc.jar
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] Building jcuda-natives 0.9.0
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ jcuda-natives ---
[INFO] Deleting /home/dragan/workspace/java/jcuda/jcuda/target
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ jcuda-natives ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/dragan/workspace/java/jcuda/jcuda/src/main/resources
[INFO] 
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ jcuda-natives ---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ jcuda-natives ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/dragan/workspace/java/jcuda/jcuda/src/test/resources
[INFO] 
[INFO] --- maven-compiler-plugin:2.3.2:testCompile (default-testCompile) @ jcuda-natives ---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-surefire-plugin:2.12.4:test (default-test) @ jcuda-natives ---
[INFO] No tests to run.
[INFO] 
[INFO] --- maven-jar-plugin:3.0.2:jar (default-jar) @ jcuda-natives ---
[INFO] Building jar: /home/dragan/workspace/java/jcuda/jcuda/target/jcuda-natives-0.9.0.jar
[INFO] 
[INFO] --- maven-jar-plugin:3.0.2:jar (create-native-jar) @ jcuda-natives ---
[INFO] Building jar: /home/dragan/workspace/java/jcuda/jcuda/target/jcuda-natives-0.9.0-linux-x86_64.jar
[INFO] 
[INFO] --- maven-jar-plugin:3.0.2:jar (create-native-sources-jar) @ jcuda-natives ---
[INFO] Building jar: /home/dragan/workspace/java/jcuda/jcuda/target/jcuda-natives-0.9.0-sources.jar
[INFO] 
[INFO] --- maven-jar-plugin:3.0.2:jar (create-native-javadoc-jar) @ jcuda-natives ---
[INFO] Building jar: /home/dragan/workspace/java/jcuda/jcuda/target/jcuda-natives-0.9.0-javadoc.jar
[INFO] 
[INFO] >>> maven-source-plugin:2.1.2:jar (attach-sources) > generate-sources @ jcuda-natives >>>
[INFO] 
[INFO] <<< maven-source-plugin:2.1.2:jar (attach-sources) < generate-sources @ jcuda-natives <<<
[INFO] 
[INFO] 
[INFO] --- maven-source-plugin:2.1.2:jar (attach-sources) @ jcuda-natives ---
[INFO] No sources in project. Archive not created.
[INFO] 
[INFO] --- maven-javadoc-plugin:2.10.1:jar (attach-javadocs) @ jcuda-natives ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ jcuda-natives ---
[INFO] Installing /home/dragan/workspace/java/jcuda/jcuda/target/jcuda-natives-0.9.0.jar to /home/dragan/.m2/repository/org/jcuda/jcuda-natives/0.9.0/jcuda-natives-0.9.0.jar
[INFO] Installing /home/dragan/workspace/java/jcuda/jcuda/pom.xml to /home/dragan/.m2/repository/org/jcuda/jcuda-natives/0.9.0/jcuda-natives-0.9.0.pom
[INFO] Installing /home/dragan/workspace/java/jcuda/jcuda/target/jcuda-natives-0.9.0-linux-x86_64.jar to /home/dragan/.m2/repository/org/jcuda/jcuda-natives/0.9.0/jcuda-natives-0.9.0-linux-x86_64.jar
[INFO] Installing /home/dragan/workspace/java/jcuda/jcuda/target/jcuda-natives-0.9.0-sources.jar to /home/dragan/.m2/repository/org/jcuda/jcuda-natives/0.9.0/jcuda-natives-0.9.0-sources.jar
[INFO] Installing /home/dragan/workspace/java/jcuda/jcuda/target/jcuda-natives-0.9.0-javadoc.jar to /home/dragan/.m2/repository/org/jcuda/jcuda-natives/0.9.0/jcuda-natives-0.9.0-javadoc.jar
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] Building jcuda 0.9.0
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ jcuda ---
[INFO] Deleting /home/dragan/workspace/java/jcuda/jcuda/JCudaJava/target
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ jcuda ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/dragan/workspace/java/jcuda/jcuda/JCudaJava/src/main/resources
[INFO] 
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ jcuda ---
[INFO] Compiling 141 source files to /home/dragan/workspace/java/jcuda/jcuda/JCudaJava/target/classes
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ jcuda ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 2 resources
[INFO] 
[INFO] --- maven-compiler-plugin:2.3.2:testCompile (default-testCompile) @ jcuda ---
[INFO] Compiling 12 source files to /home/dragan/workspace/java/jcuda/jcuda/JCudaJava/target/test-classes
[INFO] 
[INFO] --- maven-surefire-plugin:2.12.4:test (default-test) @ jcuda ---
[INFO] Surefire report directory: /home/dragan/workspace/java/jcuda/jcuda/JCudaJava/target/surefire-reports

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running jcuda.test.JCudaDriverPrimaryContextTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.608 sec
Running jcuda.test.JCudaPointersToPointerTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.081 sec
Running jcuda.test.TestPointerToBuffer
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec
Running jcuda.test.JCudaBasicBindingTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.141 sec
Running jcuda.test.JCudaDriverTextureTest
Dec 15, 2017 3:41:49 PM jcuda.test.JCudaTestUtils invokeNvcc
SEVERE: nvcc process exitValue 1
Dec 15, 2017 3:41:49 PM jcuda.test.JCudaTestUtils invokeNvcc
SEVERE: errorMessage:
/usr/include/bits/floatn.h(61): error: invalid argument to attribute "__mode__"

/usr/include/bits/floatn.h(73): error: identifier "__float128" is undefined

2 errors detected in the compilation of "/tmp/tmpxft_00004459_00000000-6_JCudaDriverTextureTestKernels.cpp1.ii".

Dec 15, 2017 3:41:49 PM jcuda.test.JCudaTestUtils invokeNvcc
SEVERE: outputMessage:

Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.145 sec <<< FAILURE!
testTextures(jcuda.test.JCudaDriverTextureTest)  Time elapsed: 0.145 sec  <<< ERROR!
jcuda.CudaException: Could not create ptx file: /usr/include/bits/floatn.h(61): error: invalid argument to attribute "__mode__"

/usr/include/bits/floatn.h(73): error: identifier "__float128" is undefined

2 errors detected in the compilation of "/tmp/tmpxft_00004459_00000000-6_JCudaDriverTextureTestKernels.cpp1.ii".

	at jcuda.test.JCudaTestUtils.invokeNvcc(JCudaTestUtils.java:132)
	at jcuda.test.JCudaTestUtils.preparePtxFile(JCudaTestUtils.java:39)
	at jcuda.test.JCudaDriverTextureTest.testTextures(JCudaDriverTextureTest.java:114)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
	at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
	at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)

Running jcuda.test.TestPointerGetByteBuffer
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 sec
Running jcuda.test.JCudaMemcpy3DTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.006 sec
Running jcuda.test.JCudaKernelParamsTest
Dec 15, 2017 3:41:49 PM jcuda.test.JCudaTestUtils invokeNvcc
SEVERE: nvcc process exitValue 1
Dec 15, 2017 3:41:49 PM jcuda.test.JCudaTestUtils invokeNvcc
SEVERE: errorMessage:
/usr/include/bits/floatn.h(61): error: invalid argument to attribute "__mode__"

/usr/include/bits/floatn.h(73): error: identifier "__float128" is undefined

2 errors detected in the compilation of "/tmp/tmpxft_00004463_00000000-6_JCudaKernelParamsTestKernel.cpp1.ii".

Dec 15, 2017 3:41:49 PM jcuda.test.JCudaTestUtils invokeNvcc
SEVERE: outputMessage:

Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.174 sec <<< FAILURE!
testKernelParams(jcuda.test.JCudaKernelParamsTest)  Time elapsed: 0.174 sec  <<< ERROR!
jcuda.CudaException: Could not create ptx file: /usr/include/bits/floatn.h(61): error: invalid argument to attribute "__mode__"

/usr/include/bits/floatn.h(73): error: identifier "__float128" is undefined

2 errors detected in the compilation of "/tmp/tmpxft_00004463_00000000-6_JCudaKernelParamsTestKernel.cpp1.ii".

	at jcuda.test.JCudaTestUtils.invokeNvcc(JCudaTestUtils.java:132)
	at jcuda.test.JCudaTestUtils.preparePtxFile(JCudaTestUtils.java:39)
	at jcuda.test.JCudaAbstractKernelTest.initialize(JCudaAbstractKernelTest.java:51)
	at jcuda.test.JCudaKernelParamsTest.testKernelParams(JCudaKernelParamsTest.java:60)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
	at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
	at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)

Running jcuda.test.JCudaDriverMemRangeTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.155 sec

Results :

Tests in error: 
  testTextures(jcuda.test.JCudaDriverTextureTest): Could not create ptx file: /usr/include/bits/floatn.h(61): error: invalid argument to attribute "__mode__"(..)
  testKernelParams(jcuda.test.JCudaKernelParamsTest): Could not create ptx file: /usr/include/bits/floatn.h(61): error: invalid argument to attribute "__mode__"(..)

Tests run: 20, Failures: 0, Errors: 2, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] JCuda .............................................. SUCCESS [  0.542 s]
[INFO] jcuda-common ....................................... SUCCESS [  0.344 s]
[INFO] jcuda-natives ...................................... SUCCESS [  0.079 s]
[INFO] jcuda .............................................. FAILURE [  2.775 s]
[INFO] jcublas-natives .................................... SKIPPED
[INFO] jcublas ............................................ SKIPPED
[INFO] jcufft-natives ..................................... SKIPPED
[INFO] jcufft ............................................. SKIPPED
[INFO] jcurand-natives .................................... SKIPPED
[INFO] jcurand ............................................ SKIPPED
[INFO] jcusparse-natives .................................. SKIPPED
[INFO] jcusparse .......................................... SKIPPED
[INFO] jcusolver-natives .................................. SKIPPED
[INFO] jcusolver .......................................... SKIPPED
[INFO] jnvgraph-natives ................................... SKIPPED
[INFO] jnvgraph ........................................... SKIPPED
[INFO] jcuda-main ......................................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 3.857 s
[INFO] Finished at: 2017-12-15T15:41:49+01:00
[INFO] Final Memory: 27M/317M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12.4:test (default-test) on project jcuda: There are test failures.
[ERROR] 
[ERROR] Please refer to /home/dragan/workspace/java/jcuda/jcuda/JCudaJava/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :jcuda

French locale issue with nvrtc

While using JCuda on Linux and compiling CUDA code with nvrtc under a French locale, every static floating-point value is "cut" after the dot:
2.2e-10 becomes 2.0 and 3.14 becomes 3.0.
Setting LC_ALL=C in the environment bypasses this issue.

Consider implementing equals and hashCode for value classes

There are several classes in JCuda that can be considered as "value classes": Plain old data structures that are direct ports of the structs in C, using only public fields.

In #6 it was proposed to add equals and hashCode implementations to (one of) these classes.

While it might be appropriate and in line with certain usage patterns in Java, it might distort the semantics in other cases. Some of these value classes contain Pointer objects, which may make an appropriate implementation of equals and hashCode difficult and involve subtle caveats.

For now, the trade-off between usefulness and possible caveats for me seems to be in favor of not implementing these methods. One could also say that the C structs are not C++ classes and do not overload the == operator - and even less are considered to be used as keys in an unordered_map. However, I'll leave this issue here open for comments, until a final decision is made.
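For illustration only, a sketch of what such an implementation might look like for a simple value class with only public primitive fields (modeled after a dim3-like struct with public x, y, z fields); this is not part of JCuda, and the difficult cases are exactly the classes that additionally contain Pointer fields:

import java.util.Objects;

// Hypothetical value class, analogous to JCuda's plain-old-data structs
public class MyDim3
{
    public int x = 1;
    public int y = 1;
    public int z = 1;

    @Override
    public boolean equals(Object object)
    {
        if (this == object)
        {
            return true;
        }
        if (object == null || getClass() != object.getClass())
        {
            return false;
        }
        MyDim3 other = (MyDim3) object;
        return x == other.x && y == other.y && z == other.z;
    }

    @Override
    public int hashCode()
    {
        return Objects.hash(x, y, z);
    }
}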

CUDA 9 support?

@gpu

Hi Marco,

Since CUDA 9 has been released, do you have any (approximate and non-binding, of course) schedule for supporting it in JCuda and related libraries?

Thanks for creating JCuda in the first place :)

Coding errors in JCudaMemcpy3DTest.java

I think the following applies:

--- JCudaMemcpy3DTest.java.old  2019-01-06 11:20:49.604423147 +0100
+++ JCudaMemcpy3DTest.java  2019-01-06 11:25:11.583113323 +0100
@@ -59,7 +59,7 @@
         ByteBuffer hostOutputData = 
             ByteBuffer.allocate(sizeFloats * Sizeof.FLOAT);
         FloatBuffer hostOutputBuffer = 
-            hostInputData.order(ByteOrder.nativeOrder()).asFloatBuffer();
+            hostOutputData.order(ByteOrder.nativeOrder()).asFloatBuffer();
         
         // Run the 3D memory copy
         copy(extentFloats, 
@@ -114,7 +114,7 @@
         dtoh.dstPtr.pitch  = extentFloats.width * Sizeof.FLOAT;
         dtoh.dstPtr.xsize  = extentFloats.width;
         dtoh.dstPtr.ysize  = extentFloats.height;
-        htod.extent.width  = extentFloats.width * Sizeof.FLOAT;
+        dtoh.extent.width  = extentFloats.width * Sizeof.FLOAT;
         dtoh.extent.height = extentFloats.height;
         dtoh.extent.depth  = extentFloats.depth;
         dtoh.kind          = cudaMemcpyDeviceToHost;

Extend and update getByteBuffer method

The current implementation of Pointer#getByteBuffer expects a byteOffset and a byteLength parameter. There should probably be a parameterless convenience method that returns a slice of the internal byte buffer with position=0 and limit=capacity(), representing the full buffer.

Additionally, the getByteBuffer method currently returns a byte buffer with unspecified byte order. (By default, it will be BIG_ENDIAN, but this is not specified). This might be changed so that it will always return the byte buffer in native byte order (which will usually be LITTLE_ENDIAN).

The problem here is that this would be a change that is not backward compatible. It might, however, be justified to change it and let it be the native byte order from now on, considering that the byte order was not specified until now, and everybody who used the byte buffer without explicitly setting the byte order relied on unspecified behavior anyhow.

If this is changed, it will have to be pointed out prominently in the change logs.
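As a small, self-contained illustration (plain java.nio, no JCuda involved) of why the unspecified byte order matters: a newly created ByteBuffer defaults to BIG_ENDIAN regardless of the platform, so on a typical little-endian machine the same bytes yield different values before and after switching to the native order.

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ByteOrderSketch
{
    public static void main(String[] args)
    {
        ByteBuffer bb = ByteBuffer.allocateDirect(4);

        // The default order is BIG_ENDIAN, regardless of the platform
        System.out.println(bb.order());

        // Write 1.0f in the default (big-endian) order ...
        bb.putFloat(0, 1.0f);
        float bigEndianView = bb.getFloat(0);

        // ... and re-read the same four bytes in native order
        float nativeView = bb.order(ByteOrder.nativeOrder()).getFloat(0);

        // On a little-endian machine these differ: 1.0 vs. about 4.6006E-41
        System.out.println(bigEndianView);
        System.out.println(nativeView);
    }
}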

JCuda 12.0.0 not working?

I am trying to use JCuda 12.0.0 from https://repo.maven.apache.org/maven2/org/jcuda/jcuda/ but it does not seem to work. I created a directory with JCudaRuntimeTest.java and the .jar files for jcuda and jcuda-natives, but I got the following errors:

javac -cp ".;jcuda-12.0.0.jar;jcuda-natives-12.0.0-linux-x86_64.jar" JCudaRuntimeTest.java
JCudaRuntimeTest.java:1: error: package jcuda does not exist
import jcuda.*;
^
JCudaRuntimeTest.java:2: error: package jcuda.runtime does not exist
import jcuda.runtime.*;
^
JCudaRuntimeTest.java:7: error: cannot find symbol
        Pointer pointer = new Pointer();
        ^
  symbol:   class Pointer
  location: class JCudaRuntimeTest
JCudaRuntimeTest.java:7: error: cannot find symbol
        Pointer pointer = new Pointer();
                              ^
  symbol:   class Pointer
  location: class JCudaRuntimeTest
JCudaRuntimeTest.java:8: error: cannot find symbol
        JCuda.cudaMalloc(pointer, 4);
        ^
  symbol:   variable JCuda
  location: class JCudaRuntimeTest
JCudaRuntimeTest.java:10: error: cannot find symbol
        JCuda.cudaFree(pointer);
        ^
  symbol:   variable JCuda
  location: class JCudaRuntimeTest
6 errors

When I use mvn clean install to build, I get the following:

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running jcuda.jcudnn.JCudnnBasicBindingTest
/usr/lib/jvm/java-11-openjdk-amd64/bin/java: symbol lookup error: /tmp/libJCudnn-12.0.0-linux-x86_64.so: undefined symbol: cudnnActivationBackward

Results :

Tests run: 0, Failures: 0, Errors: 0, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for jcuda-main 12.0.0:
[INFO] 
[INFO] JCuda .............................................. SUCCESS [  0.513 s]
[INFO] jcuda-common ....................................... SUCCESS [  0.327 s]
[INFO] jcuda-natives ...................................... SUCCESS [  0.910 s]
[INFO] jcuda .............................................. SUCCESS [  5.338 s]
[INFO] jcublas-natives .................................... SUCCESS [  0.082 s]
[INFO] jcublas ............................................ SUCCESS [  3.896 s]
[INFO] jcufft-natives ..................................... SUCCESS [  0.024 s]
[INFO] jcufft ............................................. SUCCESS [  1.253 s]
[INFO] jcurand-natives .................................... SUCCESS [  0.031 s]
[INFO] jcurand ............................................ SUCCESS [  1.260 s]
[INFO] jcusparse-natives .................................. SUCCESS [  0.085 s]
[INFO] jcusparse .......................................... SUCCESS [  2.021 s]
[INFO] jcusolver-natives .................................. SUCCESS [  0.073 s]
[INFO] jcusolver .......................................... SUCCESS [  1.922 s]
[INFO] jcudnn-natives ..................................... SUCCESS [  0.039 s]
[INFO] jcudnn ............................................. FAILURE [  0.284 s]
[INFO] jcuda-main ......................................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  18.174 s
[INFO] Finished at: 2023-09-04T16:38:33+08:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12.4:test (default-test) on project jcudnn: Execution default-test of goal org.apache.maven.plugins:maven-surefire-plugin:2.12.4:test failed: The forked VM terminated without saying properly goodbye. VM crash or System.exit called ? -> [Help 1]

Is there something on my end that I am doing wrong?

CUDA_ERROR_ILLEGAL_ADDRESS

After the last problem (https://github.com/jcuda/jcuda/issues/20) was fixed, I started optimizing the code and another problem appeared:

Exception in thread "main" jcuda.CudaException: CUDA_ERROR_ILLEGAL_ADDRESS
	at jcuda.driver.JCudaDriver.checkResult(JCudaDriver.java:337)
	at jcuda.driver.JCudaDriver.cuMemFree(JCudaDriver.java:4022)
	at GPUClasses.GPUAI.layer(GPUAI.java:95)
	at GPUClasses.GPUAI.run(GPUAI.java:70)
	at AI_Engine.AIEngine.runEngine(AIEngine.java:122)
	at Main.MainGameLoop.run(MainGameLoop.java:68)
	at Main.MainGameLoop.main(MainGameLoop.java:48)

The strange thing is also that it doesn't occur on the first few "runs"; it only comes after exactly 1525970000000 milliseconds, which is about 25,000 calls of run() (see the code below).
It also only happens if it needs to calculate with an array of about length 200.

To explain my changes, I first need to tell you what the program actually should do.
If you're familiar with neural networks, you will know that they have layers, which contain weights and biases.
You pass an input to the network and it gets passed through every layer (input*weights+biases), and the weights and biases don't change most of the time.
So I made those two permanent on the GPU, and they only get exchanged when they change. I also made it so that every output (except the last one) stays on the GPU, so the next layer can use it directly.

Here is the important code (exception lines are marked with //ERROR and throw the same exception):

        private CUdeviceptr[] weightPointers;
	private CUdeviceptr[] baisesPointers;
	private CUdeviceptr temp;//the output of last layer

	public GPUAI(int Generationsteps, int[] layout) {

		weightPointers = new CUdeviceptr[weights.length];
		for (int i = 0; i < weights.length; i++) {
			weightPointers[i] = new CUdeviceptr();
			cuMemAlloc(weightPointers[i], weights[i].length * Sizeof.FLOAT);
			cuMemcpyHtoD(weightPointers[i], Pointer.to(weights[i]), weights[i].length * Sizeof.FLOAT);
		}

		baisesPointers = new CUdeviceptr[baises.length];
		for (int i = 0; i < baises.length; i++) {
			baisesPointers[i] = new CUdeviceptr();
			cuMemAlloc(baisesPointers[i], baises[i].length * Sizeof.FLOAT);
			cuMemcpyHtoD(baisesPointers[i], Pointer.to(baises[i]), baises[i].length * Sizeof.FLOAT);
		}
	}

	public void run() {
		temp = new CUdeviceptr();
		cuMemAlloc(temp, Inputs.length * Sizeof.FLOAT);
		cuMemcpyHtoD(temp, Pointer.to(Inputs), Inputs.length * Sizeof.FLOAT);

		int last_layer = Inputs.length;
		for (int i = 0; i < weights.length-1; i++) {

			layer(i, last_layer);
			last_layer = baises[i].length;
		}
		Outputs = final_layer(weights.length-1, last_layer);
	}

	void layer(int layer, int inputsize) {
		// Allocate device output memory
		CUdeviceptr deviceOutput = new CUdeviceptr();
		cuMemAlloc(deviceOutput, baises.length * Sizeof.FLOAT);
		// Set up the kernel parameters: A pointer to an array
		// of pointers which point to the actual values.
		Pointer kernelParameters = Pointer.to(
                                Pointer.to(new int[] { inputsize }),
				Pointer.to(new int[] { baises[layer].length }),
                                Pointer.to(temp), 
                                Pointer.to(weightPointers[layer]),
				Pointer.to(baisesPointers[layer]), 
                                Pointer.to(deviceOutput));

		// Call the kernel function.
		int blockSizeX = 1000;
		int gridSizeX = (int) Math.ceil((double) baises.length / blockSizeX);
		cuLaunchKernel(AIEngine.LayerFunction, 
                                gridSizeX, 1, 1, // Grid dimension
				blockSizeX, 1, 1, // Block dimension
				0, null, // Shared memory size and stream
				kernelParameters, null // Kernel- and extra parameters
		);

                cuMemFree(temp); //ERROR
		temp = deviceOutput;

	}

	float[] final_layer(int layer, int inputsize) {
		// Allocate device output memory
		CUdeviceptr deviceOutput = new CUdeviceptr();
		cuMemAlloc(deviceOutput, baises.length * Sizeof.FLOAT);
		// Set up the kernel parameters: A pointer to an array
		// of pointers which point to the actual values.
		Pointer kernelParameters = Pointer.to(
                                Pointer.to(new int[] { inputsize }),
				Pointer.to(new int[] { baises[layer].length }), 
                                Pointer.to(temp), 
                                Pointer.to(weightPointers[layer]),
				Pointer.to(baisesPointers[layer]), 
                                Pointer.to(deviceOutput));

		// Call the kernel function.
		int blockSizeX = 1024;
		int gridSizeX = (int) Math.ceil((double) baises.length / blockSizeX);
		cuLaunchKernel(AIEngine.LayerFunction, 
                                gridSizeX, 1, 1, // Grid dimension
				blockSizeX, 1, 1, // Block dimension
				0, null, // Shared memory size and stream
				kernelParameters, null // Kernel- and extra parameters
		);

		

		float hostOutput[] = new float[baises.length];
		cuMemcpyDtoH(Pointer.to(hostOutput), deviceOutput, baises.length * Sizeof.FLOAT); //ERROR
		
		cuMemFree(temp); //ERROR
		
		return hostOutput;
	}

	void setWeights(float[][] weights) {
		for (int i = 0; i < weightPointers.length; i++) {
			cuMemFree(weightPointers[i]);
		}
		weightPointers = new CUdeviceptr[weights.length];
		for (int i = 0; i < weights.length; i++) {
			weightPointers[i] = new CUdeviceptr();
			cuMemAlloc(weightPointers[i], weights[i].length * Sizeof.FLOAT);
			cuMemcpyHtoD(weightPointers[i], Pointer.to(weights[i]), weights[i].length * Sizeof.FLOAT);
		}
	}

	void setBaises(float[][] baises) {
		for (int i = 0; i < baisesPointers.length; i++) {
			cuMemFree(baisesPointers[i]);
		}
		baisesPointers = new CUdeviceptr[baises.length];
		for (int i = 0; i < baises.length; i++) {
			baisesPointers[i] = new CUdeviceptr();
			cuMemAlloc(baisesPointers[i], baises[i].length * Sizeof.FLOAT);
			cuMemcpyHtoD(baisesPointers[i], Pointer.to(baises[i]), baises[i].length * Sizeof.FLOAT);
		}
	}

Bump to latest version for Apache Maven JAR Plugin

Hello team,

Thank you for this project.

As of today, the plugin version is:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-jar-plugin</artifactId>
    <version>3.0.2</version>

Could you please bump it to:

<!-- https://mvnrepository.com/artifact/org.apache.maven.plugins/maven-jar-plugin -->
<dependency>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-jar-plugin</artifactId>
    <version>3.4.1</version>
</dependency>

This latest version fixes some vulnerabilities.
Would be great to have this one-line change.

Good day!

Launching JCuda

Hello, I'm having a big issue launching a Java program that uses JCuda. I did the installation, all paths are set correctly, and the JCuda test program compiles without problems, but when I execute it I get this:

Exception in thread "main" java.lang.UnsatisfiedLinkError: Could not load the native library.
Error while loading native library "JCudaRuntime-linux-x86_64" with base name "JCudaRuntime"
Operating system name: Linux
Architecture : amd64
Architecture bit size: 64
Stack trace from the attempt to load the library as a resource:
java.lang.NullPointerException: No resource found with name '/lib/libJCudaRuntime-linux-x86_64.so'
at jcuda.LibUtils.loadLibraryResource(LibUtils.java:149)
at jcuda.LibUtils.loadLibrary(LibUtils.java:83)
at jcuda.runtime.JCuda.initialize(JCuda.java:380)
at jcuda.runtime.JCuda.(JCuda.java:367)
at JCudaRuntimeTest.main(JCudaRuntimeTest.java:9)
Stack trace from the attempt to load the library as a file:
java.lang.UnsatisfiedLinkError: /home/zournani/JCuda/libJCudaRuntime-linux-x86_64.so: /lib/libc.so.6: version `GLIBC_2.14' not found (required by /home/zournani/JCuda/libJCudaRuntime-linux-x86_64.so)
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1929)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1847)
at java.lang.Runtime.loadLibrary0(Runtime.java:870)
at java.lang.System.loadLibrary(System.java:1119)
at jcuda.LibUtils.loadLibrary(LibUtils.java:94)
at jcuda.runtime.JCuda.initialize(JCuda.java:380)
at jcuda.runtime.JCuda.(JCuda.java:367)
at JCudaRuntimeTest.main(JCudaRuntimeTest.java:9)

at jcuda.LibUtils.loadLibrary(LibUtils.java:128)
at jcuda.runtime.JCuda.initialize(JCuda.java:380)
at jcuda.runtime.JCuda.<clinit>(JCuda.java:367)
at JCudaRuntimeTest.main(JCudaRuntimeTest.java:9)

I have been stuck on this problem for a week, please help.

Cuda Data types definition - there is no unsigned integer in Java!

The cudaDataType enumeration contains definitions for CUDA_R_32U and CUDA_C_32U. Java has no unsigned integer type and never had one: all integers in Java are signed. Bitwise arithmetic on int operates on the raw bits and simply ignores the sign, and starting with Java 8 the Integer class provides an API for unsigned integer arithmetic (see the sketch below). JNI, for its part, has only the jint type, which can be cast to a C unsigned long with certain precautions.
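For reference, a minimal sketch of the Java 8 unsigned helpers mentioned above; the value 0xFFFFFFF0 is just an illustrative bit pattern:

public class UnsignedIntSketch
{
    public static void main(String[] args)
    {
        // The same 32 bits seen as a signed int are -16
        int raw = 0xFFFFFFF0;
        System.out.println(Integer.toUnsignedLong(raw));      // 4294967280, widened into a long
        System.out.println(Integer.toUnsignedString(raw));    // "4294967280"
        System.out.println(Integer.divideUnsigned(raw, 3));   // unsigned division on the raw bits
        System.out.println(Integer.compareUnsigned(raw, 7));  // positive: 0xFFFFFFF0 > 7 as unsigned
    }
}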
Similarly, it is impossible to pass an **int from Java to native code, but there is no need to, because JNI can return an Integer object using

jclass intClass = (*env)->FindClass(env, "java/lang/Integer");
if (intClass == NULL) {
    return NULL;
}
/* Look up the Integer(int) constructor; NewObject needs a constructor ID, not intValue(). */
jmethodID init = (*env)->GetMethodID(env, intClass, "<init>", "(I)V");
if (init == NULL) {
    return NULL;
}
jobject rc_obj = (*env)->NewObject(env, intClass, init, rc);
if (rc_obj == NULL) {
    return NULL;
}

return rc_obj;

Copying to constant memory in CUDA 8?

Hello,

Is it currently possible to copy from the host to device constant memory?
I am currently using CUDA 8.0.

When I attempted to do so, I tried using

  1. JCuda.cudaMemcpyToSymbol
  2. JCuda.cudaGetSymbolAddress, then jcuda.runtime.cuMemcpyHtoD.

Both JCuda.* functions threw a java.lang.UnsupportedOperationException, saying "This function is no longer supported as of CUDA 5.0". I see on jcuda.org that these functions are not supported.

In any case, is there a way around this, so that it is possible to copy data to device constant memory?

Thanks in advance.
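For what it is worth, a workaround that works through the driver API, shown as a minimal sketch: load the kernels as a module, look the __constant__ symbol up by name with cuModuleGetGlobal, then copy to it with cuMemcpyHtoD. The file name kernels.ptx, the symbol name constData and its size are illustrative assumptions; the symbol must be visible under that name in the module (e.g. not C++-mangled).

import jcuda.Pointer;
import jcuda.Sizeof;
import jcuda.driver.CUcontext;
import jcuda.driver.CUdevice;
import jcuda.driver.CUdeviceptr;
import jcuda.driver.CUmodule;
import static jcuda.driver.JCudaDriver.*;

public class ConstantMemorySketch
{
    public static void main(String[] args)
    {
        setExceptionsEnabled(true);
        cuInit(0);
        CUdevice device = new CUdevice();
        cuDeviceGet(device, 0);
        CUcontext context = new CUcontext();
        cuCtxCreate(context, 0, device);

        // Assumed module containing: __constant__ float constData[256];
        CUmodule module = new CUmodule();
        cuModuleLoad(module, "kernels.ptx");

        // Resolve the device address (and size) of the __constant__ symbol by name
        CUdeviceptr constPtr = new CUdeviceptr();
        long[] constSize = new long[1];
        cuModuleGetGlobal(constPtr, constSize, module, "constData");

        // Copy host data into constant memory via the resolved pointer
        float[] hostData = new float[256];
        cuMemcpyHtoD(constPtr, Pointer.to(hostData), hostData.length * Sizeof.FLOAT);
    }
}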

Maven install libraries issue - pom file is never installed

Either the maven-install plugin is not running, or the generate-pom parameter is missing. Indeed, when I run maven install on the local Maven projects (jcuda, jcublas, etc.), only the jar files are installed in the local Maven repository; the pom is always missing. It looks like the generatePom parameter should be added explicitly:

mvn org.apache.maven.plugins:maven-install-plugin:3.0.0-M1:install-file -Dfile=path-to-your-artifact-jar \
    -DgroupId=your.groupId \
    -DartifactId=your-artifactId \
    -Dversion=version \
    -Dpackaging=jar \
    -DgeneratePom=true

pthread_create not found

After building the native libraries, I cannot find the binary files in nativeLibraries.
I saw some errors in CMakeError.log:

collect2: error: ld returned 1 exit status
CMakeFiles/cmTC_c11c9.dir/build.make:97: recipe for target 'cmTC_c11c9' failed
make[1]: *** [cmTC_c11c9] Error 1
make[1]: Leaving directory '/home/nvidia/workspace/JCuda.build/CMakeFiles/CMakeTmp'
Makefile:126: recipe for target 'cmTC_c11c9/fast' failed
make: *** [cmTC_c11c9/fast] Error 2

Determining if the function pthread_create exists in the pthreads failed with the following output:
Change Dir: /home/nvidia/workspace/JCuda.build/CMakeFiles/CMakeTmp
Run Build Command:"/usr/bin/make" "cmTC_2d5dd/fast"
/usr/bin/make -f CMakeFiles/cmTC_2d5dd.dir/build.make CMakeFiles/cmTC_2d5dd.dir/build
make[1]: Entering directory '/home/nvidia/workspace/JCuda.build/CMakeFiles/CMakeTmp'
Building C object CMakeFiles/cmTC_2d5dd.dir/CheckFunctionExists.c.o
/usr/bin/cc -fPIC -DCHECK_FUNCTION_EXISTS=pthread_create -o CMakeFiles/cmTC_2d5dd.dir/CheckFunctionExists.c.o -c /usr/share/cmake-3.5/Modules/CheckFunctionExists.c
Linking C executable cmTC_2d5dd
/usr/bin/cmake -E cmake_link_script CMakeFiles/cmTC_2d5dd.dir/link.txt --verbose=1
/usr/bin/cc -fPIC -DCHECK_FUNCTION_EXISTS=pthread_create CMakeFiles/cmTC_2d5dd.dir/CheckFunctionExists.c.o -o cmTC_2d5dd -rdynamic -lpthreads
/usr/bin/ld: cannot find -lpthreads
collect2: error: ld returned 1 exit status
CMakeFiles/cmTC_2d5dd.dir/build.make:97: recipe for target 'cmTC_2d5dd' failed
make[1]: *** [cmTC_2d5dd] Error 1
make[1]: Leaving directory '/home/nvidia/workspace/JCuda.build/CMakeFiles/CMakeTmp'
Makefile:126: recipe for target 'cmTC_2d5dd/fast' failed
make: *** [cmTC_2d5dd/fast] Error 2

I cannot find any similar question about this. Is it an error in CMake itself, or do I need to modify something in the CMakeLists?

What's the difference between JCudaDriver.cuMemAlloc and JCuda.cudaMalloc?

Thanks for your awesome work, and I have a question.

In JCudaDriver.cuMemAlloc, the first parameter is an instance of CUdeviceptr, while in JCuda.cudaMalloc the first parameter is an instance of Pointer, and CUdeviceptr is a subclass of Pointer. What is the difference between these two methods, and is there a rule for deciding which one to use in which situation? Thank you.
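A minimal sketch contrasting the two allocation paths (the size 1024 and the setup order are illustrative, not a statement of project conventions): cuMemAlloc is the driver-API call, works on a CUdeviceptr, and requires an explicitly created context, which is the style typically used when launching your own kernels with cuLaunchKernel; cudaMalloc is the runtime-API call, fills a plain Pointer, and creates its context implicitly, which is the style expected by the runtime-based libraries such as JCublas. Since CUdeviceptr extends Pointer, the resulting pointers can generally be mixed once a context exists.

import jcuda.Pointer;
import jcuda.Sizeof;
import jcuda.driver.CUcontext;
import jcuda.driver.CUdevice;
import jcuda.driver.CUdeviceptr;
import jcuda.runtime.JCuda;
import static jcuda.driver.JCudaDriver.*;

public class AllocSketch
{
    public static void main(String[] args)
    {
        // Runtime API: cudaMalloc fills a plain Pointer; the context is created implicitly
        Pointer runtimePtr = new Pointer();
        JCuda.cudaMalloc(runtimePtr, 1024 * Sizeof.FLOAT);
        JCuda.cudaFree(runtimePtr);

        // Driver API: explicit initialization and context creation, then cuMemAlloc
        cuInit(0);
        CUdevice device = new CUdevice();
        cuDeviceGet(device, 0);
        CUcontext context = new CUcontext();
        cuCtxCreate(context, 0, device);

        CUdeviceptr driverPtr = new CUdeviceptr();
        cuMemAlloc(driverPtr, 1024 * Sizeof.FLOAT);
        cuMemFree(driverPtr);
    }
}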
