I am on Win7 Ultimate x64 SP2, trying to run some JCuda examples (11.2.0) in NetBeans,

JCuda JNvrtc-11.2.0-windows-x86_64.dll dependencies,about jcuda/jcuda

Comments (22)

jcuda commented on June 26, 2024

In general, JCuda version X.Y.Z should always work with CUDA X.Y.Z (where the "Z" part should not even matter).

Specifically, JCuda 11.2.0 should work with CUDA 11.2.

that shows me that the dependencies of the file JNvrtc-11.2.0-windows-x86_64.dll are just two other DLLs, namely nvrtc64_102_0.dll from CUDA Toolkit 10.2, and the system Kernel32.dll.

I just downloaded the jcuda-natives-11.2.0-windows-x86_64.jar from https://repo1.maven.org/maven2/org/jcuda/jcuda-natives/11.2.0/ , extracted the JNvrtc-11.2.0-windows-x86_64.dll, and running

dumpbin /DEPENDENTS JNvrtc-11.2.0-windows-x86_64.dll

for me prints

Microsoft (R) COFF/PE Dumper Version 14.27.29112.0
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file JNvrtc-11.2.0-windows-x86_64.dll

File Type: DLL

  Image has the following dependencies:

    nvrtc64_112_0.dll
    KERNEL32.dll

  Summary

        2000 .data
        2000 .pdata
        C000 .rdata
        1000 .reloc
        1000 .rsrc
       12000 .text
        1000 _RDATA

showing that it only depends on the (correct) nvrtc64_112_0.dll.

What does this print for you?

(The same is the case when using the dependency walker:

which makes me wonder where you got that file with a different dependency...)

from jcuda.

fafayqa commented on June 26, 2024

Eh, do we understand each other here?
As you just repeated exactly what I am asking: I wrote that I found out it needs exactly those two other DLLs, which I already have in the system, yet it still shows that error...

And if you checked the StackOverflow link I gave, you would also see it in visual form in the images (screenshots of my NetBeans project showing all the JARs and DLLs in the project), so what should I answer you now? :-()

I even tried placing that CUDA Toolkit DLL side by side with the JCuda DLLs in the project, yet still the same error. Tho I did not place Kernel32.dll in the project, as I think NetBeans can find Windows system DLLs, right? :-)

Anyway, tomorrow I will download the files from Maven and let you know (midnight here - Central Europe).

BTW how did you include the image in your post? I do not see any option to place an image in the post.

jcuda commented on June 26, 2024

The main points are:

JNvrtc-11.2.0-windows-x86_64.dll should not depend on nvrtc64_102_0.dll, but on nvrtc64_112_0.dll (and I just checked that the one from Maven does depend on the latter)
You cannot use JCuda 11.2.0 with CUDA 10.2

(Other details are: You should not have to unpack the JCuda DLL files, you should not have to put them into a lib folder, you should not have to set the java.library.path, and you should not have to modify your PATH environment variable (because that's usually done by the NVIDIA CUDA installer) - but that's all not relevant when there is a wrong DLL floating around...)

I'm also in Central Europe, by the way (2:45am here).

You can attach files by dragging & dropping, selecting or pasting them. Like this image here:

fafayqa commented on June 26, 2024

Ahhh, so there we have (ehm, I have) quite a lot of "problems" - firstly, I do not know why I was that blind not seeing it is nvrtc64_112_0.dll, not nvrtc64_102_0.dll - thanks for pointing that out for me!

Tho in that case I am afraid I cannot test JCuda v11+, as the CUDA Toolkit can be installed on a Win7 machine only up to version 10 (well, at least according to their download webpage, unless I want to go and try to install the version for Win10, which I am not sure I should). So can you tell me which version of JCuda should I be aiming for with the CUDA Toolkit 10.2, please - perhaps JCuda v10.2?

"You should not have to unpack the JCuda DLL files, you should not have to put them into a lib folder" - so does this mean those DLLs are unpacked from the JAR automatically when they are needed by the respective JCuda classes? (Actually I was never sure about this when dealing with native DLLs inside JARs, like do I have to unpack them or not. You see: until now I have never had a JAVA project where I would have to deal with these, so I have no practice in correctly using DLLs in JAVA projects.)

"...you should not have to set the java.library.path" - now, as you say I do not need to unpack those DLLs from the JAR, it does not need to be there, BUT in case unpacking were needed I would have to set it, else NetBeans would not know where to find those dyn libs.

"...and you should not have to modify your PATH environment variable (because that's usually done by the NVIDIA CUDA installer)" - unfortunately, no - on my Win7 Ult x64 it was not set automatically by the CUDA Toolkit installer at all: I checked, I had to do it myself manually (I also knew it was not set cos there were errors exactly for that, claiming it cannot find nvcc, or something like that, which is in the CUDA Toolkit folder).

So, I will download the files from the Maven link you gave + wait for your answer on which JCuda version is suitable for the CUDA Toolkit 10.2 (as that seems to be the latest version I can use on Win7 x64).

BTW if you are from CE too, is there any chance you being Slovak, Czech or Polish - in that case I would understand your native lng (other 2 options would be Austrian or Hungarian)? I am just being curious - you do not have to answer, of course. :-)

Thanks for the explanation about the image in posts too (I did not notice that line at the bottom of the post window at first).

UPDATE

So I went "straight forward headlong" and downloaded JCuda 10.2 beforehand (was too thrilled to wait for your answer), hoping it would be the right version for my CUDA Toolkit 10.2 - trying to run the 1st file called "JCudaDriverSimpleJOGL.java" from the official JCuda examples zip file (I guess I downloaded it from somewhere here), only to find out there is still something wrong, probably with CUDA itself, as it complains it cannot find some specific file in the CUDA Toolkit directory, namely:

aug 17, 2021 2:13:46 PM jcuda.samples.utils.JCudaSamplesUtils invokeNvcc
INFO: Creating ptx file for src/main/resources/kernels/JCudaDriverSimpleGLKernel.cu
aug 17, 2021 2:13:46 PM jcuda.samples.utils.JCudaSamplesUtils invokeNvcc
INFO: Executing
nvcc -m64 -ptx src/main/resources/kernels/JCudaDriverSimpleGLKernel.cu -o src/main/resources/kernels/JCudaDriverSimpleGLKernel.ptx
aug 17, 2021 2:13:48 PM jcuda.samples.utils.JCudaSamplesUtils invokeNvcc
SEVERE: nvcc process exitValue 2
aug 17, 2021 2:13:48 PM jcuda.samples.utils.JCudaSamplesUtils invokeNvcc
SEVERE: errorMessage:
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x000007fee080f4a0, pid=8000, tid=0x0000000000002074
#
# JRE version: Java(TM) SE Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\crt/host_config.h(236): fatal error C1083: Cannot open include file: 'crtdefs.h': No such file or directory

# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.181-b13 mixed mode windows-amd64 compressed oops)
# Problematic frame:
# C  [nvcuda.dll+0x22f4a0]
#
# Failed to write core dump. Minidumps are not enabled by default on client versions of Windows
#
aug 17, 2021 2:13:48 PM jcuda.samples.utils.JCudaSamplesUtils invokeNvcc
# An error report file with more information is saved as:
SEVERE: outputMessage:
# Z:\_JAVA_\JCuda\hs_err_pid8000.log
JCudaDriverSimpleGLKernel.cu

#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
Exception in thread "AWT-EventQueue-0-AWTAnimator#00" com.jogamp.opengl.util.AnimatorBase$UncaughtAnimatorException: java.lang.RuntimeException: com.jogamp.opengl.GLException: Caught CudaException: Could not create ptx file: c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\crt/host_config.h(236): fatal error C1083: Cannot open include file: 'crtdefs.h': No such file or directory
 on thread AWT-EventQueue-0
	at com.jogamp.opengl.util.AWTAnimatorImpl.display(AWTAnimatorImpl.java:92)
	at com.jogamp.opengl.util.AnimatorBase.display(AnimatorBase.java:452)
	at com.jogamp.opengl.util.Animator$MainLoop.run(Animator.java:204)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: com.jogamp.opengl.GLException: Caught CudaException: Could not create ptx file: c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\crt/host_config.h(236): fatal error C1083: Cannot open include file: 'crtdefs.h': No such file or directory
 on thread AWT-EventQueue-0
	at com.jogamp.common.util.awt.AWTEDTExecutor.invoke(AWTEDTExecutor.java:58)
	at jogamp.opengl.awt.AWTThreadingPlugin.invokeOnOpenGLThread(AWTThreadingPlugin.java:103)
	at jogamp.opengl.ThreadingImpl.invokeOnOpenGLThread(ThreadingImpl.java:201)
	at com.jogamp.opengl.Threading.invokeOnOpenGLThread(Threading.java:202)
	at com.jogamp.opengl.Threading.invoke(Threading.java:221)
	at com.jogamp.opengl.awt.GLCanvas.display(GLCanvas.java:505)
	at com.jogamp.opengl.util.AWTAnimatorImpl.display(AWTAnimatorImpl.java:81)
	... 3 more
Caused by: com.jogamp.opengl.GLException: Caught CudaException: Could not create ptx file: c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\crt/host_config.h(236): fatal error C1083: Cannot open include file: 'crtdefs.h': No such file or directory
 on thread AWT-EventQueue-0
	at com.jogamp.opengl.GLException.newGLException(GLException.java:76)
	at jogamp.opengl.GLDrawableHelper.invokeGLImpl(GLDrawableHelper.java:1327)
	at jogamp.opengl.GLDrawableHelper.invokeGL(GLDrawableHelper.java:1147)
	at com.jogamp.opengl.awt.GLCanvas$12.run(GLCanvas.java:1438)
	at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:301)
	at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:758)
	at java.awt.EventQueue.access$500(EventQueue.java:97)
	at java.awt.EventQueue$3.run(EventQueue.java:709)
	at java.awt.EventQueue$3.run(EventQueue.java:703)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:74)
	at java.awt.EventQueue.dispatchEvent(EventQueue.java:728)
	at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:205)
	at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:116)
	at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:105)
	at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:101)
	at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:93)
	at java.awt.EventDispatchThread.run(EventDispatchThread.java:82)
Caused by: jcuda.CudaException: Could not create ptx file: c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\crt/host_config.h(236): fatal error C1083: Cannot open include file: 'crtdefs.h': No such file or directory

	at jcuda.samples.utils.JCudaSamplesUtils.invokeNvcc(JCudaSamplesUtils.java:170)
	at jcuda.samples.utils.JCudaSamplesUtils.preparePtxFile(JCudaSamplesUtils.java:51)
	at jcuda.driver.gl.samples.JCudaDriverSimpleJOGL.initJCuda(JCudaDriverSimpleJOGL.java:338)
	at jcuda.driver.gl.samples.JCudaDriverSimpleJOGL.init(JCudaDriverSimpleJOGL.java:274)
	at jogamp.opengl.GLDrawableHelper.init(GLDrawableHelper.java:644)
	at jogamp.opengl.GLDrawableHelper.init(GLDrawableHelper.java:667)
	at com.jogamp.opengl.awt.GLCanvas$10.run(GLCanvas.java:1407)
	at jogamp.opengl.GLDrawableHelper.invokeGLImpl(GLDrawableHelper.java:1291)
	... 16 more
C:\Users\1\AppData\Local\NetBeans\Cache\12.4\executor-snippets\run.xml:111: The following error occurred while executing this line:
C:\Users\1\AppData\Local\NetBeans\Cache\12.4\executor-snippets\run.xml:94: Java returned: 1
BUILD FAILED (total time: 5 seconds)

OK, so I tried the next example file called "JCudaDriverSimpleLWJGL", and this time it complains about lwjgl not being added to the source JAR files, although it is there, also with its native dyn libs, as you can see in the pic at the bottom of this message. I had to use quite an old lwjgl version from 2013, as it was the last that contained org.lwjgl.opengl.GL20.glUniformMatrix4 (cos it is a required import for the example file "JCudaDriverSimpleLWJGL"), and I tested like 10 other versions (mainly lwjgl3). This is the actual error:

Exception in thread "AWT-EventQueue-0" java.lang.UnsatisfiedLinkError: no lwjgl in java.library.path
	at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
	at java.lang.Runtime.loadLibrary0(Runtime.java:870)
	at java.lang.System.loadLibrary(System.java:1122)
	at org.lwjgl.Sys$1.run(Sys.java:73)
	at java.security.AccessController.doPrivileged(Native Method)
	at org.lwjgl.Sys.doLoadLibrary(Sys.java:66)
	at org.lwjgl.Sys.loadLibrary(Sys.java:95)
	at org.lwjgl.Sys.<clinit>(Sys.java:112)
	at org.lwjgl.opengl.AWTGLCanvas.<clinit>(AWTGLCanvas.java:87)
	at jcuda.driver.gl.samples.JCudaDriverSimpleLWJGL.createCanvas(JCudaDriverSimpleLWJGL.java:287)
	at jcuda.driver.gl.samples.JCudaDriverSimpleLWJGL.<init>(JCudaDriverSimpleLWJGL.java:252)
	at jcuda.driver.gl.samples.JCudaDriverSimpleLWJGL$1.run(JCudaDriverSimpleLWJGL.java:111)
	at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:311)
	at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:758)
	at java.awt.EventQueue.access$500(EventQueue.java:97)
	at java.awt.EventQueue$3.run(EventQueue.java:709)
	at java.awt.EventQueue$3.run(EventQueue.java:703)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:74)
	at java.awt.EventQueue.dispatchEvent(EventQueue.java:728)
	at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:205)
	at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:116)
	at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:105)
	at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:101)
	at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:93)
	at java.awt.EventDispatchThread.run(EventDispatchThread.java:82)
BUILD SUCCESSFUL (total time: 2 seconds)

As for this specific error, interestingly enough, although you said I DO NOT HAVE TO EXTRACT THE NATIVE DLLs, it turned out I HAVE TO DO EXACTLY THAT, at least for the LWJGL native DLLs (and it also looks like that is the correct way to import/use native dyn libs in NetBeans, see here). Thus my addition of the path to the libs as VM Options was quite correct - once I re-added that option to the VM, this error disappears, tho it is replaced by the same error as for the 1st and 3rd example files (missing *.h file in the CUDA Toolkit dir)... but maybe you meant that I do not have to do that just for JCuda? Mess... more clarification needed from you.

I even tried the 3rd example file in a row, called "JCudaDriverVolumeRendererJOGL" and it gives basically the same error as the 1st example file about missing "crtdefs.h" file in CUDA Toolkit directory.

So 3 example files and none is working... but I guess I still have something missing here, so I will now kindly wait for your answer, if you would be willing to help.

These are the JARs I added to the project (I guess those are all that are needed according to NetBeans - no errors about missing files or code):

Waiting for your answer.

jcuda commented on June 26, 2024

So can you tell me which version of JCuda should I be aiming for with the CUDA Toolkit 10.2, please - perhaps JCuda v10.2?

Exactly. When you have CUDA X.Y.(Z), then you should use JCuda X.Y.Z.
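
As a rough illustration of that rule, here is a small sketch that compares only the X.Y part of two version strings. The class `CudaVersionMatch` and its methods are hypothetical helpers made up for this example; they are not part of JCuda.

```java
// Hypothetical helper, not part of JCuda: only the major.minor part is compared.
final class CudaVersionMatch {

    /** Returns "X.Y" for inputs like "X.Y.Z" or "X.Y". */
    static String majorMinor(String version) {
        String[] parts = version.trim().split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected X.Y or X.Y.Z, got: " + version);
        }
        return parts[0] + "." + parts[1];
    }

    /** True if the JCuda version and the CUDA version agree in X.Y. */
    static boolean matches(String jcudaVersion, String cudaVersion) {
        return majorMinor(jcudaVersion).equals(majorMinor(cudaVersion));
    }

    public static void main(String[] args) {
        System.out.println(matches("11.2.0", "10.2")); // false: the mismatch from this thread
        System.out.println(matches("10.2.0", "10.2")); // true: the combination that fits
    }
}
```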

...so does this mean those DLLs are unpacked from the JAR automatically when they are needed by the respective JCuda classes

Accessing DLLs from Java is (and always was) difficult: Users needed the right DLL (or SO on Linux, or DYLIB on macOS), and they had to put them into the right place, and set the java.library.path and all that. This was complicated and error-prone, and in most cases caused some form of UnsatisfiedLinkError. (And this error could have many different reasons, and it was hard to find out the exact reason).

On top of that: Native libraries (like DLLs and DYLIBs) cannot be distributed via Maven Central. So people needed a different solution. And a common solution nowadays is exactly that: The native libraries are packed into a JAR. At runtime, the native library is unpacked into the "TEMP" directory. And from there, it is loaded. In JCuda, this is done with the LibUtils class. Other libraries use a similar approach.
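
To make the "unpack to TEMP, then load" idea a bit more concrete, here is a minimal sketch. It is not the actual LibUtils code (which is more elaborate); the class name, the method names and the resource path in the usage note are assumptions made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch of the general technique, not JCuda's implementation.
final class NativeUnpackSketch {

    /** Copies the stream into a file with the given name inside a fresh temp directory. */
    static Path unpackToTemp(InputStream in, String fileName) {
        try {
            Path dir = Files.createTempDirectory("native-libs");
            dir.toFile().deleteOnExit();
            Path target = dir.resolve(fileName);
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            // Best effort cleanup; a DLL that is still loaded may not be deletable on Windows.
            target.toFile().deleteOnExit();
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Finds the library as a classpath resource (e.g. inside a JAR), unpacks it and loads it. */
    static void loadFromClasspath(String resourcePath, String fileName) {
        try (InputStream in = NativeUnpackSketch.class.getResourceAsStream(resourcePath)) {
            if (in == null) {
                throw new UnsatisfiedLinkError("Resource not found: " + resourcePath);
            }
            // System.load expects an absolute path, unlike System.loadLibrary.
            System.load(unpackToTemp(in, fileName).toAbsolutePath().toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A call could then look like `NativeUnpackSketch.loadFromClasspath("/lib/example.dll", "example.dll")`, where the resource path is purely hypothetical. Because the library is loaded via its absolute path, the java.library.path does not come into play for this library itself.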

Since then, the number of UnsatisfiedLinkErrors (for me) has decreased. As long as people use the right JCuda version for the installed CUDA version, it usually just works.

[about PATH] : ... unfortunately, no - on my Win7 Ult x64 it was not set automatically by the CUDA Toolkit installer at all

That's an issue with the installer then. And ... of course, Win7 is not entirely up to date. Maybe they are no longer trying to support that properly. (I mainly updated from Win8 to Win10 because CUDA was no longer available for Win8 - ouch...)

BTW if you are from CE too, is there any chance you being Slovak, Czech or Polish - in that case I would understand your native lng (other 2 options would be Austrian or Hungarian)? I am just being curious - you do not have to answer, of course. :-)

I'm from Germany, so I guess, "Austrian" would count (или по-русски, but I don't really know that very well...). But usually, English is the smallest common denominator for communication on GitHub.

UPDATE

... trying to run the 1st file called "JCudaDriverSimpleJOGL.java" from the official JCuda examples

The interoperability with JOGL is one of the more "complex" examples. I had seen the JOGAMP libs in your images, so this might be the "final" use-case that you're aiming at, but maybe you should try some of the basic examples first. Until then, only a short hint: The message "UnsatisfiedLinkError: no lwjgl in java.library.path" is the "usual" one, related to all the difficulties of native libraries that I mentioned above. A workaround can be to do something like

System.setProperty("org.lwjgl.librarypath", "C:\\PathThatContainsTheLwjgl64Dll");

in your main, but for LWJGL, there's the additional difficulty of LWJGL2 vs. LWJGL3. I'd recommend to try out the JOGL example, that might be easier.
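
When chasing a "no ... in java.library.path" error, it can help to print what the JVM actually sees. The following is only a diagnostic sketch; `LibraryPathCheck` and `findCandidates` are hypothetical names, and the library name "lwjgl" is taken from the error message above.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper: lists where a native library would be found.
final class LibraryPathCheck {

    /** Returns all files in the given path list that match the platform name of the library. */
    static List<File> findCandidates(String libName, String libraryPath) {
        // Maps "lwjgl" to the platform-specific file name, e.g. "lwjgl.dll" on Windows.
        String fileName = System.mapLibraryName(libName);
        List<File> hits = new ArrayList<>();
        for (String dir : libraryPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String path = System.getProperty("java.library.path", "");
        System.out.println("java.library.path = " + path);
        System.out.println("lwjgl found in: " + findCandidates("lwjgl", path));
    }
}
```

An empty result means that System.loadLibrary has nothing to pick up from the current path, which fits the UnsatisfiedLinkError. A 32-bit DLL on a 64-bit JVM would still fail even when the file is found.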

The "core" of the error message for the JOGL example is

c:\program files\nvidia gpu computing toolkit\cuda\v10.2\include\crt/host_config.h(236): fatal error C1083: Cannot open include file: 'crtdefs.h': No such file or directory

Long story short: It tries to compile the CUDA code at runtime into a PTX ("CUDA assembler") file, using the NVCC, and this does not work (and this may have thousands of reasons).

As a first, quick test: The most basic example that does not require any external compilation or PTX files is the sample at https://github.com/jcuda/jcuda-samples/blob/master/JCudaSamples/src/main/java/jcuda/nvrtc/samples/JNvrtcVectorAdd.java (I think that the nvRTC was already available in CUDA 10.2...). If that works, that would be a first step.

An aside: If you think that JCuda is too complicated to use or set up: https://github.com/bytedeco/sample-projects/tree/master/cuda-vector-add-driverapi offers an infrastructure that does not require end-users to install CUDA. They are distributing the actual NVIDIA CUDA libraries via Maven Central. (We're talking about >1GigaByte JARs here, but ... for the end-user, it may be easier...)

fafayqa commented on June 26, 2024

Firstly: how in the world is Germany CENTRAL EUROPE???? :-))) As far as I know it is west... just to clarify why I was asking you, cos you said you are from CE too (just to be fair, as you told me your nationality: I am Slovak, thus the Russian Azbuka is not the correct one for us: we use the Latin alphabet just like you do, tho we are Slavic as Russians are + the older of us, incl. myself, understand it as we learned it in school when we had socialism here from 1948-1989).

WEST: everything westward of the CE states :-D
CE = Czechia, Slovakia, Poland, Austria, Hungary (these are also known as Visegrad 4 + Austria)
EAST: 3 Baltic states, Ukraine, Belarus, Moldova, Bulgaria, Romania + Russia (tho it is actually a Eurasian state spreading across both Europe and Asia, as you prolly know)

ex-Yugoslavia states are somewhere in between CE and EAST, hard to tell :-)

Now to the main thing...

I just tested that "JNvrtcVectorAdd.java", and also "JNvrtcLoweredNames" in the same dir: both work quickly and flawlessly without any problem - thanks for the easy tip, I will now explore the code, trying to understand the implementation (fingers crossed, see I am not a JAVA genius)!

All examples from these do work:

jcuda.runtime.samples
jcuda.jcusolver.samples
jcuda.jcurand.samples
jcuda.jcufft.samples
jcuda.jcublas.samples
jcuda.driver.samples mostly works, except these 5: JCudaAllocationInKernel, JCudaConstantMemoryExample, JCudaDynamicParallelism, JCudaReduction, JCudaVectorAdd

On the other hand, NONE of examples from these works at all:

jcuda.driver.gl.samples
jcuda.jcudnn.samples

As for the other stuff about the native dyn libs: at the moment it is a bit of a mess for me, like my head is going to explode, but I will go thru it several times hoping I understand exactly what has been said in your post, as I am not sure now if I can use JCuda for my cause (I want to port code of my JAVA renderer so that it would compute most of the code on GPU cores) or if it is out of the question for either being too complicated to do or simply not possible right away because of the dyn libs errors and stuff...

In fact at the moment I ended up with 3 possible ways:

JCuda
Rootbeer
TornadoVM

I also successfully tried Aparapi3, but it is of so little use unless one does zillions of mega-long array operations that it is not on my radar anymore, tho its implementation is mega-easy and quite straightforward. Or I simply did not understand what it can do. :-)

I am in the process of trying to test and understand each, and then choosing the right one. Mind you: I have no previous experience with GPU integration in JAVA, so... maybe that is out of the question for me anyway considering my actual JAVA programming skills, we'll see.

jcuda commented on June 26, 2024

Firstly: how in the world is Germany CENTRAL EUROPE???? :-)))

Well, I find it a bit old-fashioned to "classify" this, and as you said: Sometimes it's hard to tell. And I just mentioned Russian because I've learned a few words of Russian in school (even though I can barely speak and hardly read it), and I know that many people in "the eastern part of Central Europe" also learned that in school. (Back in the days when some people considered East Germany to be part of "Eastern Europe" :D )

jcuda.driver.samples mostly works, except these ...

Some of these may use features that are not available on older GPUs. But for example, the JCudaVectorAdd is basically the "Hello World of GPU programming", and it should work. When it does not work, then the reason is (as it can be seen in the error message that you posted) :

The actual CUDA kernel is contained in a .cu file
In order to load and execute this kernel, it has to be a PTX- or CUBIN file ("CUDA assembler" or "CUDA binary")
In order to convert the .cu file into a .ptx file, you have to compile it with the NVCC (NVIDIA CUDA C Compiler)
In the samples, this is done at runtime. But apparently, you have not set up the NVCC (which requires Visual Studio, or another C compiler, to be available).

Note that this "runtime compilation with the NVCC" is only done in the samples, for convenience. In a "real application", you could

Compile the .cu file into a .ptx file locally
Ship the PTX file with your application
Load the PTX file directly (without having to call the NVCC first)
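
The three steps above can be sketched roughly as follows. This is not the code from JCudaSamplesUtils; `PtxPreparation` and its methods are hypothetical names, and only the nvcc arguments are copied from the log output posted earlier. Loading the resulting file (in JCuda, via cuModuleLoad) is left out so that the sketch runs without CUDA.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: prefer a shipped PTX file, call nvcc only as a fallback.
final class PtxPreparation {

    /** Builds the same nvcc call that appears in the sample log output. */
    static List<String> nvccCommand(String cuFile, String ptxFile) {
        return Arrays.asList("nvcc", "-m64", "-ptx", cuFile, "-o", ptxFile);
    }

    /** True if the PTX file exists and is not older than the .cu source (if there is one). */
    static boolean ptxIsUpToDate(File cu, File ptx) {
        return ptx.isFile() && (!cu.isFile() || ptx.lastModified() >= cu.lastModified());
    }

    public static void main(String[] args) throws Exception {
        File cu = new File("src/main/resources/kernels/JCudaDriverSimpleGLKernel.cu");
        File ptx = new File("src/main/resources/kernels/JCudaDriverSimpleGLKernel.ptx");
        if (ptxIsUpToDate(cu, ptx)) {
            // Nothing to compile: the application can ship and load this file as-is.
            System.out.println("Using existing PTX file: " + ptx);
        } else {
            // Fallback for development machines: needs nvcc on the PATH and a host C/C++ compiler.
            int exit = new ProcessBuilder(nvccCommand(cu.getPath(), ptx.getPath()))
                    .inheritIO().start().waitFor();
            System.out.println("nvcc exit code: " + exit);
        }
    }
}
```

With this arrangement, end users only ever hit the first branch, so a missing compiler setup (like the 'crtdefs.h' error above) affects only the machine where the PTX file is built.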

Fortunately, NVIDIA offers the "nvRTC" (introduced with CUDA 7, so it is part of CUDA 10.2), which allows you to compile the .cu code at runtime, but without needing the NVCC or any other external compiler. (This is what is done in the JNvrtc examples).

On the other hand, NONE of examples from these works at all:

For JCudnn, you need the cuDNN DLLs from NVIDIA (and you have to be a registered developer to get them). This should not matter for you.

For the GL-related examples: This does matter for you, obviously. The issue of native libraries is one thing, but once you have the setup for a basic LWJGL- or JOGL-application right, that should be fine. The problem is then only the same as for the other samples: The GL samples try to compile the JCudaDriverSimpleGLKernel.cu (to create the PTX file).

(I could compile that PTX file for you, so that you can try it out, or you could set up Visual Studio and the NVCC, or you could try to change the sample to use the nvRTC (which may be more effort), but I don't know the best path forward here).

I want to port code of my JAVA renderer so that it would compute most of the code on GPU cores
...

I think that Rootbeer was rather experimental, and has not been actively maintained. I cannot say much about TornadoVM (although having a closer look at this is on my TODO list). JCuda is one option, but you have to be aware of the fact that users will have to install the matching CUDA Toolkit. JavaCPP may be a simpler option here.

But more generally: Is that "Java renderer" available somewhere? I wonder whether this is in fact something that could benefit from computations on the GPU. I think that some people may have wrong expectations about the use-cases and benefits of GPU computing. I once wrote an answer on stack overflow, trying to explain the cases where the GPU really makes sense, maybe that's helpful here.

fafayqa commented on June 26, 2024

Ha, so did I understand you right that you're from the ex-DDR? If so: as a child, sometime around 1985 or 1986 I guess, I was in Wernemunde (or however it was called, such a small town - they even have their tiny zoo there!), at an international pioneers camp in the woods. We then went to see the sea in Rostock (I remember even today, as a 45 yrs old guy, how I was afraid when visiting one big ship where they led us to its bottom decks, literally an Aliens-movie-feeling :-D )

BTW an interesting side note dealing with the German lng: because I live some 40km from the Austrian border, we as children were pretty used to listening to all sorts of Austrian stuff going on-air, like their TV, radio etc. It is a well known fact that when children are exposed for a long period of time to something, let's say a foreign lng, then although they do not have any knowledge of that lng, they become so used to it that they actually begin to understand it despite not speaking it. And, well, that basically happened to us: I still understand the German lng somewhat, tho not nearly as well as when I was a kid in the 80's. :-D Even today I have this strange obsession of speaking pseudo-German to ppl, which is made of real German words, but the sentence context is absolutely dadaistic nonsense... I have a friend in Germany to whom I write in this way from time to time, and he can hardly catch his breath as he is almost laughing to death! :-D

Oh, and how did you do that text citation? I again cannot seem to find option for that.

OK, back to our stuff again now...

So let's forget all these example-file issues, and let's directly talk about some example of my code that I would post here, and you could tell me if the GPU can actually compute that block or if I am completely misled about what the GPU can do for me in JAVA as far as computing a code block goes, OK?

As for Rootbeer: yes, I was trying to somehow contact the guy but without success, his last addition was around 2016, which is also telling a lot, so most probably I will give up on it too.

But more generally: Is that "Java renderer" available somewhere?
No, it is not available publicly (tho it was yrs ago in its early stages): it is my internal tool for rendering virtual LEGO models into photorealistic images, like this one I made some time ago for the community over at IG:

Actually the reason I think the GPU could help me is cos another LEGO rendering app (not Java - it is a "normal" Win32 app called Stud.io) is doing exactly that - using GPU CUDA for rendering, where stuff that would take several hrs (2-5 hrs depending on the complexity of the scene in FullHD res, 1920x1080 pixels) is done literally in a few minutes (like 5-15 mins). That is why I am so interested in this transformation of my code, you see.

Now with all that being said: could you, please, look at the JAVA file (in the ZIP file at the bottom of this post) and tell me if it can be ported/modified so that it would be computed on GPU cores using JCuda instead of on the 6-core CPU? You know: if not, we can stop this discussion (and possibly unnecessary bothering of you), cos it would mean JCuda cannot be used for what I want to use it for.

There are zillions of other classes in my JAVA rendering app (based on the great, long-abandoned renderer called Sunflow) that basically just prepare everything for the render, and this specific class is the one that actually does the render part, so if nothing else, I would like to port at least this one specific class to JCuda code (if possible).

As you will see, what it basically does in the beginning - where I imagine the main modification would occur - is that it checks how many cores the CPU has (6 in my case, but I manually changed it to 5 so my system would not become clogged with all the computing, so that I have some room for other normal operations outside the rendering process), "cuts" the image into 64x64 pixel blocks and lets each core compute the pixels for one of them. Once a core is done, it moves on to the next 64x64 rectangle in line, and so on until the whole image is rendered - it means five 64x64 rectangles are computed at a time.
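
Just to make the scheme described above explicit, here is a stripped-down sketch of it. This is not the code from BucketRenderer; `BucketTiling` and `Bucket` are hypothetical names, and the "rendering" of a bucket is replaced by simply counting its pixels.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the bucket scheme, not the actual renderer code.
final class BucketTiling {

    /** One bucket: the pixel rectangle starting at (x, y) with size w x h. */
    static final class Bucket {
        final int x, y, w, h;
        Bucket(int x, int y, int w, int h) { this.x = x; this.y = y; this.w = w; this.h = h; }
    }

    /** Cuts the image into size x size buckets; buckets at the border may be smaller. */
    static List<Bucket> buckets(int imageW, int imageH, int size) {
        List<Bucket> result = new ArrayList<>();
        for (int y = 0; y < imageH; y += size) {
            for (int x = 0; x < imageW; x += size) {
                result.add(new Bucket(x, y, Math.min(size, imageW - x), Math.min(size, imageH - y)));
            }
        }
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Bucket> all = buckets(1920, 1080, 64);
        System.out.println(all.size() + " buckets"); // 30 columns x 17 rows = 510 buckets

        // Five worker threads, so at most five buckets are processed at a time.
        ExecutorService pool = Executors.newFixedThreadPool(5);
        AtomicLong pixels = new AtomicLong();
        for (Bucket b : all) {
            pool.execute(() -> pixels.addAndGet((long) b.w * b.h)); // stand-in for rendering
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println(pixels.get() + " pixels"); // 2073600 pixels
    }
}
```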

What I am imagining JCuda should/would/could do for me here is to let all these 64x64 blocks be computed on all available GPU cores, so we are talking about thousands at once (NVIDIA GeForce GTX 750 5GB DDR5 here) - am I right with this assumption?

So please, look at the file just quickly and tell me whether you think it can or cannot be ported/modded for JCuda usage, if you can (fingers crossed you can).

BucketRenderer.zip

jcuda commented on June 26, 2024

Ha, so did I understand you right that you're from the ex-DDR?

Actually not (but I get asked that a lot when mentioning Russian). There had been a few schools in the West where it was possible to learn Russian, and that was in the early 90s, and I thought: "Hey, that might be useful one day". That was wrong in some way. I never really used it. But I think that learning other languages can be enriching in the best sense. (And even when you only know a few words for smalltalk, native speakers usually greatly appreciate these efforts).

Oh, and how did you do that text citation? I again cannot seem to find option for that.

These citations can be created by

> such a 'greater than' sign at the beginning of the line

Which is also inserted by hitting the "Insert a quote" button (the fourth button in the toolbar).

That image looks impressive. It looks like it was scanned from somewhere, but I couldn't have told whether this was a photo or a rendering. The shadows and the light ("caustics") from the transparent elements look absolutely realistic.

I have quickly scrolled over the attached file. It looks like there might be some operations that could benefit from the GPU (e.g. the bilerp stuff). But I'll have to take a closer look: All that depends on how well the actual computations can be "flattened" out into plain arrays. For example, when there is an extremely time-critical code path that, in the innermost loop, does some call like scene.getRadiance, then all boils down to the question: What does getRadiance do? (This comes from the "engine", and cannot just be executed on the GPU).
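
To illustrate what "flattening into plain arrays" means here: GPU kernels work on raw memory, so object structures have to be packed into primitive arrays before they can be copied to the device. A minimal sketch, where `Vec3` is a made-up stand-in and not a class from the renderer or from Sunflow:

```java
import java.util.Arrays;

// Hypothetical sketch: packing objects into one float[] that could be copied to the GPU.
final class FlattenSketch {

    /** Made-up object-style data, standing in for classes of the rendering engine. */
    static final class Vec3 {
        final float x, y, z;
        Vec3(float x, float y, float z) { this.x = x; this.y = y; this.z = z; }
    }

    /** Packs the vectors as x0,y0,z0,x1,y1,z1,... into a single array. */
    static float[] flatten(Vec3[] points) {
        float[] out = new float[points.length * 3];
        for (int i = 0; i < points.length; i++) {
            out[3 * i] = points[i].x;
            out[3 * i + 1] = points[i].y;
            out[3 * i + 2] = points[i].z;
        }
        return out;
    }

    public static void main(String[] args) {
        Vec3[] points = { new Vec3(1, 2, 3), new Vec3(4, 5, 6) };
        System.out.println(Arrays.toString(flatten(points))); // [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    }
}
```

Data like this is easy to move to the device. A call that walks through arbitrary engine objects is the hard part, because everything it touches would need a flat representation as well.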

I'll definitely try to have a closer look at this ASAP, maybe during the weekend, and try to match some of the code with https://github.com/fpsunflower/sunflow to get a better idea here.

(An aside: If this can benefit from the GPU, then you might also consider to just use OpenCL. It has some advantages compared to CUDA (and I mentioned that in the stack overflow answer that I linked to), and may keep it far easier to use, and far more portable).

from jcuda.

fafayqa commented on June 26, 2024

That image looks impressive. It looks like it was scanned from somewhere, but I couldn't have told whether this was a photo or a rendering. The shadows and the light ("caustics") from the transparent elements look absolutely realistic.

Thank you! But to be honest, the final touch is always done in Photoshop, in the sense that I just kind of equalize the shadow depth (you know: it can reveal things that would normally be hidden too deep in the shadows), brighten up the whites (it can add that nice "wow" effect that livens up the colors and lifts them out of the muddiness) etc. Just for comparison, this is the original, unedited direct render without any Photoshop editing at all:

What does getRadiance do?

Radiance, according to the method's description: "Get the radiance seen through a particular pixel". I guess it actually affects the color with the lights from the scene (not quite sure, as it comes from the original Sunflow code and I had no reason to search too much for its meaning anyway :-D ).

/**
 * Get the radiance seen through a particular pixel
 *
 * @param istate intersection state for ray tracing
 * @param rx pixel x coordinate
 * @param ry pixel y coordinate
 * @param lensU DOF sampling variable
 * @param lensV DOF sampling variable
 * @param time motion blur sampling variable
 * @param instance QMC instance seed
 * @return a shading state for the intersected primitive, or
 *         <code>null</code> if nothing is seen through the specified point
 */
public ShadingState getRadiance(IntersectionState istate, float rx, float ry, double lensU, double lensV, double time, int instance, BusuflGUI GUI) {
    
    if (bakingPrimitives == null) {
        Ray r = camera.getRay(rx, ry, imageWidth, imageHeight, lensU, lensV, time);
        return r != null ? lightServer.getRadiance(rx, ry, instance, r, istate, GUI) : null;
        
    } else {
        Ray r = new Ray(rx / imageWidth, ry / imageHeight, -1, 0, 0, 1);
        traceBake(r, istate, GUI);
        if (!istate.hit()) {
            return null;
        }
        ShadingState state = ShadingState.createState(istate, rx, ry, r, instance, lightServer);
        bakingPrimitives.prepareShadingState(state);
        if (bakingViewDependent) {
            state.setRay(camera.getRay(state.getPoint()));
        } else {
            Point3 p = state.getPoint();
            Vector3 n = state.getNormal();
            // create a ray coming from directly above the point being shaded
            Ray incoming = new Ray(p.x + n.x, p.y + n.y, p.z + n.z, -n.x, -n.y, -n.z);
            incoming.setMax(1);
            state.setRay(incoming);
        }
        lightServer.shadeBakeResult(state, GUI);
        return state;
    }
}

I'll definitely try to have a closer look at this ASAP, maybe during the weekend, and try to match some of the code with https://github.com/fpsunflower/sunflow to get a better idea here.

That is great news - appreciated! But, just a hint: I modified the original code so much that I am not sure whether your study of the original files would still be compatible with my work, though I have a feeling it would be, as the basics of the BucketRenderer are essentially the same. So if you made something out of it and it turned out to be no longer compatible with my files, I could still try to adapt it slightly to fit my needs (or at least I hope so).

Or, if you care, I could send you the src files of my renderer if you give me your e-mail or something, provided we make a deal that you will keep it just for yourself. I know you most probably do not care for it, but one never knows (I do not want to post it publicly).

(An aside: If this can benefit from the GPU, then you might also consider to just use OpenCL. It has some advantages compared to CUDA (and I mentioned that in the stack overflow answer that I linked to), and may keep it far easier to use, and far more portable).

I am looking for ANY POSSIBLE WAY to use GPU cores instead of CPU ones for computing this thing, so I would take the one that has the easier implementation. Anyway, at this moment I have no clue about OpenCL or JCuda, so they are at the same level for me right now - I am just waiting for what you would say about it.


jcuda commented on June 26, 2024

The question about the getRadiance was only a specific example. The key points are:

Where is the bottleneck? and
Can this code path be mapped to the GPU sensibly?

For example, when 90% of all time is spent in some tight loop like

for (every pixel(x,y)) {
    pixel[x][y] = getRadiance(x,y);
}

and getRadiance is a complicated method that is offered by the rendering engine and cannot be re-implemented on the GPU, then that's difficult: Even if you speed up all other computations by a factor of 100, you'd reduce the time that is required for rendering a single image from 10 minutes to slightly more than 9 minutes.
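The arithmetic behind that estimate can be written down as a minimal sketch. The class and method names are made up for this illustration, and the 90% / factor-100 figures are just the example numbers from above, not measurements:

```java
// Hypothetical helper, only meant to illustrate the estimate above.
final class AmdahlSketch {

    /**
     * Returns the total time that remains when only the part of the work
     * outside the non-portable method is accelerated.
     *
     * @param totalMinutes  the original total time
     * @param fixedFraction the fraction of time spent in the part that
     *                      cannot be accelerated (e.g. 0.9 for 90%)
     * @param speedup       the speedup factor for everything else
     */
    public static double acceleratedMinutes(
            double totalMinutes, double fixedFraction, double speedup) {
        // This part stays exactly as slow as before
        double fixedPart = totalMinutes * fixedFraction;
        // Only this part benefits from the speedup
        double acceleratedPart = totalMinutes * (1.0 - fixedFraction) / speedup;
        return fixedPart + acceleratedPart;
    }

    public static void main(String[] args) {
        // 10 minutes total, 90% in the fixed part, rest 100 times faster:
        // roughly 9.01 minutes remain
        System.out.println(acceleratedMinutes(10.0, 0.9, 100.0));
    }
}
```

Even an infinite speedup of the remaining 10% could not push the total below 9 minutes in this example, which is why finding the real bottleneck comes first.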

My gut feeling is that there is some potential for optimizations even on the CPU, but for that, getting a better idea about the performance-critical paths would be necessary.

Did you already do some profiling run, maybe with jVisualVM or JDK Mission Control, and do you know where most of the time is spent? (Beyond the high-level answer, which likely is: "Exactly the parts that you tried to accelerate by using multiple threads...").


fafayqa commented on June 26, 2024

So I ran the profiling with jVisualVM; here are the pictures from a rather small test render.

This is the final image:

These are the dimensions & quality settings in my BUSUFL:

...and these are the stats from the jVisualVM on CPU usage (order by the highest CPU time) - it took 40 seconds to render with the final HQ settings:

It also looks to me like the main workload is really done in that BucketRenderer class and getRadiance().


jcuda commented on June 26, 2024

Sorry, I didn't yet have the chance to have a closer look at this during the weekend, but will try to do this ASAP.

Until then: From the screenshot, I wonder why you think that getRadiance is the culprit. When you press "Snapshot" in jVisualVM, then the captured state can be analyzed further. Specifically, there's then a tab at the bottom where you can select a "combined" view (combining the call tree and the hot paths) that might already give a better idea of where the time is spent.


fafayqa commented on June 26, 2024

Until then: From the screenshot, I wonder why you think that getRadiance is the culprit. When you press "Snapshot" in jVisualVM, then the captured state can be analyzed further. Specifically, there's then a tab at the bottom where you can select a "combined" view (combining the call tree and the hot paths) that might already give a better idea of where the time is spent.

Ah, I did not know that - I used jVisualVM only once before I guess, so there is a lot I still do not know about it - I will have to check that, thanks for the tip!

So, these are the new tests with the combined view for each thread and stuff, and now it looks to me that the heaviest computing is done in all sorts of intersecting. Unfortunately, AFAICT, upon checking the code of those classes, there are basically no for() loops that could be parallelised (well, there are some, but the number of iterations is like 2, which would save almost no render time at all I think - right?), thus I guess (having a bad feeling) there's not much chance of/worth in trying to make it GPU computed, right? :-(

In fact, after all of this talk with you I was able to further cut the overall rendering time from, let's say, 5+ hrs for some really complex project (material/shader wise) to something like 3 hrs 45 mins, yet that is still not enough (as I said, I saw that the Stud.io LEGO editor/renderer from Bricklink, using GPU CUDA, is able to do the same in about 5 or 10 minutes, I do not remember exactly)!


jcuda commented on June 26, 2024

Sorry for not providing soo much help here, but another hint: You can now "zoom" into the calls, by double-clicking the Hot Spots (NullAccelerator#intersect here).

I already had a short look at this method (because it already had the largest "Self Time" in the first screenshot that you posted), but only glanced at it in the GitHub source code view. What confused me was that it is called Null-Accelerator, and this only does a for-loop: https://github.com/fpsunflower/sunflow/blob/15fa9c6cc6729934181bb877e67f1d1c13679f89/src/org/sunflow/core/accel/NullAccelerator.java#L22 - which means that it does not use any acceleration structure at all!
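To illustrate why that matters, here is a minimal sketch of what such a linear scan amounts to. It is not the Sunflow code: the sphere primitives, the flat arrays, and the name nearestHit are assumptions made only for this example. Every ray is tested against every primitive, so the cost per ray grows linearly with the number of primitives:

```java
// Hypothetical brute-force intersection, only to illustrate a linear scan.
final class LinearScanSketch {

    /**
     * Tests a ray against all spheres and returns the index of the nearest
     * sphere that is hit, or -1 if nothing is hit. The ray direction
     * (dx, dy, dz) is assumed to be normalized.
     */
    public static int nearestHit(
            float[] cx, float[] cy, float[] cz, float[] radius,
            float ox, float oy, float oz,
            float dx, float dy, float dz) {
        int nearest = -1;
        float nearestT = Float.POSITIVE_INFINITY;
        // No acceleration structure: one intersection test per primitive
        for (int i = 0; i < radius.length; i++) {
            // Vector from the sphere center to the ray origin
            float lx = ox - cx[i];
            float ly = oy - cy[i];
            float lz = oz - cz[i];
            float b = lx * dx + ly * dy + lz * dz;
            float c = lx * lx + ly * ly + lz * lz - radius[i] * radius[i];
            float disc = b * b - c;
            if (disc < 0) {
                continue; // the ray misses this sphere
            }
            // Distance to the closer of the two intersection points
            float t = -b - (float) Math.sqrt(disc);
            if (t > 0 && t < nearestT) {
                nearestT = t;
                nearest = i;
            }
        }
        return nearest;
    }
}
```

A spatial structure like a KD-tree or a bounding volume hierarchy is usually there to avoid most of these per-primitive tests, which is why the choice of accelerator may matter more than any GPU port.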

Browsing through the related code led to https://github.com/fpsunflower/sunflow/blob/15fa9c6cc6729934181bb877e67f1d1c13679f89/src/org/sunflow/PluginRegistry.java#L235 and it appears that there are other acceleration structures possible, but I'll have to examine closer where these are actually used or created - maybe it's possible to let it use one of the other acceleration structures, via some command line parameter or config file, or by selecting it somewhere in the UI of which you posted a screenshot...?


fafayqa commented on June 26, 2024

Sorry for not providing soo much help here, but another hint: You can now "zoom" into the calls, by double-clicking the Hot Spots (NullAccelerator#intersect here).

OK, will explore that later...thanx for another hint! ;-)

I already had a short look at this method (because it already had the largest "Self Time" in the first screenshot that you posted), but only glanced at it in the GitHub source code view. What confused me was that it is called Null-Accelerator, and this only does a for-loop: https://github.com/fpsunflower/sunflow/blob/15fa9c6cc6729934181bb877e67f1d1c13679f89/src/org/sunflow/core/accel/NullAccelerator.java#L22 - which means that it does not use any acceleration structure at all!

Yes, that is correct - I never ever had any reason to dig into these parts of the code as I never needed to before + YES, THERE ARE OTHER ACCELERATION OPTIONS, as it seems (see below).

Browsing through the related code led to https://github.com/fpsunflower/sunflow/blob/15fa9c6cc6729934181bb877e67f1d1c13679f89/src/org/sunflow/PluginRegistry.java#L235 and it appears that there are other acceleration structures possible, but I'll have to examine closer where these are actually used or created - maybe it's possible to let it use one of the other acceleration structures, via some command line parameter or config file, or by selecting it somewhere in the UI of which you posted a screenshot...?

Well, I can actually hardcode the use of a specific accelerator into the code itself, as I see no reason to add this option to the BUSUFL GUI: only the quickest one will be used once and for all, you know, at least for testing, if you will (though I have no clue what it would do, let's see - remember: this is the original Sunflow code, not my addition nor changed by me in any significant way so far, so some experimentation will be needed).

This is the screenshot of the available accelerator classes + as I searched the code for usages of the NullAccelerator, I found out there is actually only one place where it is decided which one to use - the class "AccelerationStructureFactory". So I can simply edit that to some exact value for testing, and that would be all there is to changing to some other accel mode, I guess.

UPDATE

Already did some simple quick tests: it seems one cannot use just any of the accelerators "just-like-that" on a whim. When I did that, every single one ended up in a NullPointerException. It seems like the choice is mainly based on the n value. The only other accelerator that worked was KDTree, BUT only in cases when I set it to auto mode, where it is decided which one to use depending exactly on that n value.

Below, I added println() to check which one is used for a very simple test project, like 8 bricks.


jcuda commented on June 26, 2024

OK, I tried out Sunflow now actually, and ... darn, that's some cool application. I'm surprised that I wasn't aware of that, considering that it has been there for at least 15 years. I ran some of the examples, and they ran out of the box, just so: "clone - compile - start". (Imagine that was written in a different programming language than Java - reviving that could be a really daunting task...)

I also did a quick test with the jVisualVM Profiler (instead of the Sampler), and it may allow zooming in more closely on the hotspots (but at a tremendous overhead, of course). Where exactly most of the time is spent depends on ... well, the actual scene, because there are all these specialized classes for Image Based Lighting or different Shaders and all that.

Is it possible to share one of the scenes (.sc files) that you are testing with, or are they incompatible with the master state of Sunflow?


fafayqa commented on June 26, 2024

OK, I tried out Sunflow now actually, and ... darn, that's some cool application. I'm surprised that I wasn't aware of that, considering that it has been there for at least 15 years. I ran some of the examples, and they ran out of the box, just so: "clone - compile - start". (Imagine that was written in a different programming language than Java - reviving that could be a really daunting task...)

:-))))) Welcome to da club then! In its time, considering rendering time/quality, there was nothing even slightly comparable with Sunflow (freeware-wise, of course), well, at least in virtual LEGO builds.

I also did a quick test with the jVisualVM Profiler (instead of the Sampler), and it may allow zooming in more closely on the hotspots (but at a tremendous overhead, of course). Where exactly most of the time is spent depends on ... well, the actual scene, because there are all these specialized classes for Image Based Lighting or different Shaders and all that.

Actually, though I am not the original author, I would dare to say TRUST ME when I am telling you that what you see in jVisualVM is basically standard. That is, in all sorts of possible renders, those classes that seem to be most accessed would also be the most accessed no matter what scene (in fact MODEL, not the scene itself) one uses. Or - to give myself some sort of alibi in case I am wrong, haha - at least in the virtual LEGO world.

Is it possible to share one of the scenes (.sc files) that you are testing with, or are they incompatible with the master state of Sunflow?

As I already told you, MY APPLICATION DEVIATED A LOT FROM THE ORIGINAL SUNFLOW, in the sense that I edited a lot of its classes and also added tons of my own that interact heavily with the original code + created a totally new, unique, minimalistic GUI. Sooooo, I would have to send you my src code, and for that you would need to give me your e-mail. Or, in case you are really interested in this thing a lot, we can communicate privately on, hmm, let's say, IG (cos that is the only socnet I am using)? Cos I further extended the .sc format (now it is called .bsfm, the native file for my BUSUFL - basically the same kind of structured text file), thus, for it to make any sense for us both, we would need to deal directly with the BUSUFL files, NOT the original Sunflow (it is up to you now, of course, I do not want to bother you) - for instance, with the original code you would get completely BLACK GLASSES; it was me who actually repaired the code + did tons of other fine-tuning. I also mixed some features from 0.73 into 0.72, which was used originally, like transparent PNG for the output image (I needed to do a bit of hacking in the code so that I would not need to implement the whole new 0.73 code, which would mean a complete overhaul of the code, buh) + added zillions of my own new features not present in Sunflow at all.
BTW the complete actual scene is always built virtually at runtime from the LEGO LDD .lxf file (the actual virtual LEGO model) and then sent to the core engine - the bsfm file (ex .sc file) contains just stuff about the camera, light types, image dimensions and such.


jcuda commented on June 26, 2024

You mentioned that you changed a lot in Sunflow, but it was not clear whether this applied ("only" or "mainly") to the rendering core, or the UI, or also to the file format, or ... to all of that (and apparently, it's "all of that" ;-))

One way of sharing your project could be to put it into a private GitHub repository: When you create a repository here on GitHub, you can explicitly set it to be "private" on creation, and then, nobody can view it, except for people who you explicitly invite as "collaborators".

But if this is a project that you can pack into a ZIP file so that I can locally unpack (and compile+start it, maybe with some simple example scene that I can run the profiler on), then you can also send it to jcuda AT jcuda.org.


fafayqa commented on June 26, 2024

You mentioned that you changed a lot in Sunflow, but it was not clear whether this applied ("only" or "mainly") to the rendering core, or the UI, or also to the file format, or ... to all of that (and apparently, it's "all of that" ;-))

Your last assumption is quite correct: it is all of that + even some more (as I also packed my app with other separate app-like "extensions" that have nothing to do with the renderer itself, like an app for creating YOUR OWN LDD bricks, normally unthinkable). :-)

One way of sharing your project could be to put it into a private GitHub repository: When you create a repository here on GitHub, you can explicitly set it to be "private" on creation, and then, nobody can view it, except for people who you explicitly invite as "collaborators".

Nah, I would never ever just let my files go on any kind of webpage, be it private GitHub or anything: I am an ex-webdesigner, so I know a thing or two about this stuff (never ever willingly, even more so when it is not a public app, you know).

But if this is a project that you can pack into a ZIP file so that I can locally unpack (and compile+start it, maybe with some simple example scene that I can run the profiler on), then you can also send it to jcuda AT jcuda.org.

Agreed - this will be the case for us. BTW I will have to explain a thing or two about what to do before you can use it, if you are willing to follow? :-)

BTW it is a NetBeans project, so it would be easiest/most ideal if you also have NetBeans installed.

Oh, and in case you could get yourself Windows 7 x64 SP1, that would be absolutely ideal (as I heard there are some strange problems with Java under Win10 - the worst OS Microsoft ever produced)... like, I still run everything on Win7 x64, even my recording studio (the most stable and tweakable OS MS has ever made).

Also, I hope you don't mind deleting this complete thread as soon as I send the files to you by e-mail, hm?

UPDATE

I just found one very likely JCuda candidate for "parallelization" in the code for you: it is the method diffuse() in the ShadingState.java class, which is accessed quite a lot during the render. In there, there is a for() loop that runs for something like 6+ iterations for a very simple model of a few bricks, but for tens of iterations (like 40, 90 etc.) for a bigger one (thus I see quite a time-saving potential here), and it basically multiplies and adds color values in each iteration, see:

public final Color diffuse(Color diff, BusuflGUI GUI) {
	// integrate a diffuse function
	Color lr = Color.black();
	if (diff.isBlack()) {
		return lr;
	}
	for (LightSample sample : this) {
		lr.madd(sample.dot(n), sample.getDiffuseRadiance());
	}
	lr.add(getIrradiance(diff, GUI));
	return lr.mul(diff).mul(1.0f / (float) Math.PI);
}

With Color.madd() expanded, that for() loop would be:

for (LightSample sample : this) {
    // lr is Color
    // sample.dot(n) is float - this calls shadowRay.dot(v) method, which translates to: dx * v.x + dy * v.y + dz * v.z (d* = floats, v = Vector3 ---> *.x/*.y/*.z = floats)
    // sample.getDiffuseRadiance() is Color - this just returns already generated value, that is no additional computing needed, it is simple getter
    lr.r += (sample.dot(n) * sample.getDiffuseRadiance().r);
    lr.g += (sample.dot(n) * sample.getDiffuseRadiance().g);
    lr.b += (sample.dot(n) * sample.getDiffuseRadiance().b);
}

And I guess lr.add() and lr.mul() could also be included, as they too compute the color by separately manipulating the r/g/b values (multiplying, adding, dividing etc. - that kind of stuff) - so basically (possibly) this whole method could be converted to JCuda GPU processing, right?
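As a rough sketch of what that loop might look like once the samples are flattened into plain float arrays (which is the form any GPU code would need): the array layout, the class name and the method name are assumptions for this illustration, not part of Sunflow or BUSUFL.

```java
// Hypothetical flattened version of the accumulation loop in diffuse().
final class DiffuseFlatSketch {

    /**
     * Accumulates the weighted diffuse radiance of all light samples.
     *
     * @param cosTheta one value per sample, assumed to hold the
     *                 precomputed result of sample.dot(n)
     * @param radiance three values (r, g, b) per sample, assumed to hold
     *                 the result of sample.getDiffuseRadiance()
     * @return the accumulated color as {r, g, b}
     */
    public static float[] accumulate(float[] cosTheta, float[] radiance) {
        float r = 0.0f;
        float g = 0.0f;
        float b = 0.0f;
        for (int i = 0; i < cosTheta.length; i++) {
            // Same multiply-add as lr.madd(...), but on plain floats
            float w = cosTheta[i];
            r += w * radiance[3 * i];
            g += w * radiance[3 * i + 1];
            b += w * radiance[3 * i + 2];
        }
        return new float[] { r, g, b };
    }
}
```

With only a few dozen iterations per call, a single loop like this is probably too small to be worth sending to the GPU on its own; it would likely only pay off if the samples of many pixels could be collected into one large batch.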


jcuda commented on June 26, 2024

On the one hand, I understand the hesitation to put things on a "website" like GitHub. But honestly: Multi-Billion dollar companies are storing their most valuable assets on GitHub. Of course, there could always be a security breach. But using that as a reason to not put stuff on GitHub, and at the same time, sending the same stuff to some random stranger does not make sooo much sense. (I won't share it. Unless someone breaks into my home and carries away my 30-pound-desktop PC, this will stay here, forever).

I don't exactly know what the Windows7/Windows10 issues referred to. From my experience, Java programs always just work (with the exception of native libraries, which always cause an UnsatisfiedLinkError).

Also, I don't intend to delete this thread. (I considered moving that elsewhere, because it's unrelated to the original issue, but I don't see a point in deleting it. It is already stored and mirrored on other sites anyhow...)

But if you send me the project, I'll have a closer look (but of course, I can not make any strong commitments or promises for now...)


fafayqa commented on June 26, 2024

On the one hand, I understand the hesitation to put things on a "website" like GitHub. But honestly: Multi-Billion dollar companies are storing their most valuable assets on GitHub. Of course, there could always be a security breach. But using that as a reason to not put stuff on GitHub, and at the same time, sending the same stuff to some random stranger does not make sooo much sense. (I won't share it. Unless someone breaks into my home and carries away my 30-pound-desktop PC, this will stay here, forever).

Not true: you are not some stranger to me, in the sense that you are the author of JCuda, so you are not that anonymous someone, but I get your point. Still, I stand behind the reasoning that the chance of my code leaking from your PC, be it because you get hacked or because you are not sincere and post it somewhere, is for me still a zillion times less probable than something going wrong and my code being exposed via GitHub - the principle is quite clear here... well, at least for me. :-) + It is not that I think my code is worth billions (not at all). The point is that you, as a person having my files on your personal PC, are basically an anonymous target as far as any leakage of my code goes, unless, as I said, you decide to leak it yourself, which I do not believe will happen as I am absolutely no one to you. I also think the specifics of my app are such that no one outside the LEGO community would care anyway (though one never knows), and they have better - FROM THEIR POINT OF VIEW - options these days, if one is not too demanding (which I am, and therefore I made my app for myself). Hope this clears it up for you a bit.

I don't exactly know what the Windows7/Windows10 issues referred to. From my experience, Java programs always just work (with the exception of native libraries, which always cause an UnsatisfiedLinkError).

I do not know the specifics of the problem either; I am just reporting what I read over at the LEGO forums when I was active there...

Also, I don't intend to delete this thread. (I considered moving that elsewhere, because it's unrelated to the original issue, but I don't see a point in deleting it. It is already stored and mirrored on other sites anyhow...)

No problem with that: my point is that it looks like flooding, as we are now talking way off the initial question.

But if you send me the project, I'll have a closer look (but of course, I can not make any strong commitments or promises for now...)

That is absolutely OK, all I am asking for is what I already said: just keep the code for yourself and that's it. :-)
