Comments (10)

httpdigest commented on August 21, 2024

Thanks for proposing Yeppp! I did not know about it yet.
I will benchmark it, but I have doubts about how useful a binding for a non-native language (such as Java) can be, and therefore about its usefulness for JOML.
Yeppp! itself will surely be great for native applications, no doubt about that! :)
But I had a look at Yeppp!'s API, and it exposes very low-level SIMD instructions as JNI methods.
This is likely to be orders of magnitude slower than even the corresponding Java scalar code, because of the tremendous JNI overhead.
Scalar floating-point arithmetic in Java is compiled to x86 SSE scalar instructions and does not suffer from any method-call or JNI overhead.
Yeppp!, by contrast, pays for the JNI function call itself and also for its representation of SIMD vector types as float[] arrays, which is bound to require the GetFloatArrayElements/ReleaseFloatArrayElements JNI functions to access those arrays reliably, so it must call back into the JVM for every single SIMD instruction.
Even the cost of a single native function invocation will outweigh the gains of a SIMD instruction by orders of magnitude.
But I will test it!
What I wish for, though, is a class provided by the Java Runtime Environment whose methods the JVM knows to convert into SSE intrinsics when JIT-compiling JVM bytecode into x86. :)
But again, thanks for mentioning it. I will test it.

from joml.

httpdigest commented on August 21, 2024

Okay, here are the numbers:
Yeppp!'s Java binding is about 104 times slower than the corresponding scalar Java code working on float[] arrays, and about 110 times slower than JOML using Vector4f.add invocations working on the primitive float x, y, z, w fields!
The functions that Yeppp! provides are just too primitive: each call does too little work to pay for its own JNI overhead.
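For readers who want to reproduce the two scalar baselines mentioned above, here is a minimal sketch. It is not the benchmark that produced the numbers in this comment. `Vec4` is a hypothetical stand-in for a vector class with primitive float fields (it is not JOML's `Vector4f`), the Yeppp! call is deliberately left out, and the `System.nanoTime` timing is crude; a harness such as JMH would give more trustworthy figures.

```java
class ScalarAddBaseline {
    /** Hypothetical stand-in for a vector class with primitive float fields. */
    static final class Vec4 {
        float x, y, z, w;

        Vec4(float x, float y, float z, float w) {
            this.x = x; this.y = y; this.z = z; this.w = w;
        }

        /** Adds o to this vector in place and returns this. */
        Vec4 add(Vec4 o) {
            x += o.x; y += o.y; z += o.z; w += o.w;
            return this;
        }
    }

    /** Scalar add over 4-element float[] arrays; dest may alias a or b. */
    static void addArrays(float[] a, float[] b, float[] dest) {
        for (int i = 0; i < 4; i++) {
            dest[i] = a[i] + b[i];
        }
    }

    public static void main(String[] args) {
        int n = 50_000_000;
        float[] b = {0.5f, 0.5f, 0.5f, 0.5f};
        float[] d = {1f, 2f, 3f, 4f};
        Vec4 va = new Vec4(1f, 2f, 3f, 4f);
        Vec4 vb = new Vec4(0.5f, 0.5f, 0.5f, 0.5f);

        // Warm-up so the JIT compiles both code paths before timing.
        for (int i = 0; i < 1_000_000; i++) {
            addArrays(d, b, d);
            va.add(vb);
        }

        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) addArrays(d, b, d);
        long t1 = System.nanoTime();
        for (int i = 0; i < n; i++) va.add(vb);
        long t2 = System.nanoTime();

        // Printing the results keeps the JIT from discarding the loops as dead code.
        System.out.printf("float[] add: %d ms (d[0]=%f)%n", (t1 - t0) / 1_000_000, d[0]);
        System.out.printf("field add:   %d ms (x=%f)%n", (t2 - t1) / 1_000_000, va.x);
    }
}
```

A JNI-based candidate could be timed with a third loop of the same shape, which makes the per-call overhead directly visible.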

httpdigest commented on August 21, 2024

Just one thought I had recently:
What "could" be done, however, to increase the "arithmetic density" and thereby hide the latency of JNI, is to use Matrix4f as a kind of "builder". Matrix4f method invocations could be lazily "journaled" internally for as long as no intermediate result of the matrix operations is needed. Once a "terminal operation" such as Matrix4f.get(FloatBuffer) is invoked, a small support library would JIT-generate SSE code for the whole journal.
A "chain" of Matrix4f method invocations like perspective().lookAt().rotate().mul(matrix).invert().get(fb) would then be translated to a single JIT'ted native function.
The only remaining problem is: how do we make the actual arguments of the various Matrix4f methods accessible to JNI, and how do we access the matrix elements?
Accessing the 16 float fields directly from native code would likely be very slow. One possibility is to accumulate the float arguments in a direct ByteBuffer and then JIT code that reads them at the correct offsets into that ByteBuffer.
That "could" at some break-even point be faster than Java code, if enough arithmetic is involved.
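The journaling idea can be sketched roughly as follows. This is only an illustration under stated assumptions: `JournaledMatrix` and its opcodes are hypothetical names, only `translate` and `scale` are journaled, and the replay inside `get` runs in plain Java where the proposal would call a single generated native function. What it does show is the lazy recording of operations and the accumulation of arguments at fixed offsets in a direct buffer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

class JournaledMatrix {
    private static final int OP_TRANSLATE = 0;
    private static final int OP_SCALE = 1;
    private static final int MAX_OPS = 64;

    // Journal of opcodes, in invocation order.
    private final int[] ops = new int[MAX_OPS];
    private int opCount;

    // Arguments live in a direct buffer so native code could read them by offset.
    private final FloatBuffer args = ByteBuffer
            .allocateDirect(MAX_OPS * 3 * Float.BYTES)
            .order(ByteOrder.nativeOrder())
            .asFloatBuffer();

    JournaledMatrix translate(float x, float y, float z) {
        return record(OP_TRANSLATE, x, y, z);
    }

    JournaledMatrix scale(float x, float y, float z) {
        return record(OP_SCALE, x, y, z);
    }

    // Nothing is computed here; the call is only written to the journal.
    private JournaledMatrix record(int op, float x, float y, float z) {
        if (opCount == MAX_OPS) {
            throw new IllegalStateException("journal full");
        }
        ops[opCount++] = op;
        args.put(x).put(y).put(z);
        return this;
    }

    /**
     * Terminal operation: replays the journal starting from the identity and
     * writes 16 column-major floats into dest at its current position.
     */
    JournaledMatrix get(FloatBuffer dest) {
        float[] m = {1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1};
        for (int i = 0; i < opCount; i++) {
            float x = args.get(3 * i);
            float y = args.get(3 * i + 1);
            float z = args.get(3 * i + 2);
            if (ops[i] == OP_TRANSLATE) {
                // M = M * T: only the last column changes.
                for (int r = 0; r < 4; r++) {
                    m[12 + r] += m[r] * x + m[4 + r] * y + m[8 + r] * z;
                }
            } else {
                // M = M * S: the first three columns are scaled.
                for (int r = 0; r < 4; r++) {
                    m[r] *= x;
                    m[4 + r] *= y;
                    m[8 + r] *= z;
                }
            }
        }
        for (int i = 0; i < 16; i++) {
            dest.put(dest.position() + i, m[i]);
        }
        return this;
    }
}
```

With this shape, a chain such as `new JournaledMatrix().translate(1, 2, 3).scale(2, 2, 2).get(fb)` would cross into native code once, at `get`, instead of once per method.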

eix128 commented on August 21, 2024

So to utilize Yeppp!, we need to send bulk matrix data rather than a single 4x4 matrix, right?

eix128 commented on August 21, 2024

What about 1024x1024 matrix operations? Is Yeppp! also bad for those?

httpdigest commented on August 21, 2024

Have a look at #30. There I outlined what needs to be done to at least make use of native SSE code in JOML.

eix128 commented on August 21, 2024

The library could run a benchmark automatically at runtime on the user's PC and choose which method to use: JNI or JIT.

    public static void initMatrixStage(int[] columnPreferred, int[] rowPreferred,
                                       boolean bulkSendOperation, int benchmarkTimeOutInMonths) {
        if (columnPreferred.length != rowPreferred.length) {
            throw new IllegalArgumentException("matrix sizes invalid.");
        }
        int benchmarkSize = columnPreferred.length;
        // Visit every (column, row) size pair, from the last index down to 0.
        for (int j = benchmarkSize - 1; j >= 0; j--) {
            // benchmark the JIT (plain Java) solution and record its time consumption
            // benchmark the Yeppp! JNI solution
            // benchmark the AMD Aparapi solution
            // benchmark the ArrayFire Java API solution
            // choose the best solution for this column/row size and save it to disk at the init stage
        }
    }
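The selection step that the comments describe could look roughly like the sketch below. `StrategySelector` and `pickFastest` are hypothetical names, each candidate is assumed to be wrapped in a `Runnable`, and saving the choice to disk is left out. The timing is a simple wall-clock measurement after a warm-up, so the result is only indicative.

```java
import java.util.Map;

class StrategySelector {
    /**
     * Runs every candidate for the given number of warm-up and timed
     * iterations and returns the name of the one with the lowest total time.
     */
    static String pickFastest(Map<String, Runnable> candidates, int warmup, int runs) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no candidates");
        }
        String best = null;
        long bestNanos = Long.MAX_VALUE;
        for (Map.Entry<String, Runnable> e : candidates.entrySet()) {
            Runnable task = e.getValue();
            // Warm-up gives the JIT a chance to compile the candidate first.
            for (int i = 0; i < warmup; i++) {
                task.run();
            }
            long start = System.nanoTime();
            for (int i = 0; i < runs; i++) {
                task.run();
            }
            long elapsed = System.nanoTime() - start;
            if (elapsed < bestNanos) {
                bestNanos = elapsed;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

A caller would register one `Runnable` per backend (plain Java, JNI, and so on) for each matrix size and store the returned name.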

eix128 commented on August 21, 2024

Do you know AMD Aparapi?
http://developer.amd.com/tools-and-sdks/opencl-zone/aparapi/

 avatar commented on August 21, 2024

@jduke32 Using a Java binding for the OpenCL API like JogAmp JOCL would give a lot more flexibility than Aparapi.

httpdigest commented on August 21, 2024

Closed because Yeppp! is not relevant for JOML.
