
Comments (7)

benbarsdell avatar benbarsdell commented on August 29, 2024

No, currently they are not shared. Each kernel instantiation has its own cuModule, so the addresses will be different (I confirmed this with a test).

This is arguably a design flaw in the Jitify API, and I'd been wondering if/when it would become a problem. I'd be interested to know how important it is for your application.

A (hypothetical) new Jitify API that better matched the underlying CUDA APIs would allow (/require) you to provide multiple name expressions for a single program (e.g., template instantiations of multiple kernels, globals etc.), then compile it once to a single module and extract all of the kernels and global addresses. This is doable, but would take a bit of refactoring and would be a slightly less intuitive API for common use-cases. Let us know if you think something like this would be of value.
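To make the address problem concrete, here is a minimal toy model in plain C++. It makes no CUDA or Jitify calls, and every name in it (ToyModule, ToyKernel, instantiate_with_own_module, kernel_from_shared_module, share_global) is hypothetical. It only assumes what is described above: a module owns its own copy of each global, and a kernel sees the globals of the module it was loaded from.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Toy stand-in for a loaded CUDA module: it owns one private copy of
// every global symbol it defines. No real CUDA calls are made.
struct ToyModule {
    std::map<std::string, std::unique_ptr<int>> globals;

    explicit ToyModule(const std::string& global_name) {
        globals[global_name] = std::make_unique<int>(0);
    }

    // Rough analogue of looking up a global's address within one module.
    int* global_address(const std::string& name) const {
        return globals.at(name).get();
    }
};

// Toy kernel handle: it only remembers which module it was loaded from.
struct ToyKernel {
    std::shared_ptr<ToyModule> module;
};

// Current behaviour as described above: each instantiation gets a new module.
ToyKernel instantiate_with_own_module(const std::string& global_name) {
    return ToyKernel{std::make_shared<ToyModule>(global_name)};
}

// Hypothetical alternative: kernels are extracted from one shared module.
ToyKernel kernel_from_shared_module(std::shared_ptr<ToyModule> module) {
    return ToyKernel{std::move(module)};
}

// True when both kernels would see the same copy of the named global.
bool share_global(const ToyKernel& a, const ToyKernel& b,
                  const std::string& name) {
    return a.module->global_address(name) == b.module->global_address(name);
}
```

In this model, two kernels created with instantiate_with_own_module never share a global, while two kernels created from the same ToyModule always do. That is the behaviour difference a single-module API would be expected to give.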

from jitify.

mondus avatar mondus commented on August 29, 2024

Thanks for the reply @benbarsdell. This is certainly an issue for us, particularly when it comes to constant memory. We have a number of large constant and statically sized device symbols which we can compile within the same unit but which need to be accessed by separate kernels in the same compilation unit. Your suggestion would be very helpful for our use case but also for any use case where there are multiple kernels in the same compilation unit. Would it not be possible to simply change the internals so that the cuModule was created by the program and shared with each kernel object?

We can work around the device symbols, but I can't see a clear way to work around our use of constant memory. I am also unclear whether the constant memory limitations are per module, per context, or per device.


maddyscientist avatar maddyscientist commented on August 29, 2024

For the constants, could this be a good use of Jitify's newfound linking ability: declare the __constant__ in the offline source code, e.g., in a .cu file, then JIT compile the kernel and link against that object file?


mondus avatar mondus commented on August 29, 2024

@maddyscientist Yes, this might work, so long as you can link multiple kernels against the same module (containing the constant definition). Presumably this is fine as they are in the same context?


benbarsdell avatar benbarsdell commented on August 29, 2024

I think linking will have the same issue because there will still be multiple modules, unless I'm misunderstanding.

> Would it not be possible to simply change the internals so that the cuModule was created by the program and shared with each kernel object?

The problem is that we currently have:

program.kernel(name).instantiate(template args).launch(...)

but what we would need is (roughly speaking):

program.instantiate(list of name expressions).kernel(name expression).launch(...).

In particular, the call to instantiate() is when the program gets compiled. Changing that means changing the fundamental flow of the API. This is doable, but not a small change.
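The reordered flow can be sketched in plain C++ as follows. This is not the Jitify API: SketchProgram, SketchInstantiation, SketchKernel and SketchModule are hypothetical names, compilation is reduced to a counter, and launch(...) is left out. The sketch only illustrates the point above, that instantiate() is the single compile step and that every kernel is then looked up in the one resulting module.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one compiled module.
struct SketchModule {
    std::set<std::string> name_expressions;  // everything compiled into it
};

// Hypothetical kernel handle that keeps its parent module alive.
class SketchKernel {
public:
    SketchKernel(std::shared_ptr<const SketchModule> module, std::string name)
        : module_(std::move(module)), name_(std::move(name)) {}
    const SketchModule* module() const { return module_.get(); }
    const std::string& name() const { return name_; }
private:
    std::shared_ptr<const SketchModule> module_;
    std::string name_;
};

class SketchInstantiation {
public:
    explicit SketchInstantiation(std::shared_ptr<const SketchModule> module)
        : module_(std::move(module)) {}

    // Looks a kernel up in the already-compiled module; nothing is compiled here.
    SketchKernel kernel(const std::string& name_expression) const {
        if (module_->name_expressions.count(name_expression) == 0) {
            throw std::out_of_range("name expression was not instantiated: " +
                                    name_expression);
        }
        return SketchKernel(module_, name_expression);
    }
private:
    std::shared_ptr<const SketchModule> module_;
};

class SketchProgram {
public:
    // The single (simulated) compile step: all name expressions go into one module.
    SketchInstantiation instantiate(const std::vector<std::string>& name_expressions) {
        ++compile_count_;
        auto module = std::make_shared<SketchModule>();
        module->name_expressions.insert(name_expressions.begin(),
                                        name_expressions.end());
        return SketchInstantiation(std::move(module));
    }
    int compile_count() const { return compile_count_; }
private:
    int compile_count_ = 0;
};
```

With this shape, program.instantiate({"my_kernel<int>", "my_kernel<float>"}) compiles once, and both kernel("my_kernel<int>") and kernel("my_kernel<float>") refer to the same module. Asking for a name expression that was not in the list fails, which is the "slightly less intuitive" cost mentioned earlier: the caller has to know all instantiations up front.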


mondus avatar mondus commented on August 29, 2024

@benbarsdell Yes, I imagine you are right: after linking there would be multiple modules with duplicate definitions of the constant, so setting the constant value would have to be done for each instantiation. I see now how this would be a significant change (but one which I would very much support!). Could you support both options? E.g.

program.instantiate_program(list of name expressions).instanciated_kernel(name expression).launch(...)

Presumably this would then support things like:

program.get_global_ptr(...)

Which would solve all of my problems...

What I am still unclear on is how constant memory is allocated on the device. The following SO question points to the ISA docs, which state: "There is an additional 640 KB of constant memory, organized as ten independent 64 KB regions. The driver may allocate and initialize constant buffers in these regions and pass pointers to the buffers as kernel function parameters." Does this mean I could have a maximum of 10 Jitify kernels/modules, each using 64 KB of constant space, or could I have any number, with some driver magic taking care of mapping these to regions at kernel launch?


mondus avatar mondus commented on August 29, 2024

@benbarsdell We have a workaround for this for now, but it would be a nice feature to be able to instantiate multiple kernels from the same module.

