Comments (4)

Robadob commented on August 29, 2024

I have just found the CU_JIT_GLOBAL_SYMBOL enums, which look like they might be a less dynamic way of achieving this.


I've modified jitify::experimental::Kernel so that I can pass the arguments nopts, opts, optsvals to the constructor of jitify::detail::CUDAKernel.

On passing the 3 CU_JIT_GLOBAL_SYMBOL options, the call to cuModuleLoadDataEx() within the constructor throws CUDA_ERROR_NOT_SUPPORTED (I've added checks to ensure it's not coming from a previous call).

This is not documented as a possible error from that method, so I'm not sure if their lack of support is undocumented or I've done something incorrectly.

I've tested this using Windows, Visual Studio 2015 & 2017, CUDA 10.2, Titan X Pascal (Building for sm_61).

My changes to the above example, and the modified Jitify header, can be found here. Potentially of note: I forked the header from the hotfix/more-msvc-issues branch.

benbarsdell commented on August 29, 2024

Unfortunately it is not possible for runtime code to gain access to symbols in the host program.

Jitify does support two related things that may be useful:

  1. Linking external device libraries using -Lpath and -lname options. We could potentially extend this to support linking with ptx/cubin/object files as well if that would be useful.
  2. Accessing __device__ and __constant__ variables declared inside the runtime code via kernel_instantiation.get_constant_ptr() (we will be extending this to support accessing __device__ variables as well soon).

I'm not very familiar with the CU_JIT_GLOBAL_SYMBOL flags, but it looks like they're used to make extern symbols in the runtime code use host addresses instead of looking for device symbols during linking. Is there a reason this is useful for your application vs. say using non-extern symbols in your runtime code and accessing them via get_constant_ptr?

Robadob commented on August 29, 2024

I'm working with @mondus.

We're building a new framework for complex system simulations that abstracts away most of the CUDA (FLAMEGPU2, to improve on the usability of FLAMEGPU). He's also interested in developing a runtime-compiled Python interface to the library, to make it more accessible, which is where this comes in. (I'm not sure if he told you this much the other day.)

Accessing __device__ and __constant__ variables ...

Several commonly reused items/pointers are stored behind __device__ and __constant__ variables, hence automatic linking with the executing runtime would be convenient. There are several workarounds, such as reducing the number of symbols (by packing items inside a struct) and then manually setting the symbols inside the RTC units as each is compiled (using get_constant_ptr or similar; I hadn't noticed this, but it will require some changes to reduce the number of items we need to set).
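To make the struct-packing workaround concrete, here is a minimal host-only sketch. The struct, its field names, and both helper functions are hypothetical, and a plain byte buffer stands in for the device symbol; in real code the destination would be the address obtained via get_constant_ptr (or similar) and the copy would be a CUDA memcpy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical bundle of values that would otherwise each live behind their
// own __device__/__constant__ symbol. Field names are illustrative only.
struct RuntimeSymbols {
    const void*  hash_table;       // stand-in for a device pointer
    unsigned int hash_table_size;
    unsigned int agent_count;
    float        time_step;
};

// One raw copy can only set the whole bundle if the struct is trivially copyable.
static_assert(std::is_trivially_copyable<RuntimeSymbols>::value,
              "RuntimeSymbols must be trivially copyable");

// Host-only stand-in for setting the single packed symbol of one RTC unit.
// ASSUMPTION: symbol_storage plays the role of the symbol's device memory.
inline void set_packed_symbol(std::vector<unsigned char>& symbol_storage,
                              const RuntimeSymbols& values) {
    assert(symbol_storage.size() >= sizeof(RuntimeSymbols));
    std::memcpy(symbol_storage.data(), &values, sizeof(RuntimeSymbols));
}

// Reads the bundle back, to check that a single copy carried every field.
inline RuntimeSymbols read_packed_symbol(const std::vector<unsigned char>& symbol_storage) {
    assert(symbol_storage.size() >= sizeof(RuntimeSymbols));
    RuntimeSymbols out;
    std::memcpy(&out, symbol_storage.data(), sizeof(RuntimeSymbols));
    return out;
}
```

With this layout each compiled RTC unit needs one symbol lookup and one copy, rather than one per variable.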

One potential downside of this would be that duplicating symbols would inflate constant cache usage, as we're using a large portion of constant cache for a read-only hashtable central to accessing variables within models. For potentially large models with many RTC device functions, this could lead to capacity issues.

Edit: Given we'd be duplicating part/all of that hash table, this could actually fill up constant cache very fast. (The hash table is currently given half the constant cache, simply because it can't be scaled at runtime. We'd presumably need to do some parsing of the RTC code to adapt the size per kernel. This obviously requires some more thought on our part.)

Edit2: I suppose we could actually not have the hash table in the main runtime (for a special Python/RTC build), and cat all of the RTC code so it builds as a single unit so we're not duplicating constant memory!

But that's a long way off at this stage, so I don't think it's a killer problem.
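The single-unit idea from Edit2 could be sketched roughly as follows. This is plain string handling only, and the function names and the way sources are passed in are assumptions, not part of Jitify: the shared constant declarations are emitted once, followed by every user-supplied device function, so the resulting source declares the hash table a single time.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds one RTC translation unit from the shared
// declarations (emitted once) followed by each device function source.
inline std::string build_single_rtc_unit(const std::string& shared_decls,
                                         const std::vector<std::string>& device_functions) {
    std::string unit = shared_decls;
    if (!unit.empty() && unit.back() != '\n') unit += '\n';
    for (const std::string& fn : device_functions) {
        unit += fn;
        // Keep each function on its own line(s) so the sources cannot run together.
        if (fn.empty() || fn.back() != '\n') unit += '\n';
    }
    return unit;
}

// Counts non-overlapping occurrences of needle in text; used here only to
// check that the shared declarations were not duplicated.
inline std::size_t count_occurrences(const std::string& text, const std::string& needle) {
    if (needle.empty()) return 0;
    std::size_t count = 0;
    for (std::size_t pos = text.find(needle); pos != std::string::npos;
         pos = text.find(needle, pos + needle.size())) {
        ++count;
    }
    return count;
}
```

The combined string would then be handed to the runtime compiler as one program, instead of compiling each function as its own unit with its own copy of the constants.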

Linking external device libraries using -Lpath and -lname options. ...

At this point I'm not sure how viable it would be to load the entire library from PTX at runtime, as from my understanding that would require changing large parts of the codebase that interact with CUDA.

I'm assuming, if we built the library a second time to ptx/cubin, and linked against that, it wouldn't also link against the runtime. So your second suggestion is likely the way to go.

Edit 3: I suppose the more technically adept approach would be to combine your suggestions, building a user's RTC kernels separately and linking them to achieve something similar to Edit2 above.

Thanks for the advice.

Robadob commented on August 29, 2024

@benbarsdell
P.S. Could you please add #include <cctype> to jitify.hpp? It doesn't build without this under Visual Studio 2015, due to isdigit and isalpha (removing std:: from them also works, but that's probably a less portable solution).

This seems too minor an issue for its own ticket (especially given that it's a deprecated IDE).
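For illustration, here is a minimal sketch of the kind of code affected. The helper below is hypothetical and is not taken from jitify.hpp; it only shows that std::isdigit and std::isalpha are declared by <cctype>, so the header has to be included explicitly rather than relying on another standard header pulling it in.

```cpp
#include <cctype>   // declares std::isalpha and std::isdigit
#include <string>

// Hypothetical helper: returns true if s looks like a C identifier
// (a letter or underscore, then letters, digits or underscores).
inline bool is_identifier(const std::string& s) {
    if (s.empty()) return false;
    // Cast to unsigned char first: passing a negative char value to the
    // <cctype> classification functions is undefined behaviour.
    const unsigned char first = static_cast<unsigned char>(s[0]);
    if (!(std::isalpha(first) || first == '_')) return false;
    for (char c : s) {
        const unsigned char uc = static_cast<unsigned char>(c);
        if (!(std::isalpha(uc) || std::isdigit(uc) || uc == '_')) return false;
    }
    return true;
}
```

Keeping the std:: qualification and adding the include is the portable choice, since the unqualified global names are only guaranteed when <ctype.h> is included.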
