Git Product home page Git Product logo

Comments (12)

benbarsdell avatar benbarsdell commented on August 29, 2024

Thanks for the feedback!

I tried to reproduce the problem but ran into a different issue related to missing headers (which I'll submit a patch for).

Could you tell me what CUDA version you're using and provide a minimal reproducer?
*EDIT: Could you also provide what platform you're running on.

from jitify.

mondus avatar mondus commented on August 29, 2024

Hi @benbarsdell. Thanks for getting back to me. I saw your talk with Kate at GTC when you first introduced jitify and knew I would need it at some point!

I am using CUDA 10.1 on windows using dlfcn-win32 (see #42). A minimal example is to add the line

"#include <curand_kernel.h>\n"

at line 159

Note: This actually works on linux no problem. To reproduce on windows requires building with dlfcn-win32, setting a pre-porcessor macro of NOMINMAX , updating the hard coded include directory and pre-processing (with stringify) the my_header3.cuh file to generate to stringified version. I also had to create a disk version of my_header4.cuh as the callback function didn't seem to work.

I have noticed that one of the later tests also fail on windows. E.g. line 266 fails. Looking at the get_constant_ptr and demangle functions this is because it is expecting the name mangling to be different in visual studio. My windows ptx has the same mangling as in linux however. E.g.

.const .align 4 .u32 a;
.const .align 4 .u32 _ZN1b1aE;
.const .align 4 .u32 _ZN1c1b1aE;

Hence the lookup fails.

from jitify.

maddyscientist avatar maddyscientist commented on August 29, 2024

That's interesting that you are not seeing the expected name mangling on Windows. Which compiler are you using? I guess we need to fix up the demangle detection for this.

from jitify.

mondus avatar mondus commented on August 29, 2024

Hi @maddyscientist it is Visual Studio 2019 with CUDA 10.1. I noticed that the demangle detection is easy to fix by changing a macro guard but the actual demangling would require cxxabi which is not so easy to come by on windows...

from jitify.

benbarsdell avatar benbarsdell commented on August 29, 2024

I was able to reproduce your issues in Visual Studio 2019 with CUDA 10.2, and I have fixes for them. I'll put them into a PR probably tomorrow. I'll also add workarounds for the NOMINMAX and dlfcn problems to avoid those annoyances.

The size_t issue is because jitify provides a definition of it in a built-in header, but NVRTC already provides its own built-in definition, and on Windows these definitions conflict. The fix is to remove these lines:

jitify/jitify.hpp

Lines 1578 to 1579 in 8af928e

"typedef unsigned long size_t;\n"
"typedef signed long ptrdiff_t;\n"

The name mangling issue is because CUDA (PTX) always uses the Itanium mangling scheme even when compiling with Visual Studio. I was able to implement a simple demangler for variable names that avoids needing cxxabi.

With these fixes, all the tests in jitify_example.cpp pass. I didn't see anything related to the callbacks; maybe it was some kind of current-working-directory problem?

from jitify.

mondus avatar mondus commented on August 29, 2024

@benbarsdell Fantastic. Looking forward to looking at your Itanium demangler code! I will take another look at the callbacks issue. Probably user error.

from jitify.

benbarsdell avatar benbarsdell commented on August 29, 2024

The fixes are available in #45. Let me know if anything doesn't work for you.

from jitify.

benbarsdell avatar benbarsdell commented on August 29, 2024

@mondus @Robadob
I added the #include <cctype> fix to #45. Are you OK with it being merged?

from jitify.

Robadob avatar Robadob commented on August 29, 2024

Building with the current status of #45 and VS2015, it works out the box for me. These are the remaining compile warnings I get (I think they're significantly reduce from what Mondus showed me the other week, though I haven't tried it in VS2020).

I haven't tested with Curand, I don't think Mondus shared that code with me. I've sent him an email and will try and chase up if he's in the office today.

Might be worth addressing them. Though, if it blocks our CI I can probably just wrap the include to reduce warning level.

C:\Users\Robadob\Documents\Visual Studio 2015\Projects\Jitify>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\nvcc.exe" -gencode=arch=compute_35,code=\"sm_35,compute_35\" --use-local-env -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\x86_amd64" -x cu -rdc=true -I"C:\Program Files (x86)\dlfcn-win32\include" -I"C:\Program Files (x86)\Visual Leak Detector\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include"  -G   --keep-dir x64\Debug -maxrregcount=0  --machine 64 --compile -cudart static  -g   -DWIN32 -DWIN64 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Fdx64\Debug\vc140.pdb /FS /Zi /RTC1 /MDd " -o x64\Debug\kernel.cu.obj "C:\Users\Robadob\Documents\Visual Studio 2015\Projects\Jitify\kernel.cu"
1>c:\program files\nvidia gpu computing toolkit\cuda\v10.1\include\vector_types.h(423): warning : calling a __host__ function("std::_Iterator_base12::_Iterator_base12") from a __host__ __device__ function("std::_Iterator_base12::_Iterator_base12 [subobject]") is not allowed
1>c:\program files\nvidia gpu computing toolkit\cuda\v10.1\include\vector_types.h(423): warning : calling a __host__ function("std::_Iterator_base12::~_Iterator_base12") from a __host__ __device__ function("std::_Iterator_base12::~_Iterator_base12 [subobject]") is not allowed
1>  kernel.cu
1>c:\users\robadob\documents\visual studio 2015\projects\jitify\jitify/jitify.hpp(2259): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data
1>c:\users\robadob\documents\visual studio 2015\projects\jitify\jitify/jitify.hpp(2087): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data
1>c:\users\robadob\documents\visual studio 2015\projects\jitify\jitify/jitify.hpp(2793): warning C4267: 'argument': conversion from 'size_t' to 'unsigned int', possible loss of data
1>c:\users\robadob\documents\visual studio 2015\projects\jitify\jitify/jitify.hpp(3564): warning C4267: 'argument': conversion from 'size_t' to 'unsigned int', possible loss of data
1>c:\users\robadob\documents\visual studio 2015\projects\jitify\jitify/jitify.hpp(217): warning C4800: 'unsigned __int64': forcing value to bool 'true' or 'false' (performance warning)
1>  c:\users\robadob\documents\visual studio 2015\projects\jitify\jitify/jitify.hpp(217): note: while compiling class template member function 'bool jitify::ObjectCache<unsigned __int64,jitify::ProgramConfig>::contains(const unsigned __int64 &) const'
1>  c:\users\robadob\documents\visual studio 2015\projects\jitify\jitify/jitify.hpp(2894): note: see reference to function template instantiation 'bool jitify::ObjectCache<unsigned __int64,jitify::ProgramConfig>::contains(const unsigned __int64 &) const' being compiled
1>  c:\users\robadob\documents\visual studio 2015\projects\jitify\jitify/jitify.hpp(2357): note: see reference to class template instantiation 'jitify::ObjectCache<unsigned __int64,jitify::ProgramConfig>' being compiled

from jitify.

mondus avatar mondus commented on August 29, 2024

@benbarsdell Sorry for the delay. Other academic duties have moved me away form coding this week. I am sure your fix is fine but I will test this early next week and confirm.

from jitify.

benbarsdell avatar benbarsdell commented on August 29, 2024

No worries, thanks for the feedback so far. I pushed some more fixes for conversion warnings. I'll merge #45 now (I have a follow-up PR to submit) and we can address any other issues that come up next week.

from jitify.

mondus avatar mondus commented on August 29, 2024

This works for me. Cheers.

from jitify.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.