Comments (9)

pszi1ard commented on September 26, 2024

Besides an environment variable that tells the runtime which devices to (not) enumerate, how can I configure the driver so that it does not use one of the devices? As a real-world use case, I might want to use one GPU for display with Mesa and the other with ROCm.

from hcc.

pszi1ard commented on September 26, 2024

PS: Actually, I'm realizing that this issue is not hcc-specific. The same mechanism is relevant for all ROC runtimes. Should the issue be moved?

fxkamd commented on September 26, 2024

I think HCC or HIP has an environment variable for this: HIP_VISIBLE_DEVICES. It's been a while since I looked at that.

pszi1ard commented on September 26, 2024

Thanks for the info. Unfortunately, this does not seem to address the original issue (#197), where the binary I compile with the current release does not run on a machine with two different cards, no matter what I pass in HIP_VISIBLE_DEVICES. I would expect that, even if the compiler can't embed code for multiple target architectures, I should be able to pick the right device and still run the binary.

fxkamd commented on September 26, 2024

HIP_VISIBLE_DEVICES applies only to HIP applications. If your application doesn't use HIP, the environment variable doesn't have any effect.

AIUI #197 will be fixed in the next release. I'm afraid I don't know any other short-term fix.

scchan commented on September 26, 2024

@pszi1ard For #197, I believe you have an hcc based on clang 3.5, which doesn't support multiple ISAs in a single binary. Could you try installing a newer hcc compiler based on clang 4, which provides that support? See my comment in #197 on how to get a newer hcc installer package.

pszi1ard commented on September 26, 2024

@fxkamd Thanks for the clarification. I was trying to run the HIP samples (e.g. hipBusBandwidth). It's still not clear to me whether this should work if I e.g. compile with --amdgpu-target=AMD:AMDGPU:8:0:3 and try to run with env HIP_VISIBLE_DEVICES=0 (assuming that device 0 is a Fiji card)?

@scchan Thanks for the feedback. Indeed, as I noted on #197, I do have a clang 3.5-based hcc. I'll report back when I am able to test a newer compiler.

fxkamd commented on September 26, 2024

I'm not sure how HCC and HIP enumerate accelerators. SiuChi, do you know if the enumeration includes CPUs? If it does, 0 would be the CPU, 1 would be the first GPU, etc. If not, 0 would refer to the first GPU.

pszi1ard commented on September 26, 2024

I did check both devices 0 and 1 (and have now tried a few more values just in case), and none of them work, so the issue seems to be somewhere else:

$ /opt/rocm/hip/bin/hipcc -O3 -g  --amdgpu-target=AMD:AMDGPU:8:0:3 hipBusBandwidth.cpp ResultDatabase.cpp -o hipBusBandwidth

$ HIP_VISIBLE_DEVICES=1 gdb ./hipBusBandwidth
Reading symbols from ./hipBusBandwidth...done.
(gdb) r
Starting program: /tmp/samples/1_Utils/hipBusBandwidth/hipBusBandwidth 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff4875700 (LWP 5963)]
### HCC STATUS_CHECK Error: HSA_STATUS_ERROR_INCOMPATIBLE_ARGUMENTS (0x100d) at file:/home/scchan/code/github/hcc-roc-1.4.x/hcc/lib/hsa/mcwamp_hsa.cpp line:2511

Program received signal SIGABRT, Aborted.
0x00007ffff5f83c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56	../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt 
#0  0x00007ffff5f83c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007ffff5f87028 in __GI_abort () at abort.c:89
#2  0x00007ffff488ed31 in Kalmar::HSADevice::BuildOfflineFinalizedProgramImpl(void*, int) () from /opt/rocm/hcc/lib/libmcwamp_hsa.so
#3  0x00007ffff488b6b5 in Kalmar::HSADevice::BuildProgram(void*, void*, bool) () from /opt/rocm/hcc/lib/libmcwamp_hsa.so
#4  0x0000000000460b46 in Kalmar::KalmarBootstrap::KalmarBootstrap() ()
#5  0x00000000004608e7 in __hcc_shared_library_init ()
#6  0x000000000046c32d in __libc_csu_init ()
#7  0x00007ffff5f6eed5 in __libc_start_main (main=0x464f30 <main(int, char**)>, argc=1, argv=0x7fffffffda48, init=0x46c2e0 <__libc_csu_init>, 
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffda38) at libc-start.c:246
#8  0x0000000000460fa1 in _start ()

The behavior is identical for HIP_VISIBLE_DEVICES=0,1,2,3.
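
For anyone repeating this check, here is a minimal sketch of a loop that runs a command once per HIP_VISIBLE_DEVICES value and prints the exit status of each run. The helper name try_devices is hypothetical (it is not part of ROCm or HIP), and the sketch only assumes a POSIX shell with env available. It does not interpret the variable itself; it just sets it for each run.

```shell
#!/bin/sh
# try_devices: hypothetical helper, not part of ROCm or HIP.
# Usage: try_devices "<space-separated device indices>" <command> [args...]
try_devices() {
    devices=$1   # e.g. "0 1 2 3"
    shift        # the remaining arguments form the command to run
    for dev in $devices; do
        # env sets the variable for this single invocation only;
        # the command's own output is discarded so only the summary is shown.
        env HIP_VISIBLE_DEVICES="$dev" "$@" >/dev/null 2>&1
        status=$?
        echo "HIP_VISIBLE_DEVICES=$dev exit=$status"
    done
}

# Example invocation (assumes the binary built above is in the current directory):
# try_devices "0 1 2 3" ./hipBusBandwidth
```

A run that aborts as in the backtrace above would normally show up as exit status 134 in sh or bash (128 plus 6, the signal number of SIGABRT), so identical failures across all device values are easy to spot.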
