Comments (9)

xuhuisheng commented on June 30, 2024

So the error says that one of the rocm-libs components isn't built for gfx902.
The square sample runs properly, which I guess means you have only an APU and no discrete GPU.
The next step is to test the rocm-libs components one by one to find which one needs to be rebuilt for gfx902.
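
The one-by-one check described above could be scripted roughly as follows. This is only a sketch: the component names and the commands in `PROBES` are hypothetical placeholders (they run small Python one-liners so the script works anywhere). On a real system each entry would be replaced with a small test program that exercises exactly one ROCm library. The only thing taken from this thread is the error string `hipErrorNoBinaryForGpu`.

```python
import subprocess
import sys

ERROR_MARKER = "hipErrorNoBinaryForGpu"  # error string reported in this thread

# Hypothetical placeholders: replace each command with a real test binary
# that exercises exactly one ROCm library.
PROBES = {
    "component_a": [sys.executable, "-c", "print('ok')"],
    "component_b": [
        sys.executable,
        "-c",
        "import sys; sys.stderr.write('hipErrorNoBinaryForGpu'); sys.exit(1)",
    ],
}


def find_missing_gpu_binaries(probes):
    """Run each probe and return the names whose output contains the marker."""
    failing = []
    for name, cmd in probes.items():
        # A crashing probe (non-zero or negative return code) does not raise here.
        result = subprocess.run(cmd, capture_output=True, text=True)
        if ERROR_MARKER in result.stdout + result.stderr:
            failing.append(name)
    return failing


if __name__ == "__main__":
    for name in find_missing_gpu_binaries(PROBES):
        print(f"{name}: no code object for this GPU, probably needs a rebuild")
```

Matching on the error string rather than on the exit code keeps unrelated failures (a missing file, a Python exception) from being reported as a missing GPU binary.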

from rocm-build.

xuhuisheng commented on June 30, 2024

The ROCm team said they cannot support APUs for now.
Some people have reported issues when using an APU and a GPU at the same time.
You can give it a try.

delijati commented on June 30, 2024

I tried; I even built tensorflow-upstream, but I still get:

❯ ../../env/bin/python 02-Clustering.py
GENERATING EMBEDDING FOR: ATL_X
/home/foo/.cache/yay/hip-rocclr/src/HIP-rocm-4.3.1/rocclr/hip_code_object.cpp:486: "hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"
[1]    24231 abort (core dumped)  ../../env/bin/python 02-Clustering.py
../../env/bin/python 02-Clustering.py  3,06s user 4,08s system 141% cpu 5,056 total

xuhuisheng commented on June 30, 2024

We can set AMD_LOG_LEVEL=6 to print more logs.

delijati commented on June 30, 2024

$ AMD_LOG_LEVEL=6 ../../env/bin/python 02-Clustering.py
GENERATING EMBEDDING FOR: ATL_X
:3:rocdevice.cpp            :430 : 1885913346 us: Initializing HSA stack.
:3:comgrctx.cpp             :33  : 1885933593 us: Loading COMGR library.
:3:rocdevice.cpp            :196 : 1885936584 us: Numa selects cpu agent[0]=0x5568b74df830(fine=0x5568bb072be0,coarse=0x5568bad2bcf0, kern_arg=0x5568bb6f3f90) for gpu agent=0x7fa4db72ab34
:3:rocdevice.cpp            :1562: 1885937163 us: HMM support: 0, xnack: 0

:4:rocdevice.cpp            :1858: 1885937272 us: Allocate hsa host memory 0x7fa4e0002000, size 0x28
:4:rocdevice.cpp            :1858: 1885937696 us: Allocate hsa host memory 0x7fa460600000, size 0x101000
:4:rocdevice.cpp            :1858: 1885937997 us: Allocate hsa host memory 0x7fa460400000, size 0x101000
:4:runtime.cpp              :82  : 1885938102 us: init
:1:hip_code_object.cpp      :456 : 1885938529 us: hipErrorNoBinaryForGpu: Unable to find code object for all current devices!
:1:hip_code_object.cpp      :458 : 1885938540 us:   Devices:
:1:hip_code_object.cpp      :460 : 1885938542 us:     amdgcn-amd-amdhsa--gfx902:xnack- - [Not Found]
:1:hip_code_object.cpp      :465 : 1885938543 us:   Bundled Code Objects:
:1:hip_code_object.cpp      :482 : 1885938544 us:     host-x86_64-unknown-linux - [Unsupported]
:1:hip_code_object.cpp      :479 : 1885938546 us:     hipv4-amdgcn-amd-amdhsa--gfx1030 - [code object v4 is amdgcn-amd-amdhsa--gfx1030]
:1:hip_code_object.cpp      :479 : 1885938547 us:     hipv4-amdgcn-amd-amdhsa--gfx803 - [code object v4 is amdgcn-amd-amdhsa--gfx803]
:1:hip_code_object.cpp      :479 : 1885938549 us:     hipv4-amdgcn-amd-amdhsa--gfx900:xnack- - [code object v4 is amdgcn-amd-amdhsa--gfx900:xnack-]
:1:hip_code_object.cpp      :479 : 1885938550 us:     hipv4-amdgcn-amd-amdhsa--gfx906:xnack- - [code object v4 is amdgcn-amd-amdhsa--gfx906:xnack-]
:1:hip_code_object.cpp      :479 : 1885938552 us:     hipv4-amdgcn-amd-amdhsa--gfx908:xnack- - [code object v4 is amdgcn-amd-amdhsa--gfx908:xnack-]
:1:hip_code_object.cpp      :479 : 1885938553 us:     hipv4-amdgcn-amd-amdhsa--gfx90a:xnack+ - [code object v4 is amdgcn-amd-amdhsa--gfx90a:xnack+]
:1:hip_code_object.cpp      :479 : 1885938555 us:     hipv4-amdgcn-amd-amdhsa--gfx90a:xnack- - [code object v4 is amdgcn-amd-amdhsa--gfx90a:xnack-]
/home/foo/.cache/yay/hip-rocclr/src/HIP-rocm-4.3.1/rocclr/hip_code_object.cpp:486: "hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"
[1]    17615 abort (core dumped)  AMD_LOG_LEVEL=6 ../../env/bin/python 02-Clustering.py
AMD_LOG_LEVEL=6 ../../env/bin/python 02-Clustering.py  2,52s user 3,90s system 141% cpu 4,544 total
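
To make the mismatch in this log easier to see, here is a minimal Python sketch that extracts the device target reported as `[Not Found]` and the targets of the bundled code objects. The `LOG` text is copied from the output above with the `:1:hip_code_object.cpp ... us:` prefixes stripped for brevity. The regular expressions are an assumption based only on the line layout seen in this one log and may not match other ROCm versions.

```python
import re

# Lines copied from the AMD_LOG_LEVEL=6 output above, prefixes stripped.
LOG = """\
Devices:
  amdgcn-amd-amdhsa--gfx902:xnack- - [Not Found]
Bundled Code Objects:
  host-x86_64-unknown-linux - [Unsupported]
  hipv4-amdgcn-amd-amdhsa--gfx1030 - [code object v4 is amdgcn-amd-amdhsa--gfx1030]
  hipv4-amdgcn-amd-amdhsa--gfx803 - [code object v4 is amdgcn-amd-amdhsa--gfx803]
  hipv4-amdgcn-amd-amdhsa--gfx900:xnack- - [code object v4 is amdgcn-amd-amdhsa--gfx900:xnack-]
  hipv4-amdgcn-amd-amdhsa--gfx906:xnack- - [code object v4 is amdgcn-amd-amdhsa--gfx906:xnack-]
  hipv4-amdgcn-amd-amdhsa--gfx908:xnack- - [code object v4 is amdgcn-amd-amdhsa--gfx908:xnack-]
  hipv4-amdgcn-amd-amdhsa--gfx90a:xnack+ - [code object v4 is amdgcn-amd-amdhsa--gfx90a:xnack+]
  hipv4-amdgcn-amd-amdhsa--gfx90a:xnack- - [code object v4 is amdgcn-amd-amdhsa--gfx90a:xnack-]
"""


def summarize(log):
    """Return (devices without a code object, sorted bundled gfx targets)."""
    missing = re.findall(
        r"amdgcn-amd-amdhsa--(gfx[0-9a-f]+)\S* - \[Not Found\]", log
    )
    bundled = sorted(
        set(re.findall(r"hipv4-amdgcn-amd-amdhsa--(gfx[0-9a-f]+)", log))
    )
    return missing, bundled


missing, bundled = summarize(LOG)
print("device without code object:", missing)
print("targets bundled in the binary:", bundled)
```

For this log the binary bundles code objects for gfx1030, gfx803, gfx900, gfx906, gfx908 and gfx90a, but none for gfx902, which is the target the runtime reports for the device.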

delijati commented on June 30, 2024

At least I got HIP running:

❯ AMD_LOG_LEVEL=6 ./square.out
:3:rocdevice.cpp            :430 : 10625438911 us: Initializing HSA stack.
:3:comgrctx.cpp             :33  : 10625460831 us: Loading COMGR library.
:3:rocdevice.cpp            :196 : 10625465529 us: Numa selects cpu agent[0]=0x205e2a0(fine=0x20f1f80,coarse=0x20f7560, kern_arg=0x210f8d0) for gpu agent=0x7fcbce7c3b34
:3:rocdevice.cpp            :1562: 10625466470 us: HMM support: 0, xnack: 0

:4:rocdevice.cpp            :1858: 10625466635 us: Allocate hsa host memory 0x7fcbcea34000, size 0x28
:4:rocdevice.cpp            :1858: 10625467138 us: Allocate hsa host memory 0x7fcbcd400000, size 0x101000
:4:rocdevice.cpp            :1858: 10625467587 us: Allocate hsa host memory 0x7fcbcd200000, size 0x101000
:4:runtime.cpp              :82  : 10625467659 us: init
:3:hip_device.cpp           :239 : 10625467704 us: 30526: [7fcbcddfb540] hipGetDeviceProperties: Returned hipSuccess : 
info: running on device Cezanne
info: allocate host mem (  7.63 MB)
info: allocate device mem (  7.63 MB)
:3:hip_memory.cpp           :384 : 10625470790 us: 30526: [7fcbcddfb540] hipMalloc ( 0x7ffca2147320, 4000000 )
:4:rocdevice.cpp            :1993: 10625470946 us: Allocate hsa device memory 0x7fcbcc400000, size 0x3d0900
:3:rocdevice.cpp            :2032: 10625470952 us: device=0x211d4b0, freeMem_ = 0xffc2f700
:3:hip_memory.cpp           :386 : 10625470960 us: 30526: [7fcbcddfb540] hipMalloc: Returned hipSuccess : 0x7fcbcc400000: duration: 170 us
:3:hip_memory.cpp           :384 : 10625470964 us: 30526: [7fcbcddfb540] hipMalloc ( 0x7ffca2147318, 4000000 )
:4:rocdevice.cpp            :1993: 10625471018 us: Allocate hsa device memory 0x7fcbc0800000, size 0x3d0900
:3:rocdevice.cpp            :2032: 10625471026 us: device=0x211d4b0, freeMem_ = 0xff85ee00
:3:hip_memory.cpp           :386 : 10625471032 us: 30526: [7fcbcddfb540] hipMalloc: Returned hipSuccess : 0x7fcbc0800000: duration: 68 us
info: copy Host2Device
:3:hip_memory.cpp           :429 : 10625471065 us: 30526: [7fcbcddfb540] hipMemcpy ( 0x7fcbcc400000, 0x7fcbcce2f010, 4000000, hipMemcpyHostToDevice )
:3:rocdevice.cpp            :2543: 10625471616 us: number of allocated hardware queues with low priority: 0, with normal priority: 0, with high priority: 0, maximum per priority is: 4
:3:rocdevice.cpp            :2618: 10625478601 us: created hardware queue 0x7fcbcd5f5000 with size 1024 with priority 1, cooperative: 0
:4:rocdevice.cpp            :1858: 10625478822 us: Allocate hsa host memory 0x7fcbcc980000, size 0x80000
:3:devprogram.cpp           :2466: 10625710710 us: Using Code Object V4.
:4:command.cpp              :303 : 10625712610 us: command is enqueued: 0x214adc0
:4:command.cpp              :262 : 10625712653 us: queue marker to command queue: 0x20f1b20
:4:command.cpp              :303 : 10625712656 us: command is enqueued: 0x205e500
:4:command.cpp              :222 : 10625712657 us: waiting for event 0x214adc0 to complete, current status 3
:4:commandqueue.cpp         :176 : 10625713048 us: command (CopyHostToDevice) is submitted: 0x214adc0
:4:rocvirtual.hpp           :200 : 10625713254 us: [7fcbcd562640]!	WaitCurret completion_signal=0x7fcbcea46b00
:4:rocvirtual.hpp           :228 : 10625713263 us: [7fcbcd562640]!	WaitNext completion_signal=0x7fcbcea46a80
:4:rocblit.cpp              :670 : 10625713266 us: [7fcbcd562640]!	HSA Asycn Copy wait_event=0x0, completion_signal=0x7fcbcea46b00
:4:rocvirtual.hpp           :200 : 10625713701 us: [7fcbcd562640]!	WaitCurret completion_signal=0x7fcbcea46b00
:4:rocvirtual.cpp           :449 : 10625713708 us: [7fcbcd562640]!	Host wait on completion_signal=0x7fcbcea46b00
:4:commandqueue.cpp         :176 : 10625714354 us: command (InternalMarker) is submitted: 0x205e500
:4:rocvirtual.hpp           :200 : 10625714368 us: [7fcbcd562640]!	WaitCurret completion_signal=0x7fcbcea46b00
:4:command.cpp              :236 : 10625714371 us: event 0x214adc0 wait completed
:4:command.cpp              :152 : 10625714372 us: Command 0x214adc0 complete
:4:command.cpp              :152 : 10625714374 us: Command 0x205e500 complete
:3:hip_memory.cpp           :432 : 10625714379 us: 30526: [7fcbcddfb540] hipMemcpy: Returned hipSuccess : : duration: 243314 us
info: launch 'vector_square' kernel
:3:hip_platform.cpp         :202 : 10625714411 us: 30526: [7fcbcddfb540] __hipPushCallConfiguration ( {512,1,1}, {256,1,1}, 0, stream:<null> )
:3:hip_platform.cpp         :206 : 10625714419 us: 30526: [7fcbcddfb540] __hipPushCallConfiguration: Returned hipSuccess : 
:3:hip_platform.cpp         :213 : 10625714430 us: 30526: [7fcbcddfb540] __hipPopCallConfiguration ( {34542240,0,34538320}, {3458397079,32715,18}, 0x7ffca2147330, 0x7ffca2147328 )
:3:hip_platform.cpp         :222 : 10625714433 us: 30526: [7fcbcddfb540] __hipPopCallConfiguration: Returned hipSuccess : 
:3:hip_module.cpp           :489 : 10625714444 us: 30526: [7fcbcddfb540] hipLaunchKernel ( 0x401c10, {512,1,1}, {256,1,1}, 0x7ffca2147370, 0, stream:<null> )
:3:devprogram.cpp           :2466: 10625714623 us: Using Code Object V4.
:3:hip_module.cpp           :358 : 10625715521 us: 30526: [7fcbcddfb540] ihipModuleLaunchKernel ( 0x0x21577a0, 131072, 1, 1, 256, 1, 1, 0, stream:<null>, 0x7ffca2147370, char array:<null>, event:0, event:0, 0, 0 )
:4:command.cpp              :303 : 10625715595 us: command is enqueued: 0x215f780
:3:hip_platform.cpp         :638 : 10625715619 us: 30526: [7fcbcddfb540] ihipLaunchKernel: Returned hipSuccess : 
:3:hip_module.cpp           :491 : 10625715635 us: 30526: [7fcbcddfb540] hipLaunchKernel: Returned hipSuccess : 
info: copy Device2Host
:3:hip_memory.cpp           :429 : 10625715651 us: 30526: [7fcbcddfb540] hipMemcpy ( 0x7fcbcca5e010, 0x7fcbc0800000, 4000000, hipMemcpyDeviceToHost )
:4:command.cpp              :303 : 10625715657 us: command is enqueued: 0x214adc0
:4:command.cpp              :262 : 10625715660 us: queue marker to command queue: 0x20f1b20
:4:command.cpp              :303 : 10625715661 us: command is enqueued: 0x2157db0
:4:command.cpp              :222 : 10625715662 us: waiting for event 0x214adc0 to complete, current status 3
:4:commandqueue.cpp         :176 : 10625715663 us: command (KernelExecution) is submitted: 0x215f780
:3:rocvirtual.cpp           :603 : 10625715679 us: !	arg0:   = ptr:0x7fcbc0800000 obj:[0x7fcbc0800000-0x7fcbc0bd0900] threadId : 7fcbcd562640
:3:rocvirtual.cpp           :603 : 10625715685 us: !	arg1:   = ptr:0x7fcbcc400000 obj:[0x7fcbcc400000-0x7fcbcc7d0900] threadId : 7fcbcd562640
:3:rocvirtual.cpp           :2560: 10625715689 us: [7fcbcd562640]!	ShaderName : _Z13vector_squareIfEvPT_S1_m
:4:rocvirtual.cpp           :753 : 10625715723 us: [7fcbcd562640] HWq=0x7fcbcd5f5000, Dispatch Header = 0x502 (type=2, barrier=1, acquire=2, release=0), setup=3, grid=[131072, 1, 1], workgroup=[256, 1, 1], private_seg_size=0, group_seg_size=0, kernel_obj=0x7fcbc0408840, kernarg_address=0x7fcbcc980000, completion_signal=0x0
:4:commandqueue.cpp         :176 : 10625715741 us: command (CopyDeviceToHost) is submitted: 0x214adc0
:4:rocvirtual.hpp           :200 : 10625717359 us: [7fcbcd562640]!	WaitCurret completion_signal=0x7fcbcea46a80
:4:rocvirtual.hpp           :228 : 10625717373 us: [7fcbcd562640]!	WaitNext completion_signal=0x7fcbcea46a00
:4:rocvirtual.cpp           :871 : 10625717384 us: [7fcbcd562640] HWq=0x7fcbcd5f5000, BarrierAND Header = 0x1503 (type=3, barrier=1, acquire=2, release=2), dep_signal=[0x0, 0x0, 0x0, 0x0, 0x0], completion_signal=0x7fcbcea46a80
:4:rocvirtual.hpp           :200 : 10625717406 us: [7fcbcd562640]!	WaitCurret completion_signal=0x7fcbcea46a00
:4:rocvirtual.hpp           :228 : 10625717414 us: [7fcbcd562640]!	WaitNext completion_signal=0x7fcbcea46980
:4:rocblit.cpp              :670 : 10625717420 us: [7fcbcd562640]!	HSA Asycn Copy wait_event=0x0, completion_signal=0x7fcbcea46a00
:4:rocvirtual.hpp           :200 : 10625718851 us: [7fcbcd562640]!	WaitCurret completion_signal=0x7fcbcea46a00
:4:rocvirtual.cpp           :449 : 10625718897 us: [7fcbcd562640]!	Host wait on completion_signal=0x7fcbcea46a00
:4:commandqueue.cpp         :176 : 10625719926 us: command (InternalMarker) is submitted: 0x2157db0
:4:rocvirtual.hpp           :200 : 10625719960 us: [7fcbcd562640]!	WaitCurret completion_signal=0x7fcbcea46a00
:4:command.cpp              :152 : 10625719969 us: Command 0x215f780 complete
:4:command.cpp              :152 : 10625719978 us: Command 0x214adc0 complete
:4:command.cpp              :152 : 10625719981 us: Command 0x2157db0 complete
:4:command.cpp              :236 : 10625719984 us: event 0x214adc0 wait completed
:3:hip_memory.cpp           :432 : 10625720011 us: 30526: [7fcbcddfb540] hipMemcpy: Returned hipSuccess : : duration: 4360 us
info: check result
PASSED!
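
The sizes in this log can be cross-checked with a little arithmetic. The sketch below uses only numbers that appear in the output above. The 4-byte element size is an assumption, based on the kernel name `_Z13vector_squareIfEvPT_S1_m`, which demangles to a `float` instantiation of `vector_square`. The helper `launch_summary` is a hypothetical name introduced here for illustration.

```python
def launch_summary(buffer_bytes, blocks, threads_per_block, elem_size=4):
    """Derive element count, total work-items and combined buffer size in MiB."""
    return {
        "elements": buffer_bytes // elem_size,  # assumes 4-byte float elements
        "work_items": blocks * threads_per_block,
        "two_buffers_mib": round(2 * buffer_bytes / 2**20, 2),
    }


# 4000000 bytes per hipMalloc / hipMemcpy, launch config {512,1,1} x {256,1,1}.
summary = launch_summary(4000000, 512, 256)

# "Allocate hsa device memory ..., size 0x3d0900" is the same 4000000 bytes.
assert 0x3D0900 == 4000000

print(summary)
```

The 131072 work-items match the grid size shown in the `ihipModuleLaunchKernel` line, and two 4000000-byte buffers come to about 7.63 MiB, which agrees with the "7.63 MB" the sample prints if that figure counts both buffers. Since there are fewer work-items than elements, each work-item presumably processes several elements.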

xuhuisheng commented on June 30, 2024

It has been 2 months since the last post, so I will close this issue. Please reopen it if there are any updates.

jf-horton commented on June 30, 2024

@delijati I've been trying to get HIP running on a gfx902, but have had no luck. What version of ROCm did you use? Did you have to build from source?

mzimmerm commented on June 30, 2024

@delijati Over the last 2 weeks I have spent significant time trying to get PyTorch on ROCm running on gfx902. I experimented with different Linux versions (mostly Ubuntu) and different ROCm / AMDGPU versions, with no luck. The few combinations I got working generally end with this error:
"HIP error: shared object initialization failed"
I am now declaring it a failure and an impossibility. For ML/DL testing I am getting a video card that does not use ROCm, as official ROCm support covers only about 6 cards today and no APUs.
