Comments (9)
So the error says that one of the rocm-libs isn't built for gfx902.
square runs properly, which means you have only an APU and no discrete GPU, I guess.
The next step is to test the rocm-libs one by one to find which component needs to be rebuilt for gfx902.
from rocm-build.
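One rough way to do the "one by one" check is to scan each shared library for embedded target names. This is only a heuristic sketch: it assumes the bundled code objects carry their target IDs as plain text (strings like `amdgcn-amd-amdhsa--gfx900`, as they appear in the error logs in this thread). The function names and the library directory are hypothetical.

```python
import re
from pathlib import Path
from typing import List, Set

# Assumption: target IDs are stored as plain text inside the library,
# e.g. "hipv4-amdgcn-amd-amdhsa--gfx900:xnack-".
TARGET_RE = re.compile(rb"amdgcn-amd-amdhsa--(gfx[0-9a-f]+)")


def gfx_targets_in_bytes(data: bytes) -> Set[str]:
    """Return the gfx target names found in a blob of bytes."""
    return {m.group(1).decode("ascii") for m in TARGET_RE.finditer(data)}


def libraries_missing_target(lib_dir: str, target: str) -> List[str]:
    """List libraries that contain device code but not for `target`."""
    missing = []
    for path in sorted(Path(lib_dir).glob("*.so*")):
        if not path.is_file():
            continue
        found = gfx_targets_in_bytes(path.read_bytes())
        # Libraries with no target strings at all are host-only; skip them.
        if found and target not in found:
            missing.append(path.name)
    return missing
```

Calling `libraries_missing_target("/opt/rocm/lib", "gfx902")` would list rebuild candidates, assuming the libraries live under `/opt/rocm/lib` on your system. A library that shows up here is only a suspect, not a confirmed cause.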
ROCm has said they cannot support APUs for now.
Somebody said there are issues when using an APU and a GPU at the same time.
You can give it a try.
from rocm-build.
I tried; I even built tensorflow-upstream,
but I still get:
❯ ../../env/bin/python 02-Clustering.py
GENERATING EMBEDDING FOR: ATL_X
/home/foo/.cache/yay/hip-rocclr/src/HIP-rocm-4.3.1/rocclr/hip_code_object.cpp:486: "hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"
[1] 24231 abort (core dumped) ../../env/bin/python 02-Clustering.py
../../env/bin/python 02-Clustering.py 3,06s user 4,08s system 141% cpu 5,056 total
from rocm-build.
We can set AMD_LOG_LEVEL=6 to print more logs.
from rocm-build.
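If you want to capture those logs from a script instead of the shell, a minimal sketch could look like this. The helper name `run_with_amd_log_level` is made up for illustration; it only sets the `AMD_LOG_LEVEL` environment variable mentioned above and collects whatever the process prints.

```python
import os
import subprocess


def run_with_amd_log_level(cmd, level=6):
    """Run `cmd` (a list of arguments) with AMD_LOG_LEVEL set.

    Returns the CompletedProcess; its stdout and stderr hold the output.
    """
    env = dict(os.environ, AMD_LOG_LEVEL=str(level))
    return subprocess.run(cmd, env=env, capture_output=True, text=True)
```

For example, `run_with_amd_log_level(["../../env/bin/python", "02-Clustering.py"])` would rerun the failing script from this thread and keep the output for later inspection.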
$ AMD_LOG_LEVEL=6 ../../env/bin/python 02-Clustering.py
GENERATING EMBEDDING FOR: ATL_X
:3:rocdevice.cpp :430 : 1885913346 us: Initializing HSA stack.
:3:comgrctx.cpp :33 : 1885933593 us: Loading COMGR library.
:3:rocdevice.cpp :196 : 1885936584 us: Numa selects cpu agent[0]=0x5568b74df830(fine=0x5568bb072be0,coarse=0x5568bad2bcf0, kern_arg=0x5568bb6f3f90) for gpu agent=0x7fa4db72ab34
:3:rocdevice.cpp :1562: 1885937163 us: HMM support: 0, xnack: 0
:4:rocdevice.cpp :1858: 1885937272 us: Allocate hsa host memory 0x7fa4e0002000, size 0x28
:4:rocdevice.cpp :1858: 1885937696 us: Allocate hsa host memory 0x7fa460600000, size 0x101000
:4:rocdevice.cpp :1858: 1885937997 us: Allocate hsa host memory 0x7fa460400000, size 0x101000
:4:runtime.cpp :82 : 1885938102 us: init
:1:hip_code_object.cpp :456 : 1885938529 us: hipErrorNoBinaryForGpu: Unable to find code object for all current devices!
:1:hip_code_object.cpp :458 : 1885938540 us: Devices:
:1:hip_code_object.cpp :460 : 1885938542 us: amdgcn-amd-amdhsa--gfx902:xnack- - [Not Found]
:1:hip_code_object.cpp :465 : 1885938543 us: Bundled Code Objects:
:1:hip_code_object.cpp :482 : 1885938544 us: host-x86_64-unknown-linux - [Unsupported]
:1:hip_code_object.cpp :479 : 1885938546 us: hipv4-amdgcn-amd-amdhsa--gfx1030 - [code object v4 is amdgcn-amd-amdhsa--gfx1030]
:1:hip_code_object.cpp :479 : 1885938547 us: hipv4-amdgcn-amd-amdhsa--gfx803 - [code object v4 is amdgcn-amd-amdhsa--gfx803]
:1:hip_code_object.cpp :479 : 1885938549 us: hipv4-amdgcn-amd-amdhsa--gfx900:xnack- - [code object v4 is amdgcn-amd-amdhsa--gfx900:xnack-]
:1:hip_code_object.cpp :479 : 1885938550 us: hipv4-amdgcn-amd-amdhsa--gfx906:xnack- - [code object v4 is amdgcn-amd-amdhsa--gfx906:xnack-]
:1:hip_code_object.cpp :479 : 1885938552 us: hipv4-amdgcn-amd-amdhsa--gfx908:xnack- - [code object v4 is amdgcn-amd-amdhsa--gfx908:xnack-]
:1:hip_code_object.cpp :479 : 1885938553 us: hipv4-amdgcn-amd-amdhsa--gfx90a:xnack+ - [code object v4 is amdgcn-amd-amdhsa--gfx90a:xnack+]
:1:hip_code_object.cpp :479 : 1885938555 us: hipv4-amdgcn-amd-amdhsa--gfx90a:xnack- - [code object v4 is amdgcn-amd-amdhsa--gfx90a:xnack-]
/home/foo/.cache/yay/hip-rocclr/src/HIP-rocm-4.3.1/rocclr/hip_code_object.cpp:486: "hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"
[1] 17615 abort (core dumped) AMD_LOG_LEVEL=6 ../../env/bin/python 02-Clustering.py
AMD_LOG_LEVEL=6 ../../env/bin/python 02-Clustering.py 2,52s user 3,90s system 141% cpu 4,544 total
from rocm-build.
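The log above already tells the story: the device is gfx902, and the bundle has no code object for it. A small sketch for pulling that out of a saved log is below. The regular expressions are written against the exact line shapes shown in this thread and may not match other ROCm versions; the function name is hypothetical.

```python
import re

# Matches device lines such as:
#   amdgcn-amd-amdhsa--gfx902:xnack- - [Not Found]
DEVICE_RE = re.compile(
    r"amdgcn-amd-amdhsa--(gfx[0-9a-f]+)\S*\s+-\s+\[Not Found\]"
)
# Matches bundle entries such as:
#   hipv4-amdgcn-amd-amdhsa--gfx900:xnack-
BUNDLE_RE = re.compile(r"hipv4-amdgcn-amd-amdhsa--(gfx[0-9a-f]+)")


def summarize_code_object_error(log_text):
    """Return (devices without a binary, targets present in the bundle)."""
    missing = sorted(set(DEVICE_RE.findall(log_text)))
    bundled = sorted(set(BUNDLE_RE.findall(log_text)))
    return missing, bundled
```

Run on the log above, this should report gfx902 as the device without a binary, and gfx1030, gfx803, gfx900, gfx906, gfx908 and gfx90a as the targets that were actually built.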
At least I got HIP running:
❯ AMD_LOG_LEVEL=6 ./square.out
:3:rocdevice.cpp :430 : 10625438911 us: Initializing HSA stack.
:3:comgrctx.cpp :33 : 10625460831 us: Loading COMGR library.
:3:rocdevice.cpp :196 : 10625465529 us: Numa selects cpu agent[0]=0x205e2a0(fine=0x20f1f80,coarse=0x20f7560, kern_arg=0x210f8d0) for gpu agent=0x7fcbce7c3b34
:3:rocdevice.cpp :1562: 10625466470 us: HMM support: 0, xnack: 0
:4:rocdevice.cpp :1858: 10625466635 us: Allocate hsa host memory 0x7fcbcea34000, size 0x28
:4:rocdevice.cpp :1858: 10625467138 us: Allocate hsa host memory 0x7fcbcd400000, size 0x101000
:4:rocdevice.cpp :1858: 10625467587 us: Allocate hsa host memory 0x7fcbcd200000, size 0x101000
:4:runtime.cpp :82 : 10625467659 us: init
:3:hip_device.cpp :239 : 10625467704 us: 30526: [7fcbcddfb540] hipGetDeviceProperties: Returned hipSuccess :
info: running on device Cezanne
info: allocate host mem ( 7.63 MB)
info: allocate device mem ( 7.63 MB)
:3:hip_memory.cpp :384 : 10625470790 us: 30526: [7fcbcddfb540] hipMalloc ( 0x7ffca2147320, 4000000 )
:4:rocdevice.cpp :1993: 10625470946 us: Allocate hsa device memory 0x7fcbcc400000, size 0x3d0900
:3:rocdevice.cpp :2032: 10625470952 us: device=0x211d4b0, freeMem_ = 0xffc2f700
:3:hip_memory.cpp :386 : 10625470960 us: 30526: [7fcbcddfb540] hipMalloc: Returned hipSuccess : 0x7fcbcc400000: duration: 170 us
:3:hip_memory.cpp :384 : 10625470964 us: 30526: [7fcbcddfb540] hipMalloc ( 0x7ffca2147318, 4000000 )
:4:rocdevice.cpp :1993: 10625471018 us: Allocate hsa device memory 0x7fcbc0800000, size 0x3d0900
:3:rocdevice.cpp :2032: 10625471026 us: device=0x211d4b0, freeMem_ = 0xff85ee00
:3:hip_memory.cpp :386 : 10625471032 us: 30526: [7fcbcddfb540] hipMalloc: Returned hipSuccess : 0x7fcbc0800000: duration: 68 us
info: copy Host2Device
:3:hip_memory.cpp :429 : 10625471065 us: 30526: [7fcbcddfb540] hipMemcpy ( 0x7fcbcc400000, 0x7fcbcce2f010, 4000000, hipMemcpyHostToDevice )
:3:rocdevice.cpp :2543: 10625471616 us: number of allocated hardware queues with low priority: 0, with normal priority: 0, with high priority: 0, maximum per priority is: 4
:3:rocdevice.cpp :2618: 10625478601 us: created hardware queue 0x7fcbcd5f5000 with size 1024 with priority 1, cooperative: 0
:4:rocdevice.cpp :1858: 10625478822 us: Allocate hsa host memory 0x7fcbcc980000, size 0x80000
:3:devprogram.cpp :2466: 10625710710 us: Using Code Object V4.
:4:command.cpp :303 : 10625712610 us: command is enqueued: 0x214adc0
:4:command.cpp :262 : 10625712653 us: queue marker to command queue: 0x20f1b20
:4:command.cpp :303 : 10625712656 us: command is enqueued: 0x205e500
:4:command.cpp :222 : 10625712657 us: waiting for event 0x214adc0 to complete, current status 3
:4:commandqueue.cpp :176 : 10625713048 us: command (CopyHostToDevice) is submitted: 0x214adc0
:4:rocvirtual.hpp :200 : 10625713254 us: [7fcbcd562640]! WaitCurret completion_signal=0x7fcbcea46b00
:4:rocvirtual.hpp :228 : 10625713263 us: [7fcbcd562640]! WaitNext completion_signal=0x7fcbcea46a80
:4:rocblit.cpp :670 : 10625713266 us: [7fcbcd562640]! HSA Asycn Copy wait_event=0x0, completion_signal=0x7fcbcea46b00
:4:rocvirtual.hpp :200 : 10625713701 us: [7fcbcd562640]! WaitCurret completion_signal=0x7fcbcea46b00
:4:rocvirtual.cpp :449 : 10625713708 us: [7fcbcd562640]! Host wait on completion_signal=0x7fcbcea46b00
:4:commandqueue.cpp :176 : 10625714354 us: command (InternalMarker) is submitted: 0x205e500
:4:rocvirtual.hpp :200 : 10625714368 us: [7fcbcd562640]! WaitCurret completion_signal=0x7fcbcea46b00
:4:command.cpp :236 : 10625714371 us: event 0x214adc0 wait completed
:4:command.cpp :152 : 10625714372 us: Command 0x214adc0 complete
:4:command.cpp :152 : 10625714374 us: Command 0x205e500 complete
:3:hip_memory.cpp :432 : 10625714379 us: 30526: [7fcbcddfb540] hipMemcpy: Returned hipSuccess : : duration: 243314 us
info: launch 'vector_square' kernel
:3:hip_platform.cpp :202 : 10625714411 us: 30526: [7fcbcddfb540] __hipPushCallConfiguration ( {512,1,1}, {256,1,1}, 0, stream:<null> )
:3:hip_platform.cpp :206 : 10625714419 us: 30526: [7fcbcddfb540] __hipPushCallConfiguration: Returned hipSuccess :
:3:hip_platform.cpp :213 : 10625714430 us: 30526: [7fcbcddfb540] __hipPopCallConfiguration ( {34542240,0,34538320}, {3458397079,32715,18}, 0x7ffca2147330, 0x7ffca2147328 )
:3:hip_platform.cpp :222 : 10625714433 us: 30526: [7fcbcddfb540] __hipPopCallConfiguration: Returned hipSuccess :
:3:hip_module.cpp :489 : 10625714444 us: 30526: [7fcbcddfb540] hipLaunchKernel ( 0x401c10, {512,1,1}, {256,1,1}, 0x7ffca2147370, 0, stream:<null> )
:3:devprogram.cpp :2466: 10625714623 us: Using Code Object V4.
:3:hip_module.cpp :358 : 10625715521 us: 30526: [7fcbcddfb540] ihipModuleLaunchKernel ( 0x0x21577a0, 131072, 1, 1, 256, 1, 1, 0, stream:<null>, 0x7ffca2147370, char array:<null>, event:0, event:0, 0, 0 )
:4:command.cpp :303 : 10625715595 us: command is enqueued: 0x215f780
:3:hip_platform.cpp :638 : 10625715619 us: 30526: [7fcbcddfb540] ihipLaunchKernel: Returned hipSuccess :
:3:hip_module.cpp :491 : 10625715635 us: 30526: [7fcbcddfb540] hipLaunchKernel: Returned hipSuccess :
info: copy Device2Host
:3:hip_memory.cpp :429 : 10625715651 us: 30526: [7fcbcddfb540] hipMemcpy ( 0x7fcbcca5e010, 0x7fcbc0800000, 4000000, hipMemcpyDeviceToHost )
:4:command.cpp :303 : 10625715657 us: command is enqueued: 0x214adc0
:4:command.cpp :262 : 10625715660 us: queue marker to command queue: 0x20f1b20
:4:command.cpp :303 : 10625715661 us: command is enqueued: 0x2157db0
:4:command.cpp :222 : 10625715662 us: waiting for event 0x214adc0 to complete, current status 3
:4:commandqueue.cpp :176 : 10625715663 us: command (KernelExecution) is submitted: 0x215f780
:3:rocvirtual.cpp :603 : 10625715679 us: ! arg0: = ptr:0x7fcbc0800000 obj:[0x7fcbc0800000-0x7fcbc0bd0900] threadId : 7fcbcd562640
:3:rocvirtual.cpp :603 : 10625715685 us: ! arg1: = ptr:0x7fcbcc400000 obj:[0x7fcbcc400000-0x7fcbcc7d0900] threadId : 7fcbcd562640
:3:rocvirtual.cpp :2560: 10625715689 us: [7fcbcd562640]! ShaderName : _Z13vector_squareIfEvPT_S1_m
:4:rocvirtual.cpp :753 : 10625715723 us: [7fcbcd562640] HWq=0x7fcbcd5f5000, Dispatch Header = 0x502 (type=2, barrier=1, acquire=2, release=0), setup=3, grid=[131072, 1, 1], workgroup=[256, 1, 1], private_seg_size=0, group_seg_size=0, kernel_obj=0x7fcbc0408840, kernarg_address=0x7fcbcc980000, completion_signal=0x0
:4:commandqueue.cpp :176 : 10625715741 us: command (CopyDeviceToHost) is submitted: 0x214adc0
:4:rocvirtual.hpp :200 : 10625717359 us: [7fcbcd562640]! WaitCurret completion_signal=0x7fcbcea46a80
:4:rocvirtual.hpp :228 : 10625717373 us: [7fcbcd562640]! WaitNext completion_signal=0x7fcbcea46a00
:4:rocvirtual.cpp :871 : 10625717384 us: [7fcbcd562640] HWq=0x7fcbcd5f5000, BarrierAND Header = 0x1503 (type=3, barrier=1, acquire=2, release=2), dep_signal=[0x0, 0x0, 0x0, 0x0, 0x0], completion_signal=0x7fcbcea46a80
:4:rocvirtual.hpp :200 : 10625717406 us: [7fcbcd562640]! WaitCurret completion_signal=0x7fcbcea46a00
:4:rocvirtual.hpp :228 : 10625717414 us: [7fcbcd562640]! WaitNext completion_signal=0x7fcbcea46980
:4:rocblit.cpp :670 : 10625717420 us: [7fcbcd562640]! HSA Asycn Copy wait_event=0x0, completion_signal=0x7fcbcea46a00
:4:rocvirtual.hpp :200 : 10625718851 us: [7fcbcd562640]! WaitCurret completion_signal=0x7fcbcea46a00
:4:rocvirtual.cpp :449 : 10625718897 us: [7fcbcd562640]! Host wait on completion_signal=0x7fcbcea46a00
:4:commandqueue.cpp :176 : 10625719926 us: command (InternalMarker) is submitted: 0x2157db0
:4:rocvirtual.hpp :200 : 10625719960 us: [7fcbcd562640]! WaitCurret completion_signal=0x7fcbcea46a00
:4:command.cpp :152 : 10625719969 us: Command 0x215f780 complete
:4:command.cpp :152 : 10625719978 us: Command 0x214adc0 complete
:4:command.cpp :152 : 10625719981 us: Command 0x2157db0 complete
:4:command.cpp :236 : 10625719984 us: event 0x214adc0 wait completed
:3:hip_memory.cpp :432 : 10625720011 us: 30526: [7fcbcddfb540] hipMemcpy: Returned hipSuccess : : duration: 4360 us
info: check result
PASSED!
from rocm-build.
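As a sanity check, the numbers in the square log are consistent with each other. The sketch below just redoes the arithmetic; the 4-byte element size is an assumption based on the mangled kernel name `_Z13vector_squareIfEvPT_S1_m`, which looks like a `float` instantiation.

```python
BYTES_PER_FLOAT = 4  # assumption: the kernel works on 32-bit floats

buffer_bytes = 0x3D0900                 # size from "Allocate hsa device memory"
elements = buffer_bytes // BYTES_PER_FLOAT

blocks, threads_per_block = 512, 256    # from __hipPushCallConfiguration
work_items = blocks * threads_per_block # should equal grid=[131072, 1, 1]

two_buffers_mib = 2 * buffer_bytes / 2**20  # two buffers per side

print(buffer_bytes, elements, work_items, round(two_buffers_mib, 2))
```

The buffer size matches the 4000000 bytes passed to hipMalloc, the launch configuration matches the dispatched grid of 131072 work-items, and two such buffers add up to the 7.63 MB reported for host and device memory.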
It has been 2 months since the last post, so I will close this issue. Please reopen it if there are any updates.
from rocm-build.
@delijati I've been trying to get HIP running on a gfx902, but have had no luck. What version of ROCm did you use? Did you have to build from source?
from rocm-build.
@delijati In the last 2 weeks I have spent significant time trying to get PyTorch on ROCm running on gfx902. I experimented with different Linux distributions (mostly Ubuntu) and different ROCm / AMDGPU versions, with no luck. The few combinations I got working generally end with this error:
"HIP error: shared object initialization failed."
I am now declaring it a failure and an impossibility. For ML/DL testing I am getting a video card that does not use ROCm, as official ROCm support covers only about 6 cards today and no APUs.
from rocm-build.