Comments (10)
@Hex000 Thanks for contacting us. While I go and run your test code and understand the issue, can you please tell me about your system configuration?
- Which distro?
- Output of uname -a
- Output of /opt/rocm/bin/hipconfig
How was MIOpen installed? Was it built from source?
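The details requested above can be gathered in one pass. The following is a minimal sketch, not an official ROCm or MIOpen tool: the function name `collect_sysinfo` is hypothetical, and it assumes the distro exposes /etc/os-release and that hipconfig lives at the path quoted above.

```shell
#!/bin/sh
# Hypothetical helper: prints the distro, kernel, and hipconfig details in one report.
collect_sysinfo() {
    echo "== distro =="
    # /etc/os-release exists on most modern distros; report its absence otherwise.
    if [ -r /etc/os-release ]; then
        cat /etc/os-release
    else
        echo "(no /etc/os-release found)"
    fi

    echo "== uname -a =="
    uname -a

    echo "== hipconfig =="
    # Path taken from the comment above; skipped if ROCm is not installed there.
    if [ -x /opt/rocm/bin/hipconfig ]; then
        /opt/rocm/bin/hipconfig
    else
        echo "(/opt/rocm/bin/hipconfig not found)"
    fi
}

collect_sysinfo
```

Redirecting the output to a file makes it easy to attach to the issue.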
from miopen.
@Hex000 I am fairly certain something is incorrect with your environment. On my Vega GPU I see times quite similar to the NVIDIA 1080Ti for the particular layer you mentioned above. Can you please let me know the exact model of your GPU as well?
12 13.0 ns (76852 ips) conv {28, 28, 128} -> {28, 28, 128} by filter {3, 3} with stride {1, 1} algo 3 expected 11.9 ns 0.0 Mb
from miopen.
@dagamayank, thank you!
This is good news; I need to check my configuration.
The GPU is an AMD RX Vega 64.
hipconfig output:
https://gist.github.com/hex000/2fdc37178854038cf7600133acebf92d
MIOpen installation info:
https://gist.github.com/hex000/12462cd6e0a75796b9beb24b6a798f4b
It is the official binary distribution from http://repo.radeon.com/rocm. Is that the best source, or would it be better to build from source?
Driver info:
https://gist.github.com/hex000/1c311c4a6fc5545b5aba4231bbf9300b
It's integrated into the kernel. Is that the best approach?
from miopen.
@dagamayank We tried running with the driver provided with kernel 4.11.0-kfd-compute-rocm-rel-1.6-148 from http://repo.radeon.com/rocm and got around 120 ns.
With the default Ubuntu kernel, 4.4.0-62-generic, and the AMDGPU-PRO 17.30 driver we got 19 ns, which is much better but still slower than your result.
from miopen.
@Hex000 @prostowww Please consider the performance that I shared as "dev-preview".
MIOpen requires some improvements in the base software stack, which are planned for release within the next two weeks as part of ROCm 1.6.4. Your systems are currently configured with ROCm 1.6.3 (the last public release), which may be the cause of the poor performance you are seeing.
I highly recommend not mixing the AMDGPU-PRO and ROCm software stacks on the same system.
As an experiment, can you please try setting this environment variable and check the performance again?
export MIOPEN_DEBUG_AMD_ROCM_PRECOMPILED_BINARIES=0
I will ping you once ROCm 1.6.4 is public.
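If you would rather not export the variable for the whole session, it can be scoped to a single run. This is a sketch under assumptions: `run_without_precompiled` is a hypothetical wrapper name, and `./conv_bench` in the comment is a placeholder for your own benchmark binary, not something from this thread.

```shell
#!/bin/sh
# Hypothetical wrapper: runs the given command with the variable set
# for that one process only, leaving the calling shell untouched.
run_without_precompiled() {
    env MIOPEN_DEBUG_AMD_ROCM_PRECOMPILED_BINARIES=0 "$@"
}

# Example (./conv_bench is a placeholder for your own benchmark binary):
#   run_without_precompiled ./conv_bench
```

Scoping the variable this way makes it easy to compare runs with and without the setting from the same shell.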
from miopen.
@Hex000 Just curious: do you have both CUDA and ROCm installed on the same machine? Have you experienced any issues?
from miopen.
Dear friends, thank you for the answers!
No, CUDA and ROCm are on different machines, so this was not tested.
One more complaint from me as a user: warping nine-dimensional space for 30 minutes is too tough:
[02] Testing conv {56, 56, 64} -> {56, 56, 64} by filter {3, 3} with stride {1, 1}
Searching the best solution in the 9 dim space. Please, be patient it may take few minutes.
Runs left : 13714, min time so far : 3.7838, curr time : 3.96322, 8, 16, 8, 16, 1, 2, 5, 3, 1
from miopen.
Just wanted to let you know that we released ROCm 1.6.4 last week along with MIOpen v1.1.4, in case you are not already aware. Can you please try updating your systems and run the performance experiment again?
For performance measurements I would first like you to set an additional parameter for now.
sudo -s
echo 1 > /sys/module/amdkfd/parameters/noretry
exit
The above is currently a workaround for an issue with page migration in ROCm. This will be fixed soon.
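Before re-running the benchmark, it may be worth confirming that the setting took effect. The sketch below assumes the parameter file is readable by your user; `check_noretry` is a hypothetical helper name, and the optional path argument exists only so the function can be tried against a scratch file.

```shell
#!/bin/sh
# Hypothetical helper: prints the current noretry value and succeeds only if it is 1.
# $1 (optional): parameter file to read; defaults to the path used above.
check_noretry() {
    param_file="${1:-/sys/module/amdkfd/parameters/noretry}"
    if [ ! -r "$param_file" ]; then
        echo "cannot read $param_file" >&2
        return 2
    fi
    value="$(cat "$param_file")"
    echo "noretry=$value"
    [ "$value" = "1" ]
}
```

Calling `check_noretry` with no arguments reads the real parameter; a non-zero exit status means the value is not 1 or the file could not be read.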
from miopen.
dagamayank, thank you!
Setting noretry gives about a 10% improvement.
from miopen.
@Hex000 FYI - there is a new release of both ROCm and MIOpen, so you may want to benchmark again. For now, I am closing this issue. Please create a new issue if you have questions or notice discrepancies.
from miopen.