Comments (3)
This is happening in the hipModuleLoad call in MIOpen, not when allocating the device buffers for the network. It looks like HIP reports hipErrorOutOfMemory whenever hsa_memory_allocate fails, and that call can fail for reasons other than running out of memory. Try running with ltrace -e 'hsa_*' ./alexnet to get the error status for the hsa functions.
from miopen.
Thanks, here is the relevant portion of the output.
hsa_memory_allocate(0x29fc8d0, 0, 0x7ffebf1623f8, 0xfffffffffffab834) is returning 4097 (= 0x1001, HSA_STATUS_ERROR_INVALID_ARGUMENT).
Here, the size argument to hsa_memory_allocate is 0. This happens because the input hsaco file is empty, which seems definitely wrong.
Do you have any idea why this is happening?
[INFO] Conv(11x11,pad=2,s=4) (128,3,224,224)->(128,64,55,55): 7.885 ms
runcl -DMLO_NRN_GROUP_SZ0=256 -DMLO_NRN_GROUP_SZ1=1 -DMLO_NRN_OP_ID=3 -DMLO_N_PIXS_OFF=0 -DMLO_MAP_SZ=24780800 -DMLO_MAP_SZ_ALIGNED=6195200 -DMLO_READ_UNIT=4 src/Kernels/MIOpenNeuron.cl -k MIOpenNeuronFwd -dumpilisa -r 10 if#0: if#0: if#0: iv#0 6195200,1,1/256,1,1
key: miopenActivationForward,64x55x55x3x3x64x55x55x128xNCHWxFP32x1
Kernel filename: MIOpenNeuron.cl
libhip_hcc.so->hsa_agent_iterate_regions(0x28aa0b0, 0x7f68c5a3d160, 0x7ffebf162418, 0 <unfinished ...>
libhip_hcc.so->hsa_region_get_info(0x28aa4c0, 0, 0x7ffebf1623a4, 1) = 0
libhip_hcc.so->hsa_region_get_info(0x28aa4c0, 1, 0x7ffebf1623a0, 0xfffffffffffab814) = 0
libhip_hcc.so->hsa_region_get_info(0x28aa520, 0, 0x7ffebf1623a4, 0xfffffffffffab834) = 0
libhip_hcc.so->hsa_region_get_info(0x29fc8d0, 0, 0x7ffebf1623a4, 0xfffffffffffab814) = 0
libhip_hcc.so->hsa_region_get_info(0x29fc8d0, 1, 0x7ffebf1623a0, 0xfffffffffffab814) = 0
libhip_hcc.so->hsa_region_get_info(0x29fc950, 0, 0x7ffebf1623a4, 0xfffffffffffab834) = 0
libhip_hcc.so->hsa_region_get_info(0x29fc950, 1, 0x7ffebf1623a0, 0xfffffffffffab814) = 0
<... hsa_agent_iterate_regions resumed> ) = 0
libhip_hcc.so->hsa_memory_allocate(0x29fc8d0, 0, 0x7ffebf1623f8, 0xfffffffffffab834) = 4097
libhip_hcc.so->hsa_executable_destroy(0, 0, 0x29205e8, 0) = 4113
libhip_hcc.so->hsa_code_object_destroy(0, 0, 4113, 0) = 4112
libhip_hcc.so->hsa_memory_free(0, 0, 4112, 0) = 0
MIOpen Error: /home/masa/MIOpen/src/hipoc/hipoc_program.cpp:96: Failed creating module hipErrorOutOfMemory
error: 'StatusUnknownError '(7) at ./layers.hpp:277
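The decimal return values in the trace are easier to read as hsa_status_t names. Below is a minimal sketch of a decoder. The table is partial and transcribed from my reading of hsa.h, so treat the names as an assumption and check them against your installed header; decode_hsa_status is a hypothetical helper, not part of HIP or MIOpen.

```python
# Partial table of hsa_status_t values (assumption: transcribed from hsa.h,
# verify against the header shipped with your ROCm install).
HSA_STATUS_NAMES = {
    0x0: "HSA_STATUS_SUCCESS",
    0x1000: "HSA_STATUS_ERROR",
    0x1001: "HSA_STATUS_ERROR_INVALID_ARGUMENT",
    0x1010: "HSA_STATUS_ERROR_INVALID_CODE_OBJECT",
    0x1011: "HSA_STATUS_ERROR_INVALID_EXECUTABLE",
}


def decode_hsa_status(code: int) -> str:
    """Format an ltrace return value as 'decimal = hex -> name'."""
    name = HSA_STATUS_NAMES.get(code, "unknown (not in this partial table)")
    return f"{code} = {code:#x} -> {name}"


if __name__ == "__main__":
    # Return values taken from the ltrace output above.
    for value in (4097, 4113, 4112, 0):
        print(decode_hsa_status(value))
```

If the table is right, only the 4097 from hsa_memory_allocate is the real failure. The 4113 and 4112 come from the destroy calls that receive 0 as their handle, which would be consistent with cleanup running on objects that were never created.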
Ah, after I deleted the kernel caches in .cache/miopen, it worked.
I can even run the VGG-19 and ResNet-101 benchmarks. This is great.
Just saying, raising a hipErrorOutOfMemory error when hsa_memory_allocate can fail for other reasons is confusing...
Thanks!
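For anyone hitting the same error, it may be worth checking the cache for zero-byte files before deleting it wholesale, since an empty hsaco was the trigger here. A minimal sketch, assuming the cache lives under ~/.cache/miopen (the thread only says .cache/miopen) and that a zero-byte file is a sign of a bad entry; find_empty_files is a hypothetical helper. It only lists files and deletes nothing.

```python
from pathlib import Path
from typing import List


def find_empty_files(root: Path) -> List[Path]:
    """Return all zero-byte regular files under root, sorted by path."""
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.stat().st_size == 0
    )


if __name__ == "__main__":
    # Assumed cache location; adjust if your setup stores it elsewhere.
    cache_dir = Path.home() / ".cache" / "miopen"
    for path in find_empty_files(cache_dir):
        print(path)
```

If this prints anything, removing just those entries (or the whole cache directory, as above) should force the kernels to be rebuilt on the next run.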