Comments (6)
Hi @pratishthavrm,
This seems to be Keras issue #9394.
from deephar.
Thanks a lot.
Hi,
While running train_mpii_singleperson.py, I get the following error:
W tensorflow/core/framework/op_kernel.cc:1431] OP_REQUIRES failed at constant_op.cc:170 : Resource exhausted: OOM when allocating tensor with shape[20,576,32,32] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
File "/home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
return fn(*args)
File "/home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1320, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1408, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[20,576,32,32] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node SepConv2/batch_normalization_16/FusedBatchNorm}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[loss/concatenate_12_loss/Mean_3/_3811]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/#####/.vscode/extensions/ms-python.python-2019.2.5433/pythonFiles/ptvsd_launcher.py", line 45, in
main(ptvsdArgs)
File "/home/#####/.vscode/extensions/ms-python.python-2019.2.5433/pythonFiles/lib/python/ptvsd/main.py", line 357, in main
run()
File "/home/#####/.vscode/extensions/ms-python.python-2019.2.5433/pythonFiles/lib/python/ptvsd/main.py", line 257, in run_file
runpy.run_path(target, run_name='main')
File "/usr/lib/python3.5/runpy.py", line 254, in run_path
pkg_name=pkg_name, script_name=fname)
File "/usr/lib/python3.5/runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/#####/project_deeper/exp/mpii/train_mpii_singleperson.py", line 97, in
initial_epoch=0)
File "/home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/keras/engine/training.py", line 2244, in fit_generator
class_weight=class_weight)
File "/home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/keras/engine/training.py", line 1890, in train_on_batch
outputs = self.train_function(ins)
File "/home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2475, in call
**self.session_kwargs)
File "/home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 930, in run
run_metadata_ptr)
File "/home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1153, in _run
feed_dict_tensor, options, run_metadata)
File "/home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1329, in _do_run
run_metadata)
File "/home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1349, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[20,576,32,32] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node SepConv2/batch_normalization_16/FusedBatchNorm (defined at home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py:1799) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[loss/concatenate_12_loss/Mean_3/_3811]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
Errors may have originated from an input operation.
Input Source operations connected to node SepConv2/batch_normalization_16/FusedBatchNorm:
SepConv2/separable_conv2d_6/separable_conv2d (defined at home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py:3478)
batch_normalization_16/beta/read (defined at home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py:392)
SepConv2/batch_normalization_16/Const (defined at home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py:1788)
Original stack trace for 'SepConv2/batch_normalization_16/FusedBatchNorm':
File "home/#####/.vscode/extensions/ms-python.python-2019.2.5433/pythonFiles/ptvsd_launcher.py", line 45, in
main(ptvsdArgs)
File "home/#####/.vscode/extensions/ms-python.python-2019.2.5433/pythonFiles/lib/python/ptvsd/main.py", line 357, in main
run()
File "home/#####/.vscode/extensions/ms-python.python-2019.2.5433/pythonFiles/lib/python/ptvsd/main.py", line 257, in run_file
runpy.run_path(target, run_name='main')
File "usr/lib/python3.5/runpy.py", line 254, in run_path
pkg_name=pkg_name, script_name=fname)
File "usr/lib/python3.5/runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "usr/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "home/#####/project_deeper/exp/mpii/train_mpii_singleperson.py", line 51, in
num_blocks=num_blocks, num_context_per_joint=2, ksize=(5, 5))
File "home/#####/project_deeper/deephar/models/reception.py", line 285, in build
x = build_sconv_block(x, name='SepConv%d' % (bidx + 1), ksize=ksize)
File "home/#####/project_deeper/deephar/models/reception.py", line 142, in build_sconv_block
return model(inp)
File "home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/keras/engine/topology.py", line 617, in call
output = self.call(inputs, **kwargs)
File "home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/keras/engine/topology.py", line 2081, in call
output_tensors, _, _ = self.run_internal_graph(inputs, masks)
File "home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/keras/engine/topology.py", line 2232, in run_internal_graph
output_tensors = _to_list(layer.call(computed_tensor, **kwargs))
File "home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/keras/layers/normalization.py", line 181, in call
epsilon=self.epsilon)
File "home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 1824, in normalize_batch_in_training
epsilon=epsilon)
File "home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 1799, in _fused_normalize_batch_in_training
data_format=tf_data_format)
File "home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/tensorflow/python/ops/nn_impl.py", line 1206, in fused_batch_norm
name=name)
File "home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 3946, in _fused_batch_norm
name=name)
File "home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 800, in _apply_op_helper
op_def=op_def)
File "home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3479, in create_op
op_def=op_def)
File "home/#####/virtualenvironment/project_2/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1961, in init
self._traceback = tf_stack.extract_stack()
Please tell me how to fix this.
Please note in your log:
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM
You need a bigger GPU or smaller batches for training the model.
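To put the log message in perspective, here is a rough sketch of how large the single tensor that failed to allocate is. It assumes that "type float" in the log means 32-bit floats (4 bytes per element) and that the leading dimension, 20, is the batch size; `tensor_bytes` is a hypothetical helper written for this illustration, not part of deephar or TensorFlow.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Return the memory footprint of a dense tensor in bytes.

    bytes_per_element=4 assumes float32, which is an assumption about
    what "type float" means in the OOM message.
    """
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


# Shape taken from the OOM message in the log above.
oom_shape = (20, 576, 32, 32)
print(tensor_bytes(oom_shape) / 2**20, "MiB")

# Assuming the first dimension is the batch size, halving it halves
# the footprint of this activation tensor.
half_batch_shape = (10, 576, 32, 32)
print(tensor_bytes(half_batch_shape) / 2**20, "MiB")
```

This is only one activation tensor. The network holds many such tensors at once during training, so the total requirement is much larger, but each of them scales with the batch size in the same way.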
Thanks a lot.
But what are the minimum system requirements for MPII training?
There is no single answer to that question.
By reducing the batch size you should be able to train it even on small GPUs.
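One way to find a batch size that fits is to start from the current value and halve it each time an out-of-memory error occurs. The sketch below is a generic pattern, not code from deephar: `find_workable_batch_size` and `fake_train_step` are hypothetical names, and the fake step simply simulates a GPU that can only handle small batches. With TensorFlow you would presumably pass `tf.errors.ResourceExhaustedError` as the error type to catch, and you may need to rebuild the model or session between attempts.

```python
def find_workable_batch_size(train_step, start, minimum=1,
                             oom_errors=(MemoryError,)):
    """Halve the batch size until train_step(batch_size) succeeds.

    train_step is a caller-supplied callable (hypothetical here) that
    runs one training step and raises one of oom_errors when the
    device runs out of memory.
    """
    batch_size = start
    while batch_size >= minimum:
        try:
            train_step(batch_size)
            return batch_size
        except oom_errors:
            # Out of memory: try again with half as many samples.
            batch_size //= 2
    raise RuntimeError("no batch size >= %d fits in memory" % minimum)


def fake_train_step(batch_size):
    """Stand-in for a real training step: pretends batches above 6 do not fit."""
    if batch_size > 6:
        raise MemoryError("simulated OOM")


# Starting from 20 (the leading dimension in the log), the attempts
# are 20, 10, then 5, which is the first size the fake step accepts.
print(find_workable_batch_size(fake_train_step, start=20))
```

Note that a smaller batch size can change training dynamics, so the learning rate or number of epochs may need adjusting.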