automl's Introduction

Brain AutoML

This repository contains a list of AutoML-related models and libraries.

automl's People

Contributors

88d52bdba0366127fffca9dfa93895, bessszilard, brettkoonce, crazydonkey200, damenianch, ely-s, etam103, fsx950223, glenvorel, jason-leeee, jcburnel, juzhiyuan, kartik4949, kevits, leondgarse, lucassloan, mingxingtan, mmaithani, nihui, nikzak, rbournhonesque, sachinspanicker, sadransh, samjith888, stepfenshawn, steve-2040, wrannaman, wuhy08, xiangning-chen, yongchanghao

automl's Issues

'NoneType' object has no attribute 'value'

Hello, I'm using TensorFlow 2.1.0 and I get the following error:

Traceback (most recent call last):
File "main.py", line 383, in <module>
tf.app.run(main)
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "main.py", line 250, in main
FLAGS.train_batch_size))
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3035, in train
rendezvous.raise_errors()
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 143, in raise_errors
six.reraise(typ, value, traceback)
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/six.py", line 703, in reraise
raise value
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3030, in train
saving_listeners=saving_listeners)
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 374, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1164, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1191, in _train_model_default
input_fn, ModeKeys.TRAIN))
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1028, in _get_features_and_labels_from_input_fn
self._call_input_fn(input_fn, mode))
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 2987, in _call_input_fn
return input_fn(**kwargs)
File "/data01/wens/workspace/project/automl/efficientdet/dataloader.py", line 358, in __call__
dataset = dataset.map(_dataset_parser, num_parallel_calls=64)
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 2308, in map
self, map_func, num_parallel_calls, preserve_cardinality=False))
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 3926, in __init__
use_legacy_function=use_legacy_function)
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 3147, in __init__
self._function = wrapper_fn._get_concrete_function_internal()
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2395, in _get_concrete_function_internal
*args, **kwargs)
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2389, in _get_concrete_function_internal_garbage_collected
graph_function, _, _ = self._maybe_define_function(args, kwargs)
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2703, in _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2593, in _create_graph_function
capture_by_value=self._capture_by_value),
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_core/python/framework/func_graph.py", line 978, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 3140, in wrapper_fn
ret = _wrapper_helper(*args)
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 3082, in _wrapper_helper
ret = autograph.tf_convert(func, ag_ctx)(*nested_args)
File "/data01/wens/venv/tf20/lib/python3.6/site-packages/tensorflow_core/python/autograph/impl/api.py", line 237, in wrapper
raise e.ag_error_metadata.to_exception(e)
AttributeError: in converted code:

/data01/wens/workspace/project/automl/efficientdet/dataloader.py:320 _dataset_parser
    num_positives) = anchor_labeler.label_anchors(boxes, classes)
/data01/wens/workspace/project/automl/efficientdet/anchors.py:379 label_anchors
    anchor_box_list, gt_box_list, gt_labels)
/data01/wens/workspace/project/automl/efficientdet/object_detection/target_assigner.py:141 assign
    num_gt_boxes = groundtruth_boxes.num_boxes_static()
/data01/wens/workspace/project/automl/efficientdet/object_detection/box_list.py:75 num_boxes_static
    return self.data['boxes'].get_shape()[0].value

AttributeError: 'NoneType' object has no attribute 'value'

Actually, self.data['boxes'] has shape (None, 4).
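
One likely explanation, offered as an assumption rather than a confirmed diagnosis: with TF2 behavior enabled, indexing a TensorShape yields a plain int or None instead of a TF1-style Dimension object, so .value fails when the leading dimension is unknown. Below is a minimal sketch of a version-tolerant accessor. The helper name static_dim and the stand-in class FakeDimension are hypothetical and not part of the repository.

```python
def static_dim(dim):
    """Return the static size of a dimension as an int, or None if unknown.

    Works for TF1-style Dimension objects (which expose `.value`) and for
    the plain int / None that TF2-style shapes return when indexed.
    """
    return getattr(dim, "value", dim)


class FakeDimension:
    """Stand-in for a TF1 Dimension, used only to exercise the helper."""

    def __init__(self, value):
        self.value = value


print(static_dim(FakeDimension(4)))  # TF1-style dimension
print(static_dim(None))              # TF2-style unknown dimension
print(static_dim(4))                 # TF2-style known dimension
```

With such a helper, the failing line in box_list.py could read return static_dim(self.data['boxes'].get_shape()[0]). For a tensor of shape (None, 4) this returns None, so callers would still need to handle an unknown box count.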

Display loss?

Is there a way to display the loss when training on GPUs?

Currently, all I see are the steps:

I0326 05:18:42.526468 139907207948160 tpu_estimator.py:2308] examples/sec: 14.628
INFO:tensorflow:global_step/sec: 0.922679
I0326 05:18:43.609837 139907207948160 tpu_estimator.py:2307] global_step/sec: 0.922679
INFO:tensorflow:examples/sec: 14.7629
I0326 05:18:43.610084 139907207948160 tpu_estimator.py:2308] examples/sec: 14.7629
INFO:tensorflow:global_step/sec: 0.96926

How to do prediction in video

Hi, I saw how to run detection on a single image using model_inspect.py. How can I run detection on a video or on real-time camera input, that is, frame by frame?
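
A minimal sketch of a frame-by-frame loop, assuming you already have some callable that runs the detector on one image. The names detect_on_frames, read_frames, and detect_fn are hypothetical; the repository's own inference entry point is not shown here. read_frames only assumes an object whose read() returns an (ok, frame) pair, which is the convention used by OpenCV's cv2.VideoCapture.

```python
def read_frames(capture):
    """Yield frames from an object with a cv2.VideoCapture-style read()."""
    while True:
        ok, frame = capture.read()
        if not ok:
            break  # end of stream or camera failure
        yield frame


def detect_on_frames(frames, detect_fn, every_nth=1):
    """Yield (frame_index, detections) for every n-th frame of an iterable."""
    if every_nth < 1:
        raise ValueError("every_nth must be >= 1")
    for index, frame in enumerate(frames):
        if index % every_nth == 0:
            yield index, detect_fn(frame)


if __name__ == "__main__":
    # Placeholder frames and detector, used only to show the control flow.
    fake_frames = ["frame0", "frame1", "frame2"]
    for index, detections in detect_on_frames(fake_frames, detect_fn=len, every_nth=2):
        print(index, detections)
```

Skipping frames with every_nth is one simple way to keep up with a live camera when the detector is slower than the frame rate.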

CudNN failed to initialize when calling model_inspect.py

I get an obscure cuDNN error when running python3 model_inspect.py --runmode infer --model_name efficientdet-d1 --ckpt_path=path/to/efficientdet-d1 --input_image=path/to/image.png --output_image_dir path/to/outputdir

My CUDA version is 10.1 and my TensorFlow version is 2.1.0.

2020-03-24 07:35:36.937706: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-03-24 07:35:38.730679: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-03-24 07:35:38.733928: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
Traceback (most recent call last):
  File "/home/jupyter-cyril/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1367, in _do_call
    return fn(*args)
  File "/home/jupyter-cyril/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1352, in _run_fn
    target_list, run_metadata)
  File "/home/jupyter-cyril/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1445, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
  (0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[{{node efficientnet-b1/model/stem/conv2d/Conv2D}}]]
	 [[strided_slice_10/_1837]]
  (1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[{{node efficientnet-b1/model/stem/conv2d/Conv2D}}]]
0 successful operations.
0 derived errors ignored.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "model_inspect.py", line 333, in <module>
    tf.app.run(main)
  File "/home/jupyter-cyril/venv/lib/python3.6/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/jupyter-cyril/venv/lib/python3.6/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/jupyter-cyril/venv/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "model_inspect.py", line 328, in main
    inspector.run_model(FLAGS.runmode, FLAGS.threads)
  File "model_inspect.py", line 306, in run_model
    **config_dict)
  File "model_inspect.py", line 186, in inference_single_image
    driver.inference(image_image_path, output_dir, **kwargs)
  File "/home/jupyter-cyril/git_projects/automl/efficientdet/inference.py", line 289, in inference
    outputs_np = sess.run(detections_batch)
  File "/home/jupyter-cyril/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 960, in run
    run_metadata_ptr)
  File "/home/jupyter-cyril/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1183, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/jupyter-cyril/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1361, in _do_run
    run_metadata)
  File "/home/jupyter-cyril/venv/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1386, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
  (0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node efficientnet-b1/model/stem/conv2d/Conv2D (defined at /home/jupyter-cyril/git_projects/automl/efficientdet/backbone/efficientnet_model.py:637) ]]
	 [[strided_slice_10/_1837]]
  (1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node efficientnet-b1/model/stem/conv2d/Conv2D (defined at /home/jupyter-cyril/git_projects/automl/efficientdet/backbone/efficientnet_model.py:637) ]]
0 successful operations.
0 derived errors ignored.
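
A commonly reported cause of CUDNN_STATUS_INTERNAL_ERROR is that GPU memory is already exhausted or fully pre-allocated when cuDNN tries to create its handle. Whether that applies here is an assumption. One low-risk thing to try is asking TensorFlow to allocate GPU memory on demand through the TF_FORCE_GPU_ALLOW_GROWTH environment variable. The helper name below is hypothetical.

```python
import os


def enable_gpu_memory_growth(env=os.environ):
    """Request on-demand GPU memory allocation via an environment variable.

    An existing setting is left untouched. Returns the effective value.
    """
    env.setdefault("TF_FORCE_GPU_ALLOW_GROWTH", "true")
    return env["TF_FORCE_GPU_ALLOW_GROWTH"]


# Call this before TensorFlow initializes the GPU; doing it before
# `import tensorflow` is the simplest way to guarantee that.
print(enable_gpu_memory_growth())
```

The same effect can be had without code changes by exporting the variable in the shell before launching model_inspect.py.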

What are "tensorflow.google" and "tensorflow_addons"?

When I run python main.py, I get the following errors:
(1) import tensorflow.google as tf
ModuleNotFoundError: No module named 'tensorflow.google'
(2) import tensorflow_addons as tfa
ModuleNotFoundError: No module named 'tensorflow_addons'
(3) import tf_slim as slim
ModuleNotFoundError: No module named 'tf_slim'
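
tensorflow_addons and tf_slim are distributed separately from TensorFlow; on PyPI they are published as tensorflow-addons and tf-slim. tensorflow.google does not appear to be part of the public TensorFlow package, so that import probably has to be changed rather than installed (this is an assumption based on the error, not something confirmed in this thread). A small sketch for checking which modules are missing before running main.py; the function name and the REQUIRED list are illustrative:

```python
import importlib.util


def missing_modules(names):
    """Return the subset of `names` that cannot be found for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # A missing parent package (for dotted names) raises ImportError;
            # a module without a usable spec raises ValueError.
            found = False
        if not found:
            missing.append(name)
    return missing


# The three imports that failed in the report above.
REQUIRED = ["tensorflow.google", "tensorflow_addons", "tf_slim"]
print(missing_modules(REQUIRED))
```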

Not found: Key efficientnet-b0/blocks_0/conv2d/kernel not found in checkpoint

Hi, thank you for your hard work and open sourcing the code!
When I tried training with a single GPU in Colab, I got the following error:
Not found: Key efficientnet-b0/blocks_0/conv2d/kernel not found in checkpoint

Tensorflow version: 2.1.0

Command:

!python main.py --training_file_pattern=/content/TF_records/coco_train* --model_dir=models --hparams="use_bfloat16=false" --use_tpu=False \
  --model_name='efficientdet-d0'

Entire error:

INFO:tensorflow:Restoring parameters from models/model.ckpt-0
I0322 22:27:58.383603 140056498542464 saver.py:1284] Restoring parameters from models/model.ckpt-0
2020-03-22 22:28:01.751377: W tensorflow/core/framework/op_kernel.cc:1655] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key efficientnet-b0/blocks_0/conv2d/kernel not found in checkpoint
INFO:tensorflow:training_loop marked as finished
I0322 22:28:01.784756 140056498542464 error_handling.py:108] training_loop marked as finished
WARNING:tensorflow:Reraising captured error
W0322 22:28:01.784965 140056498542464 error_handling.py:142] Reraising captured error
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1367, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1352, in _run_fn
    target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1445, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: 2 root error(s) found.
  (0) Not found: Key efficientnet-b0/blocks_0/conv2d/kernel not found in checkpoint
	 [[{{node save/RestoreV2}}]]
  (1) Not found: Key efficientnet-b0/blocks_0/conv2d/kernel not found in checkpoint
	 [[{{node save/RestoreV2}}]]
	 [[save/RestoreV2/_301]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 1290, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 960, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1183, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1361, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1386, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: 2 root error(s) found.
  (0) Not found: Key efficientnet-b0/blocks_0/conv2d/kernel not found in checkpoint
	 [[node save/RestoreV2 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py:1493) ]]
  (1) Not found: Key efficientnet-b0/blocks_0/conv2d/kernel not found in checkpoint
	 [[node save/RestoreV2 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py:1493) ]]
	 [[save/RestoreV2/_301]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'save/RestoreV2':
  File "main.py", line 378, in <module>
    tf.app.run(main)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "main.py", line 245, in main
    FLAGS.train_batch_size))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3030, in train
    saving_listeners=saving_listeners)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 374, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1164, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1198, in _train_model_default
    saving_listeners)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1493, in _train_with_estimator_spec
    log_step_count_steps=log_step_count_steps) as mon_sess:
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 604, in MonitoredTrainingSession
    stop_grace_period_secs=stop_grace_period_secs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1038, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 749, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1231, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1236, in _create_session
    return self._sess_creator.create_session()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 902, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 660, in create_session
    self._scaffold.finalize()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 243, in finalize
    self._saver.build()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 840, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 878, in _build
    build_restore=build_restore)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 502, in _build_internal
    restore_sequentially, reshape)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 381, in _AddShardedRestoreOps
    name="restore_shard"))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 328, in _AddRestoreOps
    restore_sequentially)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 575, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_io_ops.py", line 1506, in restore_v2
    name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 742, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3322, in _create_op_internal
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 1756, in __init__
    self._traceback = tf_stack.extract_stack()


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/py_checkpoint_reader.py", line 70, in get_tensor
    self, compat.as_bytes(tensor_str))
RuntimeError: Key _CHECKPOINTABLE_OBJECT_GRAPH not found in checkpoint

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 1300, in restore
    names_to_keys = object_graph_key_mapping(save_path)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 1618, in object_graph_key_mapping
    object_graph_string = reader.get_tensor(trackable.OBJECT_GRAPH_PROTO_KEY)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/py_checkpoint_reader.py", line 74, in get_tensor
    error_translator(e)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/py_checkpoint_reader.py", line 35, in error_translator
    raise errors_impl.NotFoundError(None, None, error_message)
tensorflow.python.framework.errors_impl.NotFoundError: Key _CHECKPOINTABLE_OBJECT_GRAPH not found in checkpoint

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 378, in <module>
    tf.app.run(main)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "main.py", line 245, in main
    FLAGS.train_batch_size))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3035, in train
    rendezvous.raise_errors()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 143, in raise_errors
    six.reraise(typ, value, traceback)
  File "/usr/local/lib/python3.6/dist-packages/six.py", line 693, in reraise
    raise value
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3030, in train
    saving_listeners=saving_listeners)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 374, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1164, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1198, in _train_model_default
    saving_listeners)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1493, in _train_with_estimator_spec
    log_step_count_steps=log_step_count_steps) as mon_sess:
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 604, in MonitoredTrainingSession
    stop_grace_period_secs=stop_grace_period_secs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1038, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 749, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1231, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1236, in _create_session
    return self._sess_creator.create_session()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 902, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 669, in create_session
    init_fn=self._scaffold.init_fn)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/session_manager.py", line 294, in prepare_session
    config=config)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/session_manager.py", line 224, in _restore_checkpoint
    saver.restore(sess, ckpt.model_checkpoint_path)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 1306, in restore
    err, "a Variable name or other graph key that is missing")
tensorflow.python.framework.errors_impl.NotFoundError: Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

2 root error(s) found.
  (0) Not found: Key efficientnet-b0/blocks_0/conv2d/kernel not found in checkpoint
	 [[node save/RestoreV2 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py:1493) ]]
  (1) Not found: Key efficientnet-b0/blocks_0/conv2d/kernel not found in checkpoint
	 [[node save/RestoreV2 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py:1493) ]]
	 [[save/RestoreV2/_301]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'save/RestoreV2':
  File "main.py", line 378, in <module>
    tf.app.run(main)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "main.py", line 245, in main
    FLAGS.train_batch_size))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3030, in train
    saving_listeners=saving_listeners)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 374, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1164, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1198, in _train_model_default
    saving_listeners)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1493, in _train_with_estimator_spec
    log_step_count_steps=log_step_count_steps) as mon_sess:
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 604, in MonitoredTrainingSession
    stop_grace_period_secs=stop_grace_period_secs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1038, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 749, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1231, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1236, in _create_session
    return self._sess_creator.create_session()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 902, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 660, in create_session
    self._scaffold.finalize()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 243, in finalize
    self._saver.build()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 840, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 878, in _build
    build_restore=build_restore)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 502, in _build_internal
    restore_sequentially, reshape)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 381, in _AddShardedRestoreOps
    name="restore_shard"))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 328, in _AddRestoreOps
    restore_sequentially)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 575, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_io_ops.py", line 1506, in restore_v2
    name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 742, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3322, in _create_op_internal
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 1756, in __init__
    self._traceback = tf_stack.extract_stack()
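
To narrow down a "Key ... not found in checkpoint" error, it can help to compare the variable names the graph expects with the names actually stored in the checkpoint being restored (here models/model.ckpt-0). tf.train.list_variables(path) returns (name, shape) pairs; the sketch below works on such pairs so it can be read without TensorFlow installed. The function name and the sample listing are hypothetical, not taken from the real checkpoint.

```python
def missing_checkpoint_keys(expected_names, checkpoint_vars):
    """Return expected variable names that are absent from a checkpoint.

    `checkpoint_vars` is an iterable of (name, shape) pairs, the format
    returned by tf.train.list_variables.
    """
    present = {name for name, _shape in checkpoint_vars}
    return sorted(name for name in set(expected_names) if name not in present)


# Hypothetical checkpoint listing, for illustration only.
checkpoint_vars = [
    ("global_step", []),
    ("some_other_scope/conv2d/kernel", [3, 3, 3, 32]),
]
expected = ["global_step", "efficientnet-b0/blocks_0/conv2d/kernel"]
print(missing_checkpoint_keys(expected, checkpoint_vars))
```

If the listing shows that the checkpoint in model_dir was written by a different model or an earlier run, pointing --model_dir at an empty directory is one thing to try.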

EfficientDet for EdgeTPU

Hi, I wonder if there is a tutorial to follow for adapting this to the Edge TPU? Thanks for any advice or information on whether this runs on Coral.

Thanks,
Alexander

bbox loss always zero

  1. Why is the bbox loss always zero (as shown in TensorBoard) while training on a single GPU? Is it a problem with matching anchors to ground truth?
  2. I also always get the following messages:
    Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
    'tuple' object has no attribute 'name'
    ERROR:tensorflow:Model diverged with loss = NaN.
    E0325 20:20:13.613487 140640882104064 basic_session_run_hooks.py:768] Model diverged with loss = NaN.
    WARNING:tensorflow:Reraising captured error
    W0325 20:20:14.657608 140640882104064 error_handling.py:142] Reraising captured error

Error when changing num_classes

To fit my task, I changed num_classes from 90 to 6 in hparams_config.py and model_inspect.py, but then the following error occurs:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [54] rhs shape= [810]
Errors may have originated from an input operation.
Input Source operations connected to node save/Assign_363:
class_net/class-predict/bias (defined at /home/work/zc/automl/efficientdet/efficientdet_arch.py:201)

It seems to be caused by some place where num_classes was not changed.
How can I fix it? And when I train a custom model, do I need to change any other code?
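
The two shapes in the error message can be reproduced with simple arithmetic, which hints at where the mismatch comes from. The sketch below assumes 9 anchors per location (3 scales x 3 aspect ratios); that number is an assumption used for illustration, and class_bias_size is a hypothetical helper, not a function from the repo.

```python
# Assumption: the class head predicts num_classes scores for each of
# 9 anchors per feature-map location (3 scales x 3 aspect ratios).
NUM_ANCHORS = 9


def class_bias_size(num_classes, num_anchors=NUM_ANCHORS):
    """Length of the class-predict bias vector under the assumption above."""
    return num_classes * num_anchors


new_head = class_bias_size(6)          # matches "lhs shape= [54]"
checkpoint_head = class_bias_size(90)  # matches "rhs shape= [810]"
print(new_head, checkpoint_head)  # 54 810
```

Under this reading, the graph was built with 6 classes while the checkpoint being restored still holds a 90-class head, so the class-predict variables cannot be assigned from that checkpoint as they are.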

How to export a saved model?

I found that output_node_names is a list. Should we concatenate the outputs into one tensor so that the model can be converted to a saved model? Is there a method for this?

['class_net/class-predict/BiasAdd', 'class_net/class-predict_1/BiasAdd', 'class_net/class-predict_2/BiasAdd',
'class_net/class-predict_3/BiasAdd', 'class_net/class-predict_4/BiasAdd', 'box_net/box-predict/BiasAdd', 'box_net/box-predict_1/BiasAdd',
'box_net/box-predict_2/BiasAdd', 'box_net/box-predict_3/BiasAdd', 'box_net/box-predict_4/BiasAdd']

BFlops is not the same as the results reported in the paper.

Hi there! Excellent work and thank you for sharing the code!
I have a question about the BFlops calculation. I used efficientdet/model_inspect.py to compute the EfficientDet-D5 FLOPs and got 270.77 BFlops, which is about twice the 135 BFlops reported in the paper.
Our results:
[image]
Reported in EfficientDet:
[image]

What's more, I noticed that with a unified unit (BFlops), the number of FLOPs of NAS-FPN is also half of that reported in your paper.
Reported in NAS-FPN:
[image]
Reported in EfficientDet:
[image]
It would be great if you could solve my confusion. Thanks a lot!
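
A factor of two between two FLOPs figures often comes from the counting convention: one tool counts a multiply-add as a single operation, another counts the multiply and the add separately. Whether that explains the numbers in this issue is an assumption, not something confirmed here. The sketch below uses a hypothetical convolution layer to show the two conventions side by side.

```python
def conv2d_macs(out_h, out_w, k_h, k_w, c_in, c_out):
    """Multiply-accumulate count of a dense 2D convolution (bias ignored)."""
    return out_h * out_w * k_h * k_w * c_in * c_out


def conv2d_flops(out_h, out_w, k_h, k_w, c_in, c_out):
    """Same layer, counting each multiply and each add as its own operation."""
    return 2 * conv2d_macs(out_h, out_w, k_h, k_w, c_in, c_out)


# Hypothetical layer: 3x3 kernel, 128 -> 128 channels, 64x64 output.
layer = (64, 64, 3, 3, 128, 128)
macs = conv2d_macs(*layer)
flops = conv2d_flops(*layer)
print(flops / macs)  # 2.0

# The figure from this issue, halved, lands close to the 135 reported in the paper.
measured_bflops = 270.77
halved = measured_bflops / 2
print(round(halved, 2))
```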

GPU train

When running this command:
python main.py --training_file_pattern=/home/hhh/Data/YOLO/VOCdevkit/VOC2007/tfrecords_/voc_train* --model_dir=/tmp/efficientnet/ --hparams="use_bfloat16=false" --use_tpu=False
the following error occurs:
ERROR:tensorflow.python.training.basic_session_run_hooks.NanLossDuringTrainingError: NaN loss during training.
What is the reason?

Conv Bn ReLU pattern

Hi,

I would like to ask about the ordering of convolution, batch normalization, and ReLU for feature fusion. The paper suggests the 'conv-bn-relu' pattern; however, in this repo the hyper-parameter config is set to the 'relu-conv-bn' pattern. Which one is preferred?

Best regards.

Can't create tfrecord

When I followed "How to convert train TFREcord for my own dataset #22", I got the following error:

I0326 16:23:55.507034 2528 create_coco_tfrecord.py:289] writing to output path: tfrecord/val
Windows fatal exception: access violation

Current thread 0x000009e0 (most recent call first):
File "C:\Anaconda3\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 84 in preread_check
File "C:\Anaconda3\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 122 in read
File "C:\Anaconda3\lib\json\__init__.py", line 293 in load
File "dataset/create_coco_tfrecord.py", line 260 in _load_images_info
File "dataset/create_coco_tfrecord.py", line 295 in _create_tf_record_from_coco_annotations
File "dataset/create_coco_tfrecord.py", line 363 in main
File "C:\Anaconda3\lib\site-packages\absl\app.py", line 250 in _run_main
File "C:\Anaconda3\lib\site-packages\absl\app.py", line 299 in run
File "C:\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 40 in run
File "dataset/create_coco_tfrecord.py", line 367 in

How can I solve this problem? Thanks a lot!

pretrained-model load ERROR!

OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key box_net/box-0-bn-3/beta/Momentum not found in checkpoint

The pretrained model fails to load with the error above. Please help, thanks a lot!

ValueError: Variable efficientnet-b6/stem/conv2d/kernel already exists, disallowed

While looping through images like this:

driver = inference.InferenceDriver(MODEL, ckpt_path, image_size=image_size)
for m in imag_list:
    driver.inference(m,
             img_out_dir,
             min_score_thresh=min_score_thresh,
             max_boxes_to_draw=max_boxes_to_draw,
             line_thickness=line_thickness)

I am getting the following error:

ValueError: Variable efficientnet-b6/stem/conv2d/kernel already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope?

Detections much worse than YOLOv3?

I tried EfficientDet-D0 to compare its performance with YOLOv3, but the results of YOLOv3 are far superior.

Results for ED-D0

[image]

Results for YOLOv3

[image]

And here is the original image

[image]

@mingxingtan is this expected?

EfficientDet Segmentation Model

Is the segmentation model also available as part of this repo? I could not find it. If not, when will it be uploaded?

No module named 'tensorflow.compat.v2'

Hello, I'm using TF 1.13.1 and I get the following error:
Traceback (most recent call last):
File "main.py", line 28, in
import det_model_fn
File "/data01/wens/workspace/project/automl/efficientdet/det_model_fn.py", line 28, in
import efficientdet_arch
File "/data01/wens/workspace/project/automl/efficientdet/efficientdet_arch.py", line 31, in
import utils
File "/data01/wens/workspace/project/automl/efficientdet/utils.py", line 26, in
import tensorflow.compat.v2 as tf2
ModuleNotFoundError: No module named 'tensorflow.compat.v2'

Which version of TF should I use? Thanks a lot.

Which version of tensorflow_addons to use?

I can only use pip to install tensorflow_addons versions greater than 0.5.0, but those versions of tensorflow_addons need TF 2.x.

Which version of tensorflow_addons is compatible with TF1.13?

Thanks.

Why is box_loss_weight set to 50?

Hello, I am confused about box_loss_weight for RetinaNet. Can somebody explain why box_loss_weight is 50.0 in the TPU code while it is set to 1.0 in mmdetection or Detectron?

Requested more than 0 entries, but params is empty

Hi,
I am using TensorFlow 1.14.0 and I get the following error while running in eval mode:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 383, in <module>
    tf.app.run(main)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/deepak/.local/lib/python3.7/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/deepak/.local/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "main.py", line 316, in main
    steps=FLAGS.eval_samples//FLAGS.eval_batch_size)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 2897, in evaluate
    rendezvous.raise_errors()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 131, in raise_errors
    six.reraise(typ, value, traceback)
  File "/usr/lib/python3/dist-packages/six.py", line 693, in reraise
    raise value
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 2892, in evaluate
    name=name)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 477, in evaluate
    name=name)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 519, in _actual_eval
    return _evaluate()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 508, in _evaluate
    output_dir=self.eval_dir(name))
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1609, in _evaluate_run
    config=self._session_config)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/evaluation.py", line 272, in _evaluate_once
    session.run(eval_ops, feed_dict)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/monitored_session.py", line 754, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/monitored_session.py", line 1252, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/monitored_session.py", line 1353, in run
    raise six.reraise(*original_exc_info)
  File "/usr/lib/python3/dist-packages/six.py", line 693, in reraise
    raise value
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/monitored_session.py", line 1338, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/monitored_session.py", line 1411, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/monitored_session.py", line 1169, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py", line 950, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py", line 1173, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py", line 1350, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py", line 1370, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Requested more than 0 entries, but params is empty.  Params shape: [0,1]
	 [[{{node parser/GatherNd_1}}]]
	 [[IteratorGetNext]]

Finetune from pretrained checkpoint

I am trying to finetune RefineDet with a pretrained checkpoint. I put the pretrained checkpoint in model_dir and started training, but some variables are missing when restoring. Does anyone know how to finetune from a pretrained EfficientDet checkpoint rather than only from the EfficientNet backbone?

Training

How do I train on a custom dataset?

Can you provide a tutorial on how to build a model?

Hello,

I just need to build an EfficientDet model (from its name) and experiment with it inside a Jupyter notebook. Can someone explain which functions I need to call to do so? I don't want to use the CLI.

Thanks

Train error

Hi, thank you for your hard work and open sourcing the code!
I tried training, but the following error occurred.

Command:
python main.py --training_file_pattern=tmp/train/train* --model_name=efficientdet-d0 --model_dir=train_model --hparams="use_bfloat16=false" --use_tpu=False

2020-03-25 11:18:29.528543: W tensorflow/core/common_runtime/bfc_allocator.cc:429] ****************************************************************************************************
2020-03-25 11:18:29.528632: W tensorflow/core/framework/op_kernel.cc:1655] OP_REQUIRES failed at cwise_ops_common.h:263 : Resource exhausted: OOM when allocating tensor with shape[64,1152,16,16] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
WARNING:tensorflow:Reraising captured error
W0325 11:18:30.869304 140049157998336 error_handling.py:142] Reraising captured error
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1367, in _do_call
return fn(*args)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1352, in _run_fn
target_list, run_metadata)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1445, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[64,1152,16,16] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node efficientnet-b0/model/blocks_13/Sigmoid}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

     [[strided_slice_2/_15357]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

(1) Resource exhausted: OOM when allocating tensor with shape[64,1152,16,16] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node efficientnet-b0/model/blocks_13/Sigmoid}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "main.py", line 385, in
tf.app.run(main)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "main.py", line 246, in main
FLAGS.train_batch_size))
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3035, in train
rendezvous.raise_errors()
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 143, in raise_errors
six.reraise(typ, value, traceback)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/six.py", line 703, in reraise
raise value
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3030, in train
saving_listeners=saving_listeners)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 374, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1164, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1198, in _train_model_default
saving_listeners)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1497, in _train_with_estimator_spec
_, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/training/monitored_session.py", line 778, in run
run_metadata=run_metadata)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/training/monitored_session.py", line 1283, in run
run_metadata=run_metadata)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/training/monitored_session.py", line 1384, in run
raise six.reraise(*original_exc_info)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/six.py", line 703, in reraise
raise value
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/training/monitored_session.py", line 1369, in run
return self._sess.run(*args, **kwargs)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/training/monitored_session.py", line 1442, in run
run_metadata=run_metadata)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/training/monitored_session.py", line 1200, in run
return self._sess.run(*args, **kwargs)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 960, in run
run_metadata_ptr)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1183, in _run
feed_dict_tensor, options, run_metadata)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1361, in _do_run
run_metadata)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1386, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[64,1152,16,16] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node efficientnet-b0/model/blocks_13/Sigmoid (defined at /home/ubuntu/project_1/automl/efficientdet/backbone/efficientnet_model.py:370) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

     [[strided_slice_2/_15357]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

(1) Resource exhausted: OOM when allocating tensor with shape[64,1152,16,16] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node efficientnet-b0/model/blocks_13/Sigmoid (defined at /home/ubuntu/project_1/automl/efficientdet/backbone/efficientnet_model.py:370) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

0 successful operations.
0 derived errors ignored.

Original stack trace for 'efficientnet-b0/model/blocks_13/Sigmoid':
File "main.py", line 385, in
tf.app.run(main)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "main.py", line 246, in main
FLAGS.train_batch_size))
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3030, in train
saving_listeners=saving_listeners)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 374, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1164, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1194, in _train_model_default
features, labels, ModeKeys.TRAIN, self.config)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 2857, in _call_model_fn
config)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1152, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3126, in _model_fn
features, labels, is_export_mode=is_export_mode)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 1663, in call_without_tpu
return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 1994, in _call_model_fn
estimator_spec = self._model_fn(features=features, **kwargs)
File "/home/ubuntu/project_1/automl/efficientdet/det_model_fn.py", line 567, in efficientdet_model_fn
model=efficientdet_arch.efficientdet)
File "/home/ubuntu/project_1/automl/efficientdet/det_model_fn.py", line 399, in _model_fn
cls_outputs, box_outputs = _model_outputs()
File "/home/ubuntu/project_1/automl/efficientdet/det_model_fn.py", line 389, in _model_outputs
return model(features, config=hparams_config.Config(params))
File "/home/ubuntu/project_1/automl/efficientdet/efficientdet_arch.py", line 552, in efficientdet
features = build_backbone(features, config)
File "/home/ubuntu/project_1/automl/efficientdet/efficientdet_arch.py", line 328, in build_backbone
override_params=override_params)
File "/home/ubuntu/project_1/automl/efficientdet/backbone/efficientnet_builder.py", line 324, in build_model_base
features = model(images, training=training, features_only=True)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/base_layer.py", line 778, in call
outputs = call_fn(cast_inputs, *args, **kwargs)
File "/home/ubuntu/project_1/automl/efficientdet/backbone/efficientnet_model.py", line 643, in call
for idx, block in enumerate(self._blocks):
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/autograph/operators/control_flow.py", line 339, in for_stmt
return py_for_stmt(iter, extra_test, body, get_state, set_state, init_vars)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/autograph/operators/control_flow.py", line 350, in _py_for_stmt
state = body(target, *state)
File "/home/ubuntu/project_1/automl/efficientdet/backbone/efficientnet_model.py", line 662, in call
outputs = block.call(
File "/home/ubuntu/project_1/automl/efficientdet/backbone/efficientnet_model.py", line 363, in call
if self._block_args.fused_conv:
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/autograph/operators/control_flow.py", line 920, in if_stmt
return _py_if_stmt(cond, body, orelse)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/autograph/operators/control_flow.py", line 1029, in _py_if_stmt
return body() if cond else orelse()
File "/home/ubuntu/project_1/automl/efficientdet/backbone/efficientnet_model.py", line 369, in call
if self._block_args.expand_ratio != 1:
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/autograph/operators/control_flow.py", line 920, in if_stmt
return _py_if_stmt(cond, body, orelse)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/autograph/operators/control_flow.py", line 1029, in _py_if_stmt
return body() if cond else orelse()
File "/home/ubuntu/project_1/automl/efficientdet/backbone/efficientnet_model.py", line 370, in call
x = self._relu_fn(self._bn0(expand_conv_fn(x), training=training))
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/ops/custom_gradient.py", line 256, in call
return self._d(self._f, a, k)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/ops/custom_gradient.py", line 212, in decorated
return _graph_mode_decorator(wrapped, args, kwargs)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/ops/custom_gradient.py", line 316, in _graph_mode_decorator
result, grad_fn = f(*args)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/ops/nn_impl.py", line 534, in swish
return features * math_ops.sigmoid(features), grad
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/ops/math_ops.py", line 3154, in sigmoid
return gen_math_ops.sigmoid(x, name=name)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/ops/gen_math_ops.py", line 8750, in sigmoid
"Sigmoid", x=x, name=name)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/framework/op_def_library.py", line 742, in _apply_op_helper
attrs=attr_protos, op_def=op_def)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 3322, in _create_op_internal
op_def=op_def)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 1756, in __init__
self._traceback = tf_stack.extract_stack()
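
The OOM message above names a single float tensor of shape [64,1152,16,16], and its size can be worked out directly. The sketch below assumes 4 bytes per element (float32) and assumes the leading dimension of 64 is the batch size; tensor_mib is a hypothetical helper. It shows how that one activation shrinks as the batch dimension is reduced, which is why lowering the value behind FLAGS.train_batch_size (seen in the traceback) is a common thing to try.

```python
from functools import reduce
from operator import mul

BYTES_PER_FLOAT32 = 4


def tensor_mib(shape, bytes_per_element=BYTES_PER_FLOAT32):
    """Size of a dense tensor in MiB for the given shape."""
    elements = reduce(mul, shape, 1)
    return elements * bytes_per_element / 2**20


# Shape taken from the OOM message.
print(tensor_mib([64, 1152, 16, 16]))  # 72.0

# Assumption: the first dimension is the batch size, so halving the batch
# halves this activation (and every other per-example activation).
for batch in (64, 32, 16, 8):
    print(batch, tensor_mib([batch, 1152, 16, 16]))
```

This is only one activation among many kept alive for backpropagation, so the total footprint is far larger than 72 MiB, but it scales with the batch size in the same way.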

Add saved model support

Hey @mingxingtan, I'm interested in the repo and I have implemented a saved model feature that replaces _generate_detections with TensorFlow graph code. Could I submit a PR?

Inference with pb

Hi, I want to run detection on a list of images given in a 'txt' file. I changed the code of build_input, but an error still occurs in post-processing. Because the inference batch size is 1, when I send all the images to the model they are still treated as one batch, and the number of anchors then exceeds the index range...
So, could you please publish inference code that converts a ckpt to pb and runs inference with the pb model on multiple images?

AssertionError: Bad argument number for Name: 3, expecting 4

I am using TensorFlow version: 2.0.0-beta0

I am getting the following error:

INFO:tensorflow:fnode 0 : {'width_ratio': 0.015625, 'inputs_offsets': [3, 4]}
I0321 21:37:27.608264 139752298374976 efficientdet_arch.py:463] fnode 0 : {'width_ratio': 0.015625, 'inputs_offsets': [3, 4]}
INFO:tensorflow:fnode 0 : {'width_ratio': 0.015625, 'inputs_offsets': [3, 4]}
I0321 21:37:27.608264 139752298374976 efficientdet_arch.py:463] fnode 0 : {'width_ratio': 0.015625, 'inputs_offsets': [3, 4]}
WARNING:tensorflow:Entity <bound method SeparableConv2D.call of <tensorflow.python.layers.convolutional.SeparableConv2D object at 0x7f1a4851c908>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method SeparableConv2D.call of <tensorflow.python.layers.convolutional.SeparableConv2D object at 0x7f1a4851c908>>: 

AssertionError: Bad argument number for Name: 3, expecting 4

WARNING:tensorflow:Entity <bound method BatchNormalization.call of <utils.BatchNormalization object at 0x7f1a4864d2b0>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method BatchNormalization.call of <utils.BatchNormalization object at 0x7f1a4864d2b0>>: 

AssertionError: Bad argument number for Name: 3, expecting 4

How can I use multi GPU?

python main.py --training_file_pattern=/coco_tfrecord/train* --model_dir=/tmp/efficientnet/ --hparams="use_bfloat16=false" --use_tpu=False

Hi,
I use the above command to train the model, but the program only uses one GPU. Can this program run on multiple GPUs?

Looking forward to your reply.

What are the parameters for training only some classes?

What are the parameters for training only some classes?

Which approach gives EfficientDet better accuracy: training only the specific classes starting from a pretrained model, or training only the specific classes from scratch?

Is EfficientDet suitable for working on real-time video? (8 fps or more guaranteed on CPU only computer)
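
Whether 8 fps is reachable depends on the model size and the CPU, so the practical answer is to measure it. The sketch below is a generic timing harness, not code from this repo: measure_fps and fake_inference are hypothetical names, and fake_inference is a stand-in that should be replaced by a call into the real model. At 8 fps the budget is 125 ms per frame.

```python
import time


def measure_fps(run_inference, frames, warmup=1):
    """Average frames per second of run_inference over frames, after warmup."""
    for frame in frames[:warmup]:
        run_inference(frame)  # warm-up calls are not timed
    timed = frames[warmup:]
    start = time.perf_counter()
    for frame in timed:
        run_inference(frame)
    elapsed = time.perf_counter() - start
    return len(timed) / elapsed if elapsed > 0 else float("inf")


def fake_inference(frame):
    # Hypothetical stand-in for a real model call; sleeps for about 10 ms.
    time.sleep(0.01)


TARGET_FPS = 8
budget_ms = 1000.0 / TARGET_FPS  # 125.0 ms available per frame

fps = measure_fps(fake_inference, list(range(6)))
print(f"budget per frame: {budget_ms} ms, measured: {fps:.1f} fps")
print("meets target" if fps >= TARGET_FPS else "too slow")
```

Timing should cover preprocessing and post-processing as well as the forward pass, since all of them count against the per-frame budget.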
