Hi,
I get the following error when trying to train the Fg Bg reconstruction model with tensorflow-gpu 1.9 and Python 2.7. Thank you for your help.
gpu=0
D_arch='DCGAN'
log_dir='./logs'
####################### Stage-I: reconstruction #####################
## Fg Bg reconstruction
model_dir=${log_dir}'/MODEL1_Encoder_GAN_BodyROI7'
python main.py --dataset=Market_train_data \
--use_gpu=True --img_H=128 --img_W=64 \
--batch_size=16 --max_step=120000 \
--d_lr=0.00002 --g_lr=0.00002 \
--lr_update_step=50000 \
--model=1 \
--D_arch=${D_arch} \
--gpu=${gpu} \
--z_num=64 \
--model_dir=${model_dir} \
Error message:
/home/ryu/sourcecode/Disentangled-Person-Image-Generation/tflib/plot.py:4: UserWarning:
This call to matplotlib.use() has no effect because the backend has already
been chosen; matplotlib.use() must be called before pylab, matplotlib.pyplot,
or matplotlib.backends is imported for the first time.
The backend was originally set to 'TkAgg' by the following code:
File "main.py", line 4, in <module>
from trainer import *
File "/home/ryu/sourcecode/Disentangled-Person-Image-Generation/trainer.py", line 20, in <module>
import models
File "/home/ryu/sourcecode/Disentangled-Person-Image-Generation/models.py", line 6, in <module>
from utils import *
File "/home/ryu/sourcecode/Disentangled-Person-Image-Generation/utils.py", line 492, in <module>
import matplotlib.pyplot as plt
File "/home/ryu/anaconda3/envs/pose/lib/python2.7/site-packages/matplotlib/pyplot.py", line 71, in <module>
from matplotlib.backends import pylab_setup
File "/home/ryu/anaconda3/envs/pose/lib/python2.7/site-packages/matplotlib/backends/__init__.py", line 16, in <module>
line for line in traceback.format_stack()
matplotlib.use('Agg')
load pn_pairs_num ......
######coord2channel_simple#####
######coord2channel_simple#####
Uppercase local vars:
BATCH_SIZE: 16
DATA_DIR:
DIM: 64
G_OUTPUT_DIM: 24576
IMG_H: 128
IMG_W: 64
ITERS: 200000
LAMBDA: 10
MODE: dcgan
WARNING:tensorflow:From /home/ryu/sourcecode/Disentangled-Person-Image-Generation/trainer.py:199: __init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.MonitoredTrainingSession
2018-07-13 10:59:03,319:WARNING::From /home/ryu/sourcecode/Disentangled-Person-Image-Generation/trainer.py:199: __init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.MonitoredTrainingSession
2018-07-13 10:59:06.651736: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1392] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.6325
pciBusID: 0000:02:00.0
totalMemory: 10.92GiB freeMemory: 10.76GiB
2018-07-13 10:59:06.651792: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-13 10:59:07.038518: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-13 10:59:07.038577: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958] 0
2018-07-13 10:59:07.038589: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: N
2018-07-13 10:59:07.038950: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10411 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)
[] MODEL dir: ./logs/MODEL1_Encoder_GAN_BodyROI7
[] PARAM path: ./logs/MODEL1_Encoder_GAN_BodyROI7/params.json
2018-07-13 10:59:21.620532: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Key: part_bbox_1. Can't parse serialized Example.
2018-07-13 10:59:21.620573: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Key: part_bbox_1. Can't parse serialized Example.
2018-07-13 10:59:21.621055: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Key: part_bbox_1. Can't parse serialized Example.
2018-07-13 10:59:21.621579: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Key: part_bbox_1. Can't parse serialized Example.
Traceback (most recent call last):
File "main.py", line 90, in <module>
main(config)
File "main.py", line 82, in main
trainer.train()
File "/home/ryu/sourcecode/Disentangled-Person-Image-Generation/trainer.py", line 328, in train
part_bbox_target_fixed, part_vis_fixed, part_vis_target_fixed = self.get_image_from_loader()
File "/home/ryu/sourcecode/Disentangled-Person-Image-Generation/trainer.py", line 530, in get_image_from_loader
self.mask_r6, self.mask_r6_target, self.part_bbox, self.part_bbox_target, self.part_vis, self.part_vis_target])
File "/home/ryu/anaconda3/envs/pose/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/home/ryu/anaconda3/envs/pose/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/home/ryu/anaconda3/envs/pose/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/home/ryu/anaconda3/envs/pose/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_3_batch/fifo_queue' is closed and has insufficient elements (requested 16, current size 0)
[[Node: batch = QueueDequeueManyV2[component_types=[DT_UINT8, DT_UINT8, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT64, DT_INT64, DT_INT64, DT_INT64], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/fifo_queue, batch/n)]]
Caused by op u'batch', defined at:
File "main.py", line 90, in <module>
main(config)
File "main.py", line 23, in main
trainer = DPIG_Encoder_GAN_BodyROI_FgBg(config)
File "/home/ryu/sourcecode/Disentangled-Person-Image-Generation/trainer.py", line 42, in __init__
self.part_bbox, self.part_bbox_target, self.part_vis, self.part_vis_target = self._load_batch_pair_pose(self.dataset_obj)
File "/home/ryu/sourcecode/Disentangled-Person-Image-Generation/trainer.py", line 555, in _load_batch_pair_pose
batch_size=self.batch_size, num_threads=self.num_threads, capacity=self.capacityCoff * self.batch_size)
File "/home/ryu/anaconda3/envs/pose/lib/python2.7/site-packages/tensorflow/python/training/input.py", line 988, in batch
name=name)
File "/home/ryu/anaconda3/envs/pose/lib/python2.7/site-packages/tensorflow/python/training/input.py", line 762, in _batch
dequeued = queue.dequeue_many(batch_size, name=name)
File "/home/ryu/anaconda3/envs/pose/lib/python2.7/site-packages/tensorflow/python/ops/data_flow_ops.py", line 483, in dequeue_many
self._queue_ref, n=n, component_types=self._dtypes, name=name)
File "/home/ryu/anaconda3/envs/pose/lib/python2.7/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 3480, in queue_dequeue_many_v2
component_types=component_types, timeout_ms=timeout_ms, name=name)
File "/home/ryu/anaconda3/envs/pose/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/ryu/anaconda3/envs/pose/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
op_def=op_def)
File "/home/ryu/anaconda3/envs/pose/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1740, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
OutOfRangeError (see above for traceback): FIFOQueue '_3_batch/fifo_queue' is closed and has insufficient elements (requested 16, current size 0)
[[Node: batch = QueueDequeueManyV2[component_types=[DT_UINT8, DT_UINT8, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT64, DT_INT64, DT_INT64, DT_INT64], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/fifo_queue, batch/n)]]
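
One check that might help narrow this down: the "current size 0" message could mean either that the TFRecord file holds no records at all, or that every record failed to parse (as the part_bbox_1 warnings suggest). Below is a minimal sketch, assuming the training data is stored as a standard TFRecord file, that counts records from the on-disk framing alone (8-byte little-endian length, 4-byte length CRC, payload, 4-byte payload CRC) without needing TensorFlow. The function name `count_tfrecords` and the path in the usage comment are hypothetical, and the CRC fields are skipped rather than verified.

```python
import struct


def count_tfrecords(path):
    """Count records in a TFRecord file by walking its length-prefixed framing.

    Hypothetical helper for diagnosis only: CRC fields are read past,
    not verified, and record contents are never decoded.
    """
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(12)  # 8-byte length + 4-byte CRC of the length
            if not header:
                break  # clean end of file
            if len(header) < 12:
                raise ValueError("truncated header after %d records" % count)
            (length,) = struct.unpack("<Q", header[:8])
            payload = f.read(length + 4)  # record bytes + 4-byte CRC of them
            if len(payload) < length + 4:
                raise ValueError("truncated record after %d records" % count)
            count += 1
    return count


# Usage (the path is a placeholder, not the actual dataset location):
# print(count_tfrecords("/path/to/train.tfrecord"))
```

A non-zero count would point toward a mismatch between the stored part_bbox_1 feature and what the parser in the training code expects, rather than toward an empty or truncated file.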