a-bone1 / attention-ocr-chinese-version

Attention OCR Based On Tensorflow

Python 100.00%

attention-model text-recognition tensorflow crnn

attention-ocr-chinese-version's Introduction

Attention-ocr-Chinese-Version

This program performs Chinese OCR based on Google's Attention OCR.

It modifies Google's attention model for Chinese text recognition.

More details can be found in the paper "Attention-based Extraction of Structured Information from Street View Imagery". For a Chinese introduction to this project, click here.

This project runs on Windows 10 and Ubuntu 16.04 in a Python 3 environment. The network is built with TensorFlow.

Following the official website, I generated FSNS-format tfrecord files for Chinese text recognition and a dictionary of 5,400 Chinese characters. The method for generating FSNS tfrecord files is described here: https://github.com/A-bone1/FSNS-tfrecord-generate

Overall framework of the network (Attention-CRNN)

Train your own model

1. Store your data in the same format as the FSNS dataset and put the tfrecord and dic.txt under datasets/data/fsns/train/. Then reuse the python/datasets/fsns.py module, e.g., by creating a file datasets/newtextdataset.py. You can imitate the provided newtextdataset.py and modify a few simple parameters and paths in it.

2. You will also need to include the new module in datasets/__init__.py and specify the dataset name on the command line. If you modify my newtextdataset.py directly, you can skip this step.

3. Train your own model:

cd python
python train.py --dataset_name=newtextdataset

4. (P.S.) My GPU does not have enough memory to train this model, so I temporarily set training to CPU only. If you want to train on the GPU, comment out these two lines in train.py:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = ''

5. The files required by TensorBoard are stored in the logs directory and can be viewed with the command below:

tensorboard  --logdir=logs

Some suggestions for training

You can use a curriculum learning strategy to accelerate convergence and improve the model's generalization ability: first train on samples with simple backgrounds, then gradually add real, complex natural-scene text images to increase sample complexity.
The model requires a lot of GPU memory. If your GPU memory is insufficient for training, you can reduce the image size when generating the training samples and then modify the image parameter in the training code (image_shape in /python/datasets/newtextdataset.py).
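The curriculum idea above can be sketched as a sampling schedule. This is a minimal illustration, not code from this repository: the function names, the linear ramp, and the ramp_steps value are all assumptions chosen for the example.

```python
import random


def complex_fraction(step, ramp_steps=100_000, max_fraction=1.0):
    """Fraction of complex (natural-scene) samples to draw at a given step.

    Hypothetical linear ramp: 0.0 at step 0, max_fraction from ramp_steps on.
    """
    if ramp_steps <= 0:
        return max_fraction
    return max_fraction * min(1.0, max(0.0, step / ramp_steps))


def pick_pool(step, rng=random, ramp_steps=100_000):
    """Return 'complex' or 'simple' for the next training sample."""
    if rng.random() < complex_fraction(step, ramp_steps):
        return "complex"
    return "simple"


if __name__ == "__main__":
    for step in (0, 25_000, 50_000, 100_000, 200_000):
        print(step, complex_fraction(step))
```

With tfrecord-based input, a similar effect could probably be reached more simply by training on a simple-background tfrecord first and then continuing from that checkpoint on a mixed one.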

Loss Function

Original Image

Predictive text

Verify your own model

1. Generate your validation FSNS tfrecord, name it train_eval*, and place it under datasets/data/fsns/train/

2. Verify your own model:

python eval.py

3. The results can be viewed with TensorBoard; the required files are stored under /tmp/attention_ocr/eval:

tensorboard  --logdir=/tmp/attention_ocr/eval

Accuracy

Currently, the character accuracy on 1.8 million synthetic images is 92.96%, and the sequence accuracy is 80.18%.
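The difference between the two figures can be illustrated with a small sketch. It works on plain Python strings and is only an approximation of the idea: the metrics in this project are computed inside TensorFlow on padded id sequences and may differ in details such as how padding is treated. The function names and sample strings are made up for the example.

```python
def char_accuracy(predictions, targets):
    """Fraction of target character positions that were predicted correctly."""
    correct = total = 0
    for pred, target in zip(predictions, targets):
        total += len(target)
        correct += sum(p == t for p, t in zip(pred, target))
    return correct / total if total else 0.0


def sequence_accuracy(predictions, targets):
    """Fraction of samples whose whole predicted string equals the target."""
    if not targets:
        return 0.0
    matches = sum(pred == target for pred, target in zip(predictions, targets))
    return matches / len(targets)


predictions = ["北京市", "上海市", "广州"]
targets = ["北京市", "上海区", "广州市"]
print(char_accuracy(predictions, targets))      # 7 of 9 characters match
print(sequence_accuracy(predictions, targets))  # 1 of 3 sequences match
```

A single wrong character makes the whole sequence count as wrong, which is why sequence accuracy is always the lower of the two.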

How to use a trained model

python demo_inference.py --batch_size=32 \
  --checkpoint=model.ckpt-399731 \
  --image_path_pattern=./datasets/data/fsns/temp/fsns_train_%02d.png
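The %02d in --image_path_pattern is a printf-style placeholder. Assuming the script fills it with the indices 0 to batch_size-1, as the upstream demo_inference.py does, the expected file names can be listed with a small helper (expand_image_paths is a hypothetical name used only here):

```python
def expand_image_paths(pattern, batch_size):
    """Expand a printf-style pattern into one path per batch index."""
    return [pattern % i for i in range(batch_size)]


paths = expand_image_paths("./datasets/data/fsns/temp/fsns_train_%02d.png", 32)
print(paths[0])   # ./datasets/data/fsns/temp/fsns_train_00.png
print(paths[-1])  # ./datasets/data/fsns/temp/fsns_train_31.png
```

Under that assumption, a batch size of 32 needs 32 images named fsns_train_00.png through fsns_train_31.png in the temp directory.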

attention-ocr-chinese-version's Issues

Incorrect test results

INFO 2018-09-03 15:52:55.000546: tf_logging.py: 82 Restoring parameters from model.ckpt-834456
b'\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba'
b'\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba'
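The b'...' lines above are UTF-8 byte strings as printed by Python 3. Decoding a shortened sample of them shows that the prediction is one character repeated:

```python
# Shortened sample of the logged prediction (three repetitions of the same bytes).
raw = b'\xe7\xbc\xba\xe7\xbc\xba\xe7\xbc\xba'

text = raw.decode('utf-8')
print(text)            # 缺缺缺
print(len(set(text)))  # 1
```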

Prediction problem

When I run prediction, only the last character is predicted; none of the other characters are predicted.
Reading C:\Elag\data\cn_ocr\train\31c768fe-444d-11e8-9e87-005056bed121.jpg INFO:tensorflow:Graph was finalized. INFO 2018-04-24 09:37:51.000062: tf_logging.py: 116 Graph was finalized. INFO:tensorflow:Restoring parameters from logs/model.ckpt-102015 INFO 2018-04-24 09:37:51.000064: tf_logging.py: 116 Restoring parameters from logs/model.ckpt-102015 INFO:tensorflow:Running local_init_op. INFO 2018-04-24 09:37:51.000859: tf_logging.py: 116 Running local_init_op. INFO:tensorflow:Done running local_init_op. INFO 2018-04-24 09:37:51.000911: tf_logging.py: 116 Done running local_init_op. 徘░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░

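The characters in the image are: 簇paSvH哄7徘

I also tried other images, and each time only the last character was predicted. I don't know what went wrong.
These are the parameters: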
--batch_size=1 --checkpoint=logs/model.ckpt-102015 --image_path_pattern=C:\Elag\data\cn_ocr\train\31c768fe-444d-11e8-9e87-005056bed121.jpg --dataset_name=newtextdataset

Looking forward to your reply.

endpoints.predicted_text outputs []

Hello!
Thank you for sharing.
When I run your code, endpoints.predicted_text outputs nothing. Is that expected?

Error when running training

Instructions for updating:
Please switch to tf.train.get_or_create_global_step
INFO:tensorflow:Restoring parameters from /home/ucmed/opt/python/models-master/research/attention_ocr/python/logs/model.ckpt-0
INFO 2019-01-03 02:14:41.000888: tf_logging.py: 82 Restoring parameters from /home/ucmed/opt/python/models-master/research/attention_ocr/python/logs/model.ckpt-0
INFO:tensorflow:Starting Session.
INFO 2019-01-03 02:14:55.000713: tf_logging.py: 82 Starting Session.
INFO:tensorflow:Saving checkpoint to path /home/ucmed/opt/python/models-master/research/attention_ocr/python/logs/model.ckpt
INFO 2019-01-03 02:14:55.000829: tf_logging.py: 82 Saving checkpoint to path /home/ucmed/opt/python/models-master/research/attention_ocr/python/logs/model.ckpt
INFO:tensorflow:Starting Queues.
INFO 2019-01-03 02:14:55.000832: tf_logging.py: 82 Starting Queues.
INFO:tensorflow:global_step/sec: 0
INFO 2019-01-03 02:15:00.000650: tf_logging.py: 121 global_step/sec: 0
Killed

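Model convergence problem

Hello, I have a question. After training on a Chinese dataset, the loss dropped from about 150 to 35, but the model always predicts the same 37 Chinese characters for different images, e.g. "并并并.......". What might be the problem? Or what could be preventing the model from converging?

Dataset sharing

Could you share the training dataset?

[solved] Model does not converge

Multi-line text

Hello, for images containing multiple lines of text (>= 2), is it necessary to split them into lines first?

Dataset question

Does your training not split the data into training, test, and validation sets?

Can you provide a pretrained model?

Hello, can you provide a pretrained model?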

Support output predict

After training a model following your tutorial (using your dic.txt and the FSNS data in data/train/tfexample.record), I used the model checkpoint to run inference on 2 images, but the output is not right. Please help me!

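Could you share the trained model?

Training takes too long. Could you share the model you have already trained?

Training loss problem

Why does the loss keep increasing?

Can the maximum length of 37 be changed?

Changing 37 causes an error.

CPU maxed out and machine froze during training

Hi everyone, are you all training on CPU? My machine has 16 GB of memory and an i7-7700HQ. During training both memory and CPU usage hit 100%, and then the machine froze. Reducing the batch size to 1 did not help. How do you train on CPU?

Error running demo_inference_test.py

I simply ran python demo_inference_test.py
and got the error below. Please advise how to fix it. Thanks!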

======================================================================
ERROR: test_moving_variables_properly_loaded_from_a_checkpoint (__main__.DemoInferenceTest)

Traceback (most recent call last):
File "demo_inference_test.py", line 26, in test_moving_variables_properly_loaded_from_a_checkpoint
dataset_name)
File "/export/wangruifang/projects/attention-ocr-chinese-test/python/demo_inference.py", line 72, in create_model
endpoints = model.create_base(images, labels_one_hot=None)
File "/export/wangruifang/projects/attention-ocr-chinese-test/python/model.py", line 374, in create_base
chars_logit = self.sequence_logit_fn(net, labels_one_hot)
File "/export/wangruifang/projects/attention-ocr-chinese-test/python/model.py", line 242, in sequence_logit_fn
layer = layer_class(net, labels_one_hot, self._params, mparams)
File "/export/wangruifang/projects/attention-ocr-chinese-test/python/sequence_layers.py", line 381, in __init__
super(AttentionWithAutoregression, self).__init__(*args, **kwargs)
File "/export/wangruifang/projects/attention-ocr-chinese-test/python/sequence_layers.py", line 353, in __init__
super(Attention, self).__init__(*args, **kwargs)
File "/export/wangruifang/projects/attention-ocr-chinese-test/python/sequence_layers.py", line 124, in __init__
regularizer=regularizer)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 262, in model_variable
use_resource=use_resource)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "/usr/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 217, in variable
use_resource=use_resource)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1203, in get_variable
constraint=constraint)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1092, in get_variable
constraint=constraint)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 425, in get_variable
constraint=constraint)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 394, in _true_getter
use_resource=use_resource, constraint=constraint)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 805, in _get_single_variable
constraint=constraint)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 213, in __init__
constraint=constraint)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 303, in _init_from_args
initial_value(), name="initial_value", dtype=dtype)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 779, in
shape.as_list(), dtype=dtype, partition_info=partition_info)
File "/export/wangruifang/projects/attention-ocr-chinese-test/python/sequence_layers.py", line 78, in orthogonal_initializer
u, _, v = np.linalg.svd(w, full_matrices=False)
File "/usr/lib64/python2.7/site-packages/numpy/linalg/linalg.py", line 1423, in svd
_assertNoEmpty2d(a)
File "/usr/lib64/python2.7/site-packages/numpy/linalg/linalg.py", line 225, in _assertNoEmpty2d
raise LinAlgError("Arrays cannot be empty")
LinAlgError: Arrays cannot be empty

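How can I train with batch=1 on data of different sizes?

I'd like to ask for advice. I want to use this to process my document images. After segmentation the image shape is (32, variable width, 3), and I want to train on images of different sizes with batch=1. I care about model speed, and padding the training set to a uniform size would slow things down at test time.
So far I have converted the data to tfrecord.
Running it raises an error, which is a size problem: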
2018-08-08 10:54:00.548359: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-08-08 10:54:00.833290: E tensorflow/stream_executor/cuda/cuda_driver.cc:397] failed call to cuInit: CUDA_ERROR_NO_DEVICE
2018-08-08 10:54:00.833429: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: ksai-GPUSERVER_V100_1
2018-08-08 10:54:00.833443: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: ksai-GPUSERVER_V100_1
2018-08-08 10:54:00.833504: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: 384.111.0
2018-08-08 10:54:00.833584: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 384.111.0
2018-08-08 10:54:00.833597: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:300] kernel version seems to match DSO: 384.111.0
INFO 2018-08-08 10:54:16.000140: model.py: 581 Restoring checkpoint(s)
INFO:tensorflow:Running local_init_op.
INFO 2018-08-08 10:54:16.000140: tf_logging.py: 115 Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO 2018-08-08 10:54:16.000395: tf_logging.py: 115 Done running local_init_op.
INFO:tensorflow:Starting Session.
INFO 2018-08-08 10:54:38.000205: tf_logging.py: 115 Starting Session.
INFO:tensorflow:Saving checkpoint to path ./model.ckpt
INFO 2018-08-08 10:54:38.000788: tf_logging.py: 115 Saving checkpoint to path ./model.ckpt
INFO:tensorflow:Starting Queues.
INFO 2018-08-08 10:54:38.000832: tf_logging.py: 115 Starting Queues.
INFO:tensorflow:global_step/sec: 0
INFO 2018-08-08 10:54:51.000791: tf_logging.py: 159 global_step/sec: 0
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Input to reshape is a tensor with 26880 values, but the requested shape has 57600
[[Node: Reshape_6 = Reshape[T=DT_UINT8, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](case/cond/Merge, PreprocessImage/AugmentImage/Shape)]]
INFO 2018-08-08 10:55:01.000739: tf_logging.py: 115 Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Input to reshape is a tensor with 26880 values, but the requested shape has 57600
[[Node: Reshape_6 = Reshape[T=DT_UINT8, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](case/cond/Merge, PreprocessImage/AugmentImage/Shape)]]
INFO:tensorflow:Caught OutOfRangeError. Stopping Training. RandomShuffleQueue '_3_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8, DT_INT64, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, ReduceJoin/reduction_indices)]]

Caused by op 'shuffle_batch', defined at:
File "train.py", line 211, in <module>
app.run()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "train.py", line 198, in main
central_crop_size=common_flags.get_crop_size())
File "/DATA/disk1/zyt/try_att/Attention-ocr-Chinese-Version/python/data_provider.py", line 193, in get_data
min_after_dequeue=shuffle_config.min_after_dequeue))
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/input.py", line 1300, in shuffle_batch
name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/input.py", line 846, in _shuffle_batch
dequeued = queue.dequeue_many(batch_size, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/data_flow_ops.py", line 483, in dequeue_many
self._queue_ref, n=n, component_types=self._dtypes, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 3480, in queue_dequeue_many_v2
component_types=component_types, timeout_ms=timeout_ms, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1740, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

OutOfRangeError (see above for traceback): RandomShuffleQueue '_3_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8, DT_INT64, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, ReduceJoin/reduction_indices)]]

INFO 2018-08-08 10:55:02.000396: tf_logging.py: 115 Caught OutOfRangeError. Stopping Training. RandomShuffleQueue '_3_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_UINT8, DT_INT64, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, ReduceJoin/reduction_indices)]]

INFO:tensorflow:Finished training! Saving model to disk.
INFO 2018-08-08 10:55:02.000401: tf_logging.py: 115 Finished training! Saving model to disk.
Traceback (most recent call last):
File "train.py", line 211, in <module>
app.run()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "train.py", line 207, in main
train(total_loss, init_fn, hparams)
File "train.py", line 155, in train
init_fn=init_fn)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/slim/python/slim/learning.py", line 785, in train
ignore_live_threads=ignore_live_threads)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/supervisor.py", line 833, in stop
ignore_live_threads=ignore_live_threads)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/coordinator.py", line 389, in join
six.reraise(*self._exc_info_to_raise)
File "/usr/lib/python3/dist-packages/six.py", line 686, in reraise
raise value
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/queue_runner_impl.py", line 252, in _run
enqueue_callable()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1244, in _single_operation_run
self._call_tf_sessionrun(None, {}, [], target_list, None)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 26880 values, but the requested shape has 57600
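The two numbers in the InvalidArgumentError above are element counts. One reading that is consistent with the (32, variable width, 3) images described in this issue, offered as an assumption rather than a confirmed diagnosis, is that the record holds a 32x280x3 image while the dataset configuration asks for 32x600x3:

```python
def num_elements(shape):
    """Number of values in a tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count


actual = num_elements((32, 280, 3))     # assumed shape of the stored image
requested = num_elements((32, 600, 3))  # assumed image_shape in the dataset config
print(actual, requested)  # 26880 57600
```

Other shapes give the same product (150x128x3 is also 57600), so the configured image_shape should be checked directly. Either way, a reshape to a fixed shape fails whenever the stored image size differs from the configured one.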

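Dataset and model convergence problem

Hello, I have a question about generating my own dataset. The current model requires a dataset of 600x150 images. Generating 500,000+ text fragments produces a dataset of roughly 120 GB, and training is very slow. If I change the image size in the tfrecord to 200x50, the dataset for 500,000 text fragments is about 20 GB, and training is correspondingly much faster. However, after changing the image size, the extracted feature map is 22x3, and given how the model handles it in seq2seq, a little under half of the features are discarded and never used, which makes the loss spike and stop decreasing.
For training on a large number of text fragments, do you have any good approach?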

frozen problem

Hi, guys,
I want to freeze the model to a pb file, but got an error:

op_kernel.cc:1275 OP_REQUIRES failed at lookup_table_op.cc:675 : Failed precondition: Table not initialized.
Traceback (most recent call last):
  File "/home/zhangsy/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1278, in _do_call
    return fn(*args)
  File "/home/zhangsy/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1263, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/zhangsy/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1350, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Table not initialized.
         [[Node: import/AttentionOcr_v1/index_to_string_Lookup = LookupTableFindV2[Tin=DT_INT64, Tout=DT_STRING, _device="/job:localhost/replica:0/task:0/device:CPU:0"](import/AttentionOcr_v1/index_to_string, import/AttentionOcr_v1/ToInt64/_9, import/AttentionOcr_v1/index_to_string/Const)]]

Caused by op 'import/AttentionOcr_v1/index_to_string_Lookup', defined at:
  File "eval_own.py", line 61, in <module>
    model_test(path_pb, path_img)
  File "eval_own.py", line 31, in model_test
    graph = load_graph(path_pb)
  File "eval_own.py", line 24, in load_graph
    tf.import_graph_def(graph_def)
  File "/home/zhangsy/.local/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
    return func(*args, **kwargs)
  File "/home/zhangsy/.local/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 442, in import_graph_def
    _ProcessNewOps(graph)
  File "/home/zhangsy/.local/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 234, in _ProcessNewOps
    for new_op in graph._add_new_tf_operations(compute_devices=False):  # pylint: disable=protected-access
  File "/home/zhangsy/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3289, in _add_new_tf_operations
    for c_op in c_api_util.new_tf_operations(self)
  File "/home/zhangsy/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3289, in <listcomp>
    for c_op in c_api_util.new_tf_operations(self)
  File "/home/zhangsy/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3180, in _create_op_from_tf_operation
    ret = Operation(c_op, self)
  File "/home/zhangsy/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1717, in __init__
    self._traceback = tf_stack.extract_stack()

FailedPreconditionError (see above for traceback): Table not initialized.
         [[Node: import/AttentionOcr_v1/index_to_string_Lookup = LookupTableFindV2[Tin=DT_INT64, Tout=DT_STRING, _device="/job:localhost/replica:0/task:0/device:CPU:0"](import/AttentionOcr_v1/index_to_string, import/AttentionOcr_v1/ToInt64/_9, import/AttentionOcr_v1/index_to_string/Const)]]

In fact, I modified demo_inference.py as follows:

def create_model(batch_size, dataset_name):
  width, height = get_dataset_image_size(dataset_name)
  dataset = common_flags.create_dataset(split_name=FLAGS.split_name)
  tf.initialize_all_tables(name='init_all_tables')
  model = common_flags.create_model(
    num_char_classes=dataset.num_char_classes,
    seq_length=dataset.max_sequence_length,
    num_views=dataset.num_of_views,
    null_code=dataset.null_code,
    charset=dataset.charset)
  #raw_images = tf.placeholder(tf.uint8, shape=[batch_size, height, width, 3], name='input_images') 
  raw_images_o = tf.placeholder(tf.float32, shape=[batch_size, height, width, 3], name='input_images_h')
  
  print("========================================================================")
  print("input : {}".format(raw_images_o))#AttentionOcr_v1/split
  print("========================================================================")

  raw_images = raw_images_o/255.0
  images = tf.map_fn(data_provider.preprocess_image, raw_images,
                     dtype=tf.float32)
  endpoints = model.create_base(images, labels_one_hot=None)
  return raw_images_o, endpoints


def run(checkpoint, batch_size, dataset_name, image_path_pattern):
  images_placeholder, endpoints = create_model(batch_size,
                                               dataset_name)
  images_data = load_images(image_path_pattern, batch_size,
                            dataset_name)
  
  session_creator = monitored_session.ChiefSessionCreator(
    checkpoint_filename_with_path=checkpoint)
  with monitored_session.MonitoredSession(
      session_creator=session_creator) as sess:
    
    
    output_graph=FLAGS.output_graph
    if output_graph != '':
      print("========================================================================")
      print("output : {} {}".format(endpoints.predicted_text, endpoints.predicted_scores))#Tensor("AttentionOcr_v1/ReduceJoin:0", shape=(1,), dtype=string) Tensor("AttentionOcr_v1/Reshape_2:0", shape=(?, 8), dtype=float32)
      print("========================================================================")
      output_graph_def = tf.graph_util.convert_variables_to_constants(sess, tf.get_default_graph().as_graph_def(),
                                                                          [endpoints.predicted_text.name[:-2], endpoints.predicted_scores.name[:-2]])
          # Save as a pb file
      
      with tf.gfile.GFile(output_graph, 'wb') as f:
        f.write(output_graph_def.SerializeToString())
      print('%d ops in the final graph' % (len(output_graph_def.node)))
    #print(images_placeholder)
    predictions = sess.run([endpoints.predicted_text, endpoints.predicted_scores],
                           feed_dict={images_placeholder: images_data})


    #print(predictions)
  return predictions#[x.tolist() for x in predictions]

I searched the internet but have no idea what to do. Can you give me some suggestions?

Predict a random image??

How can I predict a single image of arbitrary size? Which file should I run?
I ran demo_reference.py, but it requires a 600x150 input made of 4 combined images.
Please help, thanks.

Multi GPU training

How can I train with multiple GPUs on a single machine?

GPU training problem

When I train on the GPU I keep getting:
2018-04-20 14:26:01.064535: W tensorflow/core/framework/op_kernel.cc:1158] Resource exhausted: OOM when allocating tensor with shape[5751,5463]
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.ResourceExhaustedError'>, OOM when allocating tensor with shape[16,1152,1,288]
Changing batch_size does not help. How can I solve this?

In the end I could only train after lowering batch_size to 2. Does it really use that much memory?
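To put the shapes in the OOM messages above into perspective, the size of each reported tensor can be estimated. This sketch assumes 4 bytes per value (float32), which the log does not state:

```python
def tensor_bytes(shape, bytes_per_value=4):
    """Memory needed for one tensor of the given shape (float32 assumed)."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_value


for shape in [(5751, 5463), (16, 1152, 1, 288)]:
    size = tensor_bytes(shape)
    print(shape, size, "bytes, about", round(size / 2**20), "MiB")
```

Under that assumption neither tensor is large on its own (roughly 120 MiB and 20 MiB). The failure is more likely the allocation that pushed total usage over the limit than a single oversized tensor.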

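OOM on GTX 1080 Ti

Which GPU was this code trained on?
Why does a 1080 Ti run out of memory even with the batch size set to 1?

Shape mismatch when using the provided model; what needs to be changed?

attention_ocr pretrained model

Hello, which of the following steps did you use when training the model?

Error when running train.py: casting strings to float is not supported

Hello! The full error message is as follows: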
UnimplementedError (see above for traceback): Cast string to float is not supported
[[Node: Momentum/update_AttentionOcr_v1/conv_tower_fn/INCE/InceptionV3/Conv2d_1a_3x3/weights/Cast_1 = CastDstT=DT_FLOAT, SrcT=DT_STRING, _class=["loc:@AttentionOcr_v1/conv_tower_fn/INCE/InceptionV3/Conv2d_1a_3x3/weights"], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

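The problem is in this line:
clip_gradient_norm=FLAGS.clip_gradient_norm
At first I suspected that something went wrong when I generated the dict, but after switching back to the original dict and tfrecord example I still get the same error. Has anyone run into this?

Thanks

Why does this program use so much memory?

Training consumes a large amount of memory, nearly 20 GB. Changing the number of LSTM hidden layers, reducing batch_size, and so on had no noticeable effect. How should the program be modified to greatly reduce memory consumption?

Prediction results

Another question: with more than 300,000 training samples, the loss drops to about 60 and then stops decreasing. Predictions on generated images are sometimes acceptable, but predictions on cropped images, for example

are poor. The result is: 7u1Ii░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
Is the model just not trained enough?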

fsns dataset

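Where can the FSNS dataset be loaded from?

How long does it take to train Chinese on CPU?

How can I recognize the text in a single image?

I downloaded your project and trained on the data included in it. It has trained for 17630 steps and the loss is stuck at 43.96. I now want to recognize the text in a single image to check the result, but I could not find a corresponding script in your project. Running demo_inference.py raises: ValueError: could not broadcast input array from shape (21,272,3) into shape (150,600,3)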

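Possible reason for the model not converging

The fix first: define character 5462 in dic.txt as the null character used for padding (it must not conflict with the characters you want to recognize).

At first my model would not converge either, and the loss kept shooting up. Later I noticed in TensorBoard's text tab that the characters were all followed by question marks, and it occurred to me that the padding character might not be set up correctly. Then at line 56 of generate_tfrecord_JPG.py in the FSNS repository I found the following code:

I had not defined 5462 in dic.txt as the null character for padding, which presumably caused the recognition errors. After fixing that, it worked.
I have not trained for long yet, but the model already converges normally.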
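The padding described above can be sketched as follows. This is an illustration, not the code from generate_tfrecord_JPG.py: the function name and the tiny char_to_id mapping are hypothetical, while the null id 5462 and the length 37 are the values mentioned in these issues.

```python
NULL_ID = 5462   # id reserved in dic.txt for the padding (null) character
MAX_LEN = 37     # maximum sequence length mentioned in these issues


def encode_label(text, char_to_id, max_len=MAX_LEN, null_id=NULL_ID):
    """Map characters to ids and pad with null_id up to max_len."""
    if len(text) > max_len:
        raise ValueError("label longer than max_len: %r" % text)
    ids = [char_to_id[ch] for ch in text]  # KeyError if a character is not in the dictionary
    return ids + [null_id] * (max_len - len(ids))


char_to_id = {"京": 0, "A": 1, "8": 2}  # hypothetical 3-entry dictionary
padded = encode_label("京A8", char_to_id)
print(len(padded))   # 37
print(padded[:5])    # [0, 1, 2, 5462, 5462]
```

If the padding id has no entry in the dictionary, every padded position decodes to an unknown character, which would match the question marks seen in TensorBoard.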

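Batch question

Hello, my training set is split by text length. I want the images in each batch to have similar lengths. How should I do that?

Line recognition of handwritten digits: model does not converge, loss grows with training

The training dataset is for line recognition of handwritten digits: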

https://pan.baidu.com/s/1VuwANOFG-9xZKxlPgE5NoA
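The tfrecord was generated with generate_tfrecord_JPG.py from FSNS-tfrecord-generate. Because the original images are single-channel, I added .convert('RGB').
The current situation is that the loss grows the longer I train: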
2018-07-10 21:53:03,294tf_logging.py INFO global step 100: loss = 239.5235 (0.493 sec/step)
2018-07-10 21:53:53,060tf_logging.py INFO global step 200: loss = 558.6710 (0.474 sec/step)
2018-07-10 21:54:01,557tf_logging.py INFO Recording summary at step 216.
2018-07-10 21:54:02,394tf_logging.py INFO global_step/sec: 1.84779
2018-07-10 21:54:42,837tf_logging.py INFO global step 300: loss = 699.3590 (0.487 sec/step)
2018-07-10 21:55:32,055tf_logging.py INFO global step 400: loss = 965.1036 (0.495 sec/step)
2018-07-10 21:56:01,369tf_logging.py INFO Recording summary at step 458.
2018-07-10 21:56:02,398tf_logging.py INFO global_step/sec: 2.0166
2018-07-10 21:56:21,898tf_logging.py INFO global step 500: loss = 2043.1371 (0.484 sec/step)
2018-07-10 21:57:12,395tf_logging.py INFO global step 600: loss = 1464.7139 (0.484 sec/step)
2018-07-10 21:58:01,521tf_logging.py INFO Recording summary at step 697.
2018-07-10 21:58:02,426tf_logging.py INFO global_step/sec: 1.98286
2018-07-10 21:58:03,066tf_logging.py INFO global step 700: loss = 2526.5117 (0.515 sec/step)
2018-07-10 21:58:53,065tf_logging.py INFO global step 800: loss = 2093.2444 (0.499 sec/step)
2018-07-10 21:59:43,422tf_logging.py INFO global step 900: loss = 3336.6711 (0.499 sec/step)
2018-07-10 22:00:01,613tf_logging.py INFO Recording summary at step 935.
2018-07-10 22:00:02,409tf_logging.py INFO global_step/sec: 1.98362
2018-07-10 22:00:34,249tf_logging.py INFO global step 1000: loss = 3454.0398 (0.530 sec/step)
2018-07-10 22:01:24,575tf_logging.py INFO global step 1100: loss = 4373.2217 (0.546 sec/step)

Could you help me analyze this? I have spent the whole day on it and tried everything I could.

Training gets killed and then stops

247 (3.054 sec/step)
INFO:tensorflow:Saving checkpoint to path /home/chase/Downloads/Attention-ocr-Chinese-Version-master/python/logs/model.ckpt
INFO 2019-01-01 18:11:04.000964: tf_logging.py: 82 Saving checkpoint to path /home/chase/Downloads/Attention-ocr-Chinese-Version-master/python/logs/model.ckpt
Killed

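Problem with the original code!!!

Hello. I am trying to run Google's original Attention_ocr code, but I got an error when running the test file.

The error is UnrecognizedFlagError: Unknown command line flag 'p'.

Have you run into this problem?

Prediction results

Hello, what does your training data look like? Fixed-length text, or random length?

Model does not converge

I am using this model with my own data (license plate data). The loss drops to about 40 and then stops decreasing. What might be the problem? I have checked the data and it should be fine. I keep feeling there is a bug somewhere in this model.

How do I disable the pretrained weights?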

INFO 2018-11-28 15:48:45.000810: tf_logging.py: 116 Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Assign requires shapes of both tensors to match. lhs shape= [256,4523] rhs shape= [256,5462]

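Predictions are normal during training but abnormal during testing

Hello, when I run train.py and look at TensorBoard, the predictions are close to the ground truth and look fairly normal. But with the same model, the predictions from demo_inference.py and eval.py differ a lot from the ground truth. Have you run into something similar? Looking forward to your answer.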

Which TF and python version are you using?

I got the error below while using my own dataset. I don't know whether it is due to a version issue:

(tensorflow-gpu) J:\tensorflow-model\research\attention_ocr\python>python train.py --dataset_name=ctnsizedataset --checkpoint_inception=./attention_ocr_2017_08_09/model.ckpt-399731
Traceback (most recent call last):
File "e:\Anaconda3\envs\tensorflow-gpu\lib\site-packages\absl\flags\_flag.py", line 166, in _parse
return self.parser.parse(argument)
File "e:\Anaconda3\envs\tensorflow-gpu\lib\site-packages\absl\flags\_argument_parser.py", line 114, in parse
type(argument)))
TypeError: flag value must be a string, found "<class 'float'>"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train.py", line 33, in <module>
common_flags.define()
File "J:\tensorflow-model\research\attention_ocr\python\common_flags.py", line 77, in define
'momentum value for the momentum optimizer if used')
File "e:\Anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\platform\flags.py", line 58, in wrapper
return original_function(*args, **kwargs)
File "e:\Anaconda3\envs\tensorflow-gpu\lib\site-packages\absl\flags\_defines.py", line 241, in DEFINE_string
DEFINE(parser, name, default, help, flag_values, serializer, **args)
File "e:\Anaconda3\envs\tensorflow-gpu\lib\site-packages\absl\flags\_defines.py", line 81, in DEFINE
DEFINE_flag(_flag.Flag(parser, serializer, name, default, help, **args),
File "e:\Anaconda3\envs\tensorflow-gpu\lib\site-packages\absl\flags\_flag.py", line 107, in __init__
self._set_default(default)
File "e:\Anaconda3\envs\tensorflow-gpu\lib\site-packages\absl\flags\_flag.py", line 196, in _set_default
self.default = self._parse(value)
File "e:\Anaconda3\envs\tensorflow-gpu\lib\site-packages\absl\flags\_flag.py", line 169, in _parse
'flag --%s=%s: %s' % (self.name, argument, e))
absl.flags._exceptions.IllegalFlagValueError: flag --momentum=0.9: flag value must be a string, found "<class 'float'>"

(tensorflow-gpu) J:\tensorflow-model\research\attention_ocr\python>