astarlight / cps-ocr-engine
An awesome OCR engine developed by SYSU DeepDriving Lab
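The results I get from running the code do not match the results shown in the README.
How does the author handle characters whose strokes are not connected, and punctuation marks?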
def get_label_dict():
    f = open('./chinese_labels', 'r')
    label_dict = pickle.load(f)
    # label_dict = str.encode(pickle.load(f))
    f.close()
    return label_dict
It raises the following error:
Traceback (most recent call last):
File "F:/OCR/CPS-OCR-Engine-master/ocr/Chinese_OCR.py", line 411, in
tf.app.run()
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
_sys.exit(main(argv))
File "F:/OCR/CPS-OCR-Engine-master/ocr/Chinese_OCR.py", line 390, in main
label_dict = get_label_dict()
File "F:/OCR/CPS-OCR-Engine-master/ocr/Chinese_OCR.py", line 335, in get_label_dict
label_dict = pickle.load(f)
TypeError: a bytes-like object is required, not 'str'
How should I fix this?
Thanks
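The TypeError in this traceback is what Python 3 raises when pickle.load is given a file opened in text mode ('r'); pickle needs a binary stream. Below is a minimal sketch of a likely fix, assuming Python 3. The temporary file and its two-entry dictionary are stand-ins for ./chinese_labels, and the `encoding` parameter is an assumption that may only matter if the real file was written by Python 2.

```python
import os
import pickle
import tempfile


def get_label_dict(path='./chinese_labels', encoding='ASCII'):
    """Load the pickled label dictionary; the file must be opened in binary mode."""
    # 'rb' instead of 'r': on Python 3, pickle.load requires a bytes stream.
    with open(path, 'rb') as f:
        # `encoding` only affects 8-bit str objects that were pickled by Python 2.
        return pickle.load(f, encoding=encoding)


# Demonstration with a stand-in file (hypothetical contents, not the real labels).
demo_path = os.path.join(tempfile.mkdtemp(), 'chinese_labels')
with open(demo_path, 'wb') as f:
    pickle.dump({0: '一', 1: '丁'}, f, protocol=2)

label_dict = get_label_dict(demo_path)
print(label_dict[0])
```

If loading the real file still fails with a decoding error, passing a different `encoding` value (for example 'latin1') is a common thing to try for Python 2 pickles.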
Hi,
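Nowadays most OCR services run in the cloud. Tess can run locally, but its results are poor. Is there any way to port your engine to Android? I need local recognition because it has to work in an environment with no network access.
Hello, I only want to train on HeiTi (黑体) characters. Is it enough to keep only heiti.ftt in the font directory? Do I need to modify the label file? And if I have no GPU, can I train on the CPU? Any advice would be appreciated.
Python 2.7; the GPU is a GTX 960.
1. Training succeeded, but recognizing the text in the tmp directory fails (see below): whatever the character, every image gets the same prediction. After retraining, the predicted result changes, but every image is still recognized as one and the same result.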
[the result info] image: ./tmp/0000.jpg predict: 一 丁 七; predict index [[0 1 2]] predict_val [[nan nan nan]]
[the result info] image: ./tmp/0001.jpg predict: 一 丁 七; predict index [[0 1 2]] predict_val [[nan nan nan]]
[the result info] image: ./tmp/0002.jpg predict: 一 丁 七; predict index [[0 1 2]] predict_val [[nan nan nan]]
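2. With the author's model, recognition works normally.
3. I tried both the GPU and the CPU versions; the problem is the same.
4. I changed the batch size in the program, because the original value of 128 caused an error; I set it to 32.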
Traceback (most recent call last):
File "gen_printed_char.py", line 380, in
label_dict = get_label_dict()
File "gen_printed_char.py", line 304, in get_label_dict
label_dict = pickle.load(f)
TypeError: a bytes-like object is required, not 'str'
Both Windows 10 and Ubuntu report this error, and I don't know why.
Traceback (most recent call last):
File "Chinese_OCR.py", line 48, in
tf.app.flags.DEFINE_boolean('batch_size', 128, 'Validation batch size')
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/flags.py", line 58, in wrapper
return original_function(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/absl/flags/_defines.py", line 267, in DEFINE_boolean
DEFINE_flag(_flag.BooleanFlag(name, default, help, **args),
File "/usr/local/lib/python2.7/dist-packages/absl/flags/_flag.py", line 293, in __init__
p, None, name, default, help, short_name, 1, **args)
File "/usr/local/lib/python2.7/dist-packages/absl/flags/_flag.py", line 107, in __init__
self._set_default(default)
File "/usr/local/lib/python2.7/dist-packages/absl/flags/_flag.py", line 196, in _set_default
self.default = self._parse(value)
File "/usr/local/lib/python2.7/dist-packages/absl/flags/_flag.py", line 169, in _parse
'flag --%s=%s: %s' % (self.name, argument, e))
absl.flags._exceptions.IllegalFlagValueError: flag --batch_size=128: ('Non-boolean argument to boolean flag', 128)
How did others solve this problem?
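Why is the training data deliberately rendered as white text on a black background?
Is it because most of the array is then 0, which would make training more efficient? Does it affect recognition accuracy compared with black text on a white background?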
In gen_printed_char.py, line 394:
    if rotate < 0:
        roate = - rotate
should be
    if rotate < 0:
        rotate = - rotate
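How is chinese_labels generated?
What is the general approach to classifying dozens of types of bills?
Is it possible to classify them automatically with deep learning, without configuring header regions and the like? I see that Baidu's custom templates use at least four reference regions: http://ai.baidu.com/forum/topic/show/497372
Still unresolved. Has anyone solved this?
I see that gen_printed_char produces images at 30x30 resolution. Isn't that a bit low for some complex Chinese characters? The images are later resized to 64x64. Could I have gen_printed_char produce 64x64 images directly? The network would not need to be modified, would it?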
Dear AstarLight:
I am using some low-resolution images (13*13) to train the engine. I am not sure which of the following ways is better, and why.
usage: gen_printed_char.py [-h] ---out_dir OUT_DIR --font_dir FONT_DIR
[--test_ratio TEST_RATIO] --width 30 --height 30
[--no_crop] [--margin MARGIN] [--rotate ROTATE]
[--rotate_step ROTATE_STEP] [--need_aug]
gen_printed_char.py: error: the following arguments are required: ---out_dir, --font_dir, --width, --height
I got this error. Which part of the code did everyone change?
TypeError: Using a `tf.Tensor` as a Python `bool` is not allowed. Use `if t is not None:` instead of `if t:` to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor.
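To the author: how is the chinese_labels file generated? When I open it I see only a pile of numbers and no Chinese characters, so I don't know how to generate it. Any guidance would be much appreciated. Thanks!
As in the title.
How can I add labels for English letters and Arabic numerals to chinese_labels? Could someone suggest an approach?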
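One possible approach is sketched below. It assumes that chinese_labels is a pickled dict mapping integer class IDs to characters, which the source does not confirm. The helper name extend_label_dict, the three-entry stand-in dictionary, and the output file name are all hypothetical. New classes would also need matching training images and a larger output layer, neither of which this sketch covers.

```python
import os
import pickle
import string
import tempfile


def extend_label_dict(label_dict, extra_chars):
    """Return a copy of label_dict with extra_chars appended under new integer IDs."""
    extended = dict(label_dict)
    existing = set(extended.values())
    next_id = max(extended) + 1 if extended else 0
    for ch in extra_chars:
        if ch not in existing:  # skip characters that already have a label
            extended[next_id] = ch
            existing.add(ch)
            next_id += 1
    return extended


# Stand-in for the real dictionary (assumed format: {class_id: character}).
base = {0: '一', 1: '丁', 2: '七'}
extended = extend_label_dict(base, string.digits + string.ascii_letters)

# Write the result in binary mode so that pickle.load can read it back.
out_path = os.path.join(tempfile.mkdtemp(), 'chinese_labels_extended')
with open(out_path, 'wb') as f:
    pickle.dump(extended, f)

print(len(extended), extended[3], extended[13])
```

Appending new IDs after the largest existing one keeps every existing class ID unchanged, so labels already used by a dataset stay valid.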
/Applications/anaconda3/envs/tensorflow/bin/python3 /Users/fsl/CPS-OCR-Engine/ocr/Chinese_OCR.py --mode=train --max_steps=16002 --eval_steps=100 --save_steps=500
2019-01-25 15:21:59.993717: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
WARNING:tensorflow:From /Users/fsl/CPS-OCR-Engine/ocr/Chinese_OCR.py:91: slice_input_producer (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by `tf.data`. Use `tf.data.Dataset.from_tensor_slices(tuple(tensor_list)).shuffle(tf.shape(input_tensor, out_type=tf.int64)[0]).repeat(num_epochs)`. If `shuffle=False`, omit the `.shuffle(...)`.
train
Begin training
./dataset/train/03755
./dataset/test/03755
WARNING:tensorflow:From /Applications/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/input.py:372: range_input_producer (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by `tf.data`. Use `tf.data.Dataset.range(limit).shuffle(limit).repeat(num_epochs)`. If `shuffle=False`, omit the `.shuffle(...)`.
WARNING:tensorflow:From /Applications/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/input.py:318: input_producer (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by `tf.data`. Use `tf.data.Dataset.from_tensor_slices(input_tensor).shuffle(tf.shape(input_tensor, out_type=tf.int64)[0]).repeat(num_epochs)`. If `shuffle=False`, omit the `.shuffle(...)`.
WARNING:tensorflow:From /Applications/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/input.py:188: limit_epochs (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by `tf.data`. Use `tf.data.Dataset.from_tensors(tensor).repeat(num_epochs)`.
WARNING:tensorflow:From /Applications/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/input.py:197: QueueRunner.init (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
WARNING:tensorflow:From /Applications/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/training/input.py:197: add_queue_runner (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
WARNING:tensorflow:From /Users/fsl/CPS-OCR-Engine/ocr/Chinese_OCR.py:101: shuffle_batch (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by `tf.data`. Use `tf.data.Dataset.shuffle(min_after_dequeue).batch(batch_size)`.
WARNING:tensorflow:From /Users/fsl/CPS-OCR-Engine/ocr/Chinese_OCR.py:187: start_queue_runners (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
:::Training Start:::
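After that it stays stuck here. I let it run all afternoon and it never went any further. What is the problem? Thanks.
Running the test directly gave the following error. Has anyone else run into it?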
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value conv3_5/BatchNorm/moving_mean
[[Node: conv3_5/BatchNorm/moving_mean/read = IdentityT=DT_FLOAT, _class=["loc:@conv3_5/BatchNorm/moving_mean"], _device="/job:localhost/replica:0/task:0/gpu:0"]]
the step 5799.0 takes 8.73126506805 loss 0.443596959114
the step 5800.0 takes 8.83500409126 loss 0.442320615053
the step 5801.0 takes 8.59880495071 loss 0.410989582539
===============Eval a batch=======================
the step 5801.0 test accuracy: 0.0
===============Eval a batch=======================
the step 5802.0 takes 8.64341807365 loss 0.505015850067
the step 5803.0 takes 8.5961420536 loss 0.405117839575
the step 5804.0 takes 8.67368292809 loss 0.427048921585
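I have trained for more than 5800 steps and my loss matches the curve in the README, but why is my accuracy always 0.0? Shouldn't it keep rising?
Line 45, tf.app.flags.DEFINE_boolean('batch_size', 128, 'Validation batch size'), raises an error on OS, Python 3.6, TensorFlow 1.4: the value must be a BOOL.
I would also like to ask the author: in the output, ACCURACY is always 0.0 while the LOSS value keeps decreasing. Is this normal?
Thanks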
The command for generating the dataset does not use the same image size as the training code:
python gen_printed_char.py --out_dir ./dataset --font_dir ./chinese_fonts --width 30 --height 30 --margin 4 --rotate 30 --rotate_step 1
so the 30*30 images cause an error in the next step, training the model.
The correct way to get a working model is to change the command to:
python gen_printed_char.py --out_dir ./dataset --font_dir ./chinese_fonts --width 64 --height 64 --margin 4 --rotate 30 --rotate_step 1
Why do I keep getting no module named "PIL"? I have already installed the PIL library. No error is reported in the script, but it fails when run from cmd.
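A frequent cause of this symptom is that cmd starts a different Python interpreter from the one the library was installed into, although the source does not confirm that this is the case here. The sketch below, which assumes Python 3, prints which interpreter is running and whether PIL can be found from it. The helper name describe_module is hypothetical.

```python
import importlib.util
import sys


def describe_module(name):
    """Report which interpreter is running and whether `name` is importable from it."""
    spec = importlib.util.find_spec(name)
    return {
        'interpreter': sys.executable,
        'found': spec is not None,
        'location': getattr(spec, 'origin', None),
    }


info = describe_module('PIL')
print(info)
```

Running this once from cmd and once from the environment where the script works should show whether the two interpreter paths differ. The PIL module is provided by the Pillow package, so installing it with the interpreter that cmd uses (python -m pip install Pillow) is one thing to try.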
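I have a question for you: the thesis topic my advisor gave me is very close to what you have done, but I don't understand it. Could you give me some guidance? I am afraid I won't be able to graduate, and it is stressing me out.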
3754 龟
一 0
Traceback (most recent call last):
File "gen_printed_char.py", line 424, in
image = font2image.do(verified_font_path, char, rotate=k)
File "gen_printed_char.py", line 295, in do
np_img)
File "gen_printed_char.py", line 210, in do
ret_img = self.put_img_into_center(norm_img, resized_cv2_img)
File "gen_printed_char.py", line 165, in put_img_into_center
start_width:start_width + width_small] = img_small
TypeError: slice indices must be integers or None or have an index method
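This TypeError is typical of Python 2 code run on Python 3, where `/` returns a float even when both operands are integers. The sketch below is an assumption about what put_img_into_center computes, since the source does not show its body: the offsets that center a small image inside a larger one, with `//` used so that the slice indices stay integers. Nested lists stand in for the NumPy arrays, and the function names are hypothetical.

```python
def center_offsets(large_size, small_size):
    """Return (start_height, start_width) that center a small image in a large one."""
    height_large, width_large = large_size
    height_small, width_small = small_size
    # `//` is floor division: it returns an int on both Python 2 and Python 3,
    # whereas `/` returns a float on Python 3 and breaks slicing.
    start_height = (height_large - height_small) // 2
    start_width = (width_large - width_small) // 2
    return start_height, start_width


def put_into_center(img_large, img_small):
    """Copy img_small (a list of rows) into the center of img_large, in place."""
    start_height, start_width = center_offsets(
        (len(img_large), len(img_large[0])),
        (len(img_small), len(img_small[0])))
    for row_idx, row in enumerate(img_small):
        target = img_large[start_height + row_idx]
        target[start_width:start_width + len(row)] = row
    return img_large


canvas = [[0] * 6 for _ in range(6)]
patch = [[1, 1], [1, 1]]
put_into_center(canvas, patch)
print(canvas[2])
```

Wrapping the existing expressions in int(...) would also work; `//` is used here because it behaves the same on both Python versions.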
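Recognition is completely correct even with interference from a red stamp. Did you first remove the red part based on the color channel?
Many people have mentioned the problem of accuracy not changing, and I ran into it too: accuracy = 0. I am on Windows. I checked the encoding and it is fine. When I changed the number of classes, accuracy was no longer 0: with 2 classes it is around 0.5, and with 10 classes it is around 0.1. This suggests that the training set or the test set is using a single label, but after carefully checking the labels I found nothing wrong with them either. In the end I could not find the cause. Could you help me work out what is making the accuracy wrong? Thanks.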
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value fc2/weights
[[node fc2/weights/read (defined at Chinese_OCR.py:133) ]]
[[node TopKV2 (defined at Chinese_OCR.py:153) ]]
help me, thanks
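As in the title. Thank you for open-sourcing this.
I can only tell that the author uses Python 2, so could the author please explain how to set up the environment? Thanks.
When I try to train, I get this error: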
IllegalFlagValueError: flag --batch_size=128: ('Non-boolean argument to boolean flag', 128)
Is this based on TensorFlow?
Traceback (most recent call last):
File "Chinese_OCR.py", line 411, in
tf.app.run()
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "Chinese_OCR.py", line 381, in main
train()
File "Chinese_OCR.py", line 176, in train
train_feeder = DataIterator(data_dir='./dataset/train/')
File "Chinese_OCR.py", line 69, in init
self.labels = [int(file_name[len(data_dir):].split(os.sep)[0]) for file_name in self.image_names]
ValueError: invalid literal for int() with base 10: '.DS_Store'
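The ValueError shows that a macOS metadata file, .DS_Store, was picked up while building the label list and its name was passed to int(). Below is a minimal sketch of one way to skip such files, assuming the dataset layout is data_dir/<numeric class ID>/<image>. The function name list_labeled_images and the throwaway directory tree are hypothetical and are not the repository's DataIterator.

```python
import os
import tempfile


def list_labeled_images(data_dir):
    """Return (image_paths, labels), skipping hidden files such as .DS_Store."""
    image_names, labels = [], []
    for root, _dirs, files in os.walk(data_dir):
        for name in sorted(files):
            if name.startswith('.'):
                continue  # macOS Finder metadata, not an image
            path = os.path.join(root, name)
            # The class ID is the first directory below data_dir, e.g. "00012".
            class_dir = os.path.relpath(path, data_dir).split(os.sep)[0]
            if not class_dir.isdigit():
                continue  # file is not inside a numeric class directory
            image_names.append(path)
            labels.append(int(class_dir))
    return image_names, labels


# Demonstration on a throwaway directory tree (hypothetical layout).
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, '00000'))
os.makedirs(os.path.join(demo_root, '00001'))
for rel in ('.DS_Store', os.path.join('00000', '1.png'),
            os.path.join('00001', '2.png'), os.path.join('00001', '.DS_Store')):
    open(os.path.join(demo_root, rel), 'w').close()

names, labels = list_labeled_images(demo_root)
print(sorted(labels))
```

Deleting the .DS_Store files from the dataset directory should avoid the error as well, but Finder may recreate them.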
Caused by op 'fc2/BatchNorm/moving_mean/read', defined at:
File "Chinese_OCR.py", line 460, in
tf.app.run()
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "Chinese_OCR.py", line 444, in main
final_predict_val, final_predict_index = inference(name_list)
File "Chinese_OCR.py", line 406, in inference
graph = build_graph(top_k=3)
File "Chinese_OCR.py", line 159, in build_graph
scope='fc2')
File "/usr/local/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
return func(*args, **current_args)
File "/usr/local/lib/python3.5/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1866, in fully_connected
outputs = normalizer_fn(outputs, **normalizer_params)
File "/usr/local/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
return func(*args, **current_args)
File "/usr/local/lib/python3.5/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 597, in batch_norm
scope=scope)
File "/usr/local/lib/python3.5/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 354, in _fused_batch_norm
collections=moving_mean_collections)
File "/usr/local/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
return func(*args, **current_args)
File "/usr/local/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 350, in model_variable
aggregation=aggregation)
File "/usr/local/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
return func(*args, **current_args)
File "/usr/local/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 277, in variable
aggregation=aggregation)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 1479, in get_variable
aggregation=aggregation)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 1220, in get_variable
aggregation=aggregation)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 530, in get_variable
return custom_getter(**custom_getter_kwargs)
File "/usr/local/lib/python3.5/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1750, in layer_variable_getter
return _model_variable_getter(getter, *args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1741, in _model_variable_getter
aggregation=aggregation)
File "/usr/local/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
return func(*args, **current_args)
File "/usr/local/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 350, in model_variable
aggregation=aggregation)
File "/usr/local/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
return func(*args, **current_args)
File "/usr/local/lib/python3.5/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 277, in variable
aggregation=aggregation)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 499, in _true_getter
aggregation=aggregation)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 911, in _get_single_variable
aggregation=aggregation)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/ops/variables.py", line 213, in call
return cls._variable_v1_call(*args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/ops/variables.py", line 176, in _variable_v1_call
aggregation=aggregation)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/ops/variables.py", line 155, in
previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py", line 2495, in default_variable_creator
expected_shape=expected_shape, import_scope=import_scope)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/ops/variables.py", line 217, in call
return super(VariableMetaclass, cls).call(*args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/ops/variables.py", line 1395, in init
constraint=constraint)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/ops/variables.py", line 1557, in _init_from_args
self._snapshot = array_ops.identity(self._variable, name="read")
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py", line 81, in identity
ret = gen_array_ops.identity(input, name=name)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3890, in identity
"Identity", input=input, name=name)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
op_def=op_def)
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1801, in init
self._traceback = tf_stack.extract_stack()
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value fc2/BatchNorm/moving_mean
[[node fc2/BatchNorm/moving_mean/read (defined at Chinese_OCR.py:159) ]]
Where can I download the data? There is no corresponding datasets/train folder.