surfzjy / east-caffe

A Caffe implementation of EAST text detector

License: MIT License

EAST: An Efficient and Accurate Scene Text Detector

Introduction

This is a Caffe re-implementation of EAST: An Efficient and Accurate Scene Text Detector.

Thanks to these projects:

The features are summarized below:

OpenCV_DNN / Caffe inference demo.
Only the RBOX part is implemented.
MobileNet_v3 is used as the backbone.
NCNN / MNN deployment support. With NCNN int8 quantization, the model size can be 2M or less, which makes it well suited to deployment on mobile devices.

Installation
Download
Train
Demo
Examples

Installation

Any version of Caffe > 1.0 should work (the SSD branch at https://github.com/weiliu89/caffe/tree/ssd is suggested).
If your Caffe build does not support the DiceCoefLoss layer, recompile Caffe with the Dice Coefficient Loss Layer (https://github.com/HolmesShuan/A-Variation-of-Dice-coefficient-Loss-Caffe-Layer) or use the Python version, 'DiceCoefLossLayer' (the commented-out part of train.prototxt), as a substitute.
The ReLU6 layer implementation can be found in https://github.com/chuanqi305/ssd
Build geo_map_cython_lib (accelerates the distance calculation in preprocessing):
```
cd geo_map_cython_lib
sh build_ext.sh
```
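For readers unfamiliar with the Dice coefficient loss mentioned above, here is a minimal NumPy sketch of the standard form, 1 - 2|P∩G| / (|P| + |G|). It is not the code of the linked Caffe layer, which is described as a variation. The function name `dice_loss` and the `eps` smoothing term are assumptions made for illustration.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-5):
    """Standard Dice loss for score maps with values in [0, 1].

    Hypothetical helper for illustration only; the linked Caffe layer
    may weight or smooth the terms differently.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    intersection = np.sum(pred * target)              # overlap between prediction and ground truth
    total = np.sum(pred) + np.sum(target) + eps       # eps avoids division by zero on empty maps
    return 1.0 - 2.0 * intersection / total
```

The loss is close to 0 when the predicted score map matches the ground truth and close to 1 when the two do not overlap at all.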

Download

Models trained on ICDAR 2013 (training set)

https://pan.baidu.com/s/1_daEvvt7ur3FdXVxVKSF9A ( extract code：krdu )

Models trained on ICDAR 2015 (training set)

https://pan.baidu.com/s/1DLTJDiRIqihE6ad5uiHEFA ( extract code：pn0w )

Models trained on fake_idcard with single character annotation

https://pan.baidu.com/s/1KpG7xFPChyJMftAGR2SdYw ( extract code：m70q )

Train

To train the model, provide the dataset path. Inside the dataset path, the images and the ground-truth (gt) text files should be kept in separate folders, as shown below:

train_images\   train_gts\   test_images\   test_gts\

The content format of the gt files is:

x1,y1,x2,y2,x3,y3,x4,y4,recog_results
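A gt entry in this format can be read with a few lines of Python. The sketch below is not the repository's loader. The function name `parse_gt_line` is hypothetical, and it assumes one annotation per line, with any commas after the eighth coordinate belonging to the recognition text.

```python
def parse_gt_line(line):
    """Parse 'x1,y1,x2,y2,x3,y3,x4,y4,recog_results' into (points, text).

    Hypothetical helper for illustration; not part of the repository.
    """
    # Strip a UTF-8 BOM and surrounding whitespace, which some annotation files carry.
    fields = line.lstrip("\ufeff").strip().split(",", 8)
    if len(fields) < 9:
        raise ValueError("expected 8 coordinates and a text field: %r" % line)
    coords = [float(v) for v in fields[:8]]
    # Group the flat coordinate list into four (x, y) corner points.
    points = [(coords[i], coords[i + 1]) for i in range(0, 8, 2)]
    return points, fields[8]
```

Splitting at most eight times keeps transcriptions such as `a,b` intact instead of cutting them at the comma.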

Then run:

python train.py --gpu 0 --initmodel my_model.caffemodel

If you have more than one GPU, you can pass a comma-separated list of GPU ids (for example, --gpu 0,1,2,3).
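A comma-separated GPU list like the one above can be turned into integer ids with `argparse`. This is a sketch of one way to do it, not the repository's train.py. Only the flag names `--gpu` and `--initmodel` come from the README; the helper names `parse_gpu_ids` and `build_parser` are hypothetical.

```python
import argparse

def parse_gpu_ids(value):
    """Turn a string such as '0,1,2,3' into [0, 1, 2, 3]."""
    return [int(v) for v in value.split(",") if v.strip()]

def build_parser():
    # Hypothetical parser; only the flag names --gpu and --initmodel come from the README.
    parser = argparse.ArgumentParser()
    parser.add_argument("--gpu", type=parse_gpu_ids, default=[0])
    parser.add_argument("--initmodel", default=None)
    return parser

args = build_parser().parse_args(
    ["--gpu", "0,1,2,3", "--initmodel", "my_model.caffemodel"]
)
print(args.gpu)  # [0, 1, 2, 3]
```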

Demo

Put the pretrained model into the snapshot directory.

Then run

python demo.py

Examples


east-caffe's Issues

The performance of the trained model is poor on the training set

I trained this model on ic15 with the code in this repository.
After 69100 training iterations, the loss dropped to 0.00333686.

I1024 22:11:46.818552 4375 sgd_solver.cpp:105] Iteration 69180, lr = 0.001
I1024 22:11:53.286249 4375 solver.cpp:218] Iteration 69190 (1.54621 iter/s, 6.46743s/10 iters), loss = 0.00333686
I1024 22:11:53.286276 4375 solver.cpp:237] Train net output #0: Loss_rbox = 0.00646377 (* 1 = 0.00646377 loss)
I1024 22:11:53.286281 4375 solver.cpp:237] Train net output #1: Loss_score = 0.00405699 (* 0.01 = 4.05699e-05 loss)

However, the performance of the trained model is poor even on the training set.

Caffe API test result differs from the DNN API

I ran my model with demo.py.
The model was trained with Caffe.
The result looks right when I choose the DNN API, but it is wrong when I switch to the Caffe API.
How can I solve this problem?

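The output of the final layers is NaN

Hello, and thank you for open-sourcing EAST-caffe; it is very nice work.
However, while debugging with Caffe as the framework, I found that the values of the final layers (f_score and geo_concat) are all NaN. I then dumped the feature map of every layer and found that the outputs contain INF starting from conv_11_pw_hswish, and that the number of INF values keeps growing in the following layers, so in the end I cannot get the desired result.

My system is Ubuntu 16, Caffe is the weiliu89 SSD Caffe, the OpenCV version is 4.1.1, and I replaced the ReLU6 layer directly with a plain ReLU.
Did you re-implement the ReLU6 layer yourself, or did you also substitute a plain ReLU?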
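One possible explanation, offered as an unconfirmed sketch rather than a diagnosis: h-swish is defined for MobileNetV3 as x * ReLU6(x + 3) / 6, so the ReLU6 clamp keeps the gate in [0, 1]. If the network's hswish blocks are built from ReLU6 layers (an assumption about this repository's prototxt), swapping in a plain ReLU removes the clamp and the activation grows quadratically, which could overflow after several layers. The NumPy snippet below only illustrates the arithmetic.

```python
import numpy as np

def relu6(x):
    return np.minimum(np.maximum(x, 0.0), 6.0)

def relu(x):
    return np.maximum(x, 0.0)

def hswish(x, gate=relu6):
    # h-swish as defined for MobileNetV3: x * ReLU6(x + 3) / 6
    return x * gate(x + 3.0) / 6.0

x = np.array([1.0, 10.0, 100.0, 1000.0], dtype=np.float32)
print(hswish(x))             # grows linearly: equals x once x >= 3
print(hswish(x, gate=relu))  # grows quadratically: x * (x + 3) / 6
```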

Training model and documentation

@SURFZJY What is the status of the code?
Will you add documentation and pre-trained models?

Traceback (most recent call last):
File "train.py", line 27, in
train(args.initmodel, args.gpu)
File "train.py", line 15, in train
solver = caffe.AdamSolver('solver.prototxt')
File "/home/csy/TextField/examples/EAST-caffe/pylayerUtils.py", line 49, in reshape
self.data, self.score_map, self.geo_map = self.load()
ValueError: too many values to unpack
I looked at the get_whole_data() function; it returns 5 values:
if len(images) == batch_size:
return images, image_fns, score_maps, geo_maps, training_masks
images = []
image_fns = []
score_maps = []
geo_maps = []
training_masks = []
Here I removed training_masks and either one of image_fns and images from the return statement, and then the error below appears, so are the returned values wrong?
Traceback (most recent call last):
File "train.py", line 27, in
train(args.initmodel, args.gpu)
File "train.py", line 15, in train
solver = caffe.AdamSolver('solver.prototxt')
File "/home/csy/TextField/examples/EAST-caffe/pylayerUtils.py", line 45, in reshape
top[0].reshape(self.data.shape)
AttributeError: 'list' object has no attribute 'shape'
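The two tracebacks are consistent with two separate problems: a 5-tuple being unpacked into three names, and Python lists being used where NumPy arrays are expected. The sketch below reproduces the pattern with a stand-in loader. `get_whole_data_stub` and its array shapes are made up for illustration and are not the repository's code. Whether the layer should also consume training_masks is a question for the author.

```python
import numpy as np

def get_whole_data_stub(batch_size=2):
    # Stand-in for the data loader; shapes are made up for illustration.
    images = [np.zeros((3, 8, 8), dtype=np.float32) for _ in range(batch_size)]
    image_fns = ["img_%d.jpg" % i for i in range(batch_size)]
    score_maps = [np.zeros((1, 2, 2), dtype=np.float32) for _ in range(batch_size)]
    geo_maps = [np.zeros((5, 2, 2), dtype=np.float32) for _ in range(batch_size)]
    training_masks = [np.ones((1, 2, 2), dtype=np.float32) for _ in range(batch_size)]
    return images, image_fns, score_maps, geo_maps, training_masks

# Unpack all five values, then keep only the ones the layer needs.
images, _fns, score_maps, geo_maps, _masks = get_whole_data_stub()

# Convert lists of per-image arrays to single arrays so that .shape exists.
data = np.asarray(images)
score_map = np.asarray(score_maps)
geo_map = np.asarray(geo_maps)
print(data.shape, score_map.shape, geo_map.shape)
```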

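Unable to start training

Hello, I am training with the ICDAR15 dataset. Every time, it hangs at conv1_1 and, after a long while, exits on its own. Have you run into this problem?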

English download?

Hi there,

I'm afraid I only speak English. I would like to download your trained models but I can't understand the download page. How can I do that (apart from learning a new language)?

Thanks.

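Environment for running training

This project is great; it gives people learning text localization a very good codebase to study. On Ubuntu 16.04 LTS with Python 2.7 and the chuanqi305 SSD Caffe, I built geo_map_cython_lib, caffe, and pycaffe successfully, but running the training fails with the following error: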

$ python train.py --dataset ic15 -gpu 0
 File "/home/link/EAST-caffe/icdar.py", line 13, in 
    from geo_map_cython_lib import gen_geo_map
ImportError: No module named geo_map_cython_lib

Is my runtime environment different, or did I make a mistake somewhere?