
east-caffe's Introduction

EAST: An Efficient and Accurate Scene Text Detector

Introduction

This is a CAFFE re-implementation of EAST: An Efficient and Accurate Scene Text Detector.

Thanks to these projects:

The features are summarized below:

  • OpenCV DNN and Caffe inference demos.
  • Only the RBOX part is implemented.
  • MobileNet_v3 is used as the backbone.
  • NCNN and MNN deployment support. With NCNN int8 quantization, the model size can be 2M or less, which makes it well suited to deployment on mobile devices.

Contents

  1. Installation
  2. Download
  3. Train
  4. Demo
  5. Examples

Installation

  1. Any Caffe version later than 1.0 should work. (The SSD branch is suggested: https://github.com/weiliu89/caffe/tree/ssd)

  2. If the DiceCoefLoss layer is not supported, recompile Caffe with the Dice Coefficient Loss layer (https://github.com/HolmesShuan/A-Variation-of-Dice-coefficient-Loss-Caffe-Layer), or use the Python version, 'DiceCoefLossLayer' (the commented-out part of train.prototxt), as a substitute.

  3. The ReLU6 layer implementation can be found at https://github.com/chuanqi305/ssd

  4. Build geo_map_cython_lib (it accelerates the distance calculation in preprocessing):

    cd geo_map_cython_lib
    sh build_ext.sh

Download

  1. Models trained on ICDAR 2013 (training set)

     https://pan.baidu.com/s/1_daEvvt7ur3FdXVxVKSF9A (extraction code: krdu)

  2. Models trained on ICDAR 2015 (training set)

     https://pan.baidu.com/s/1DLTJDiRIqihE6ad5uiHEFA (extraction code: pn0w)

  3. Models trained on fake_idcard with single-character annotation

     https://pan.baidu.com/s/1KpG7xFPChyJMftAGR2SdYw (extraction code: m70q)

Train

To train the model, provide the dataset path. Inside it, the images and the ground-truth (gt) text files should be kept in separate folders, as shown below:

train_images\   train_gts\   test_images\   test_gts\

Each line of a gt file has the format:

x1,y1,x2,y2,x3,y3,x4,y4,recog_results

Then run:

python train.py --gpu 0 --initmodel my_model.caffemodel

If you have more than one GPU, you can pass the GPU ids to gpu_list (for example, --gpu 0,1,2,3).
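The gt line format described above can be read with a few lines of Python. This is a minimal sketch, not code from this repository: the function name `parse_gt_line` is hypothetical, and it assumes that any commas after the eighth coordinate belong to the transcription.

```python
from typing import List, Tuple


def parse_gt_line(line: str) -> Tuple[List[Tuple[float, float]], str]:
    """Parse one 'x1,y1,x2,y2,x3,y3,x4,y4,recog_results' line (hypothetical helper)."""
    # Remove surrounding whitespace, then a leading UTF-8 BOM if the file has one.
    cleaned = line.strip().lstrip("\ufeff")
    # Split at most 8 times so commas inside the transcription are preserved.
    parts = cleaned.split(",", 8)
    if len(parts) < 9:
        raise ValueError("expected 8 coordinates and a transcription: %r" % line)
    coords = [float(v) for v in parts[:8]]
    # Pair up (x, y) values into the four corner points of the quadrangle.
    corners = list(zip(coords[0::2], coords[1::2]))
    return corners, parts[8]


corners, text = parse_gt_line("377,117,463,117,465,130,378,130,Genaxis Theatre\n")
print(corners, text)
```

Running a check like this over every file in train_gts before training is one way to catch malformed annotation lines early.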

Demo

Put the pretrained model into the snapshot directory.

Then run

python demo.py 

Examples

demo on ic13

demo on ic15

demo on idcard

east-caffe's People

Contributors

surfzjy


east-caffe's Issues

The trained model performs poorly on the training set.

I trained this model on ic15 with the code in this repository.
After 69100 iterations of training, the loss dropped to 0.00333686.

I1024 22:11:46.818552 4375 sgd_solver.cpp:105] Iteration 69180, lr = 0.001
I1024 22:11:53.286249 4375 solver.cpp:218] Iteration 69190 (1.54621 iter/s, 6.46743s/10 iters), loss = 0.00333686
I1024 22:11:53.286276 4375 solver.cpp:237] Train net output #0: Loss_rbox = 0.00646377 (* 1 = 0.00646377 loss)
I1024 22:11:53.286281 4375 solver.cpp:237] Train net output #1: Loss_score = 0.00405699 (* 0.01 = 4.05699e-05 loss)

However, the trained model performs poorly even on the training set.

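The final layer's output is NaN

Hello, thank you for open-sourcing EAST-caffe; it is very nice work.
However, while debugging with Caffe as the framework, I found that the values of the final layers (f_score and geo_concat) are all NaN. I then dumped the feature map of every layer and found that the outputs contain INF starting from conv_11_pw_hswish, and that the number of INF values keeps growing in the later layers, so the expected result can never be obtained.

My system is Ubuntu 16, Caffe is the weiliu89 SSD branch, the OpenCV version is 4.1.1, and I replaced the ReLU6 layer directly with a plain ReLU.
Did you re-implement the ReLU6 layer yourself, or did you also substitute a plain ReLU?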
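One possible contributor to the INF values, offered as an assumption rather than a confirmed diagnosis: h-swish is defined as x * ReLU6(x + 3) / 6, so its gate saturates at 6. If ReLU6 is swapped for a plain ReLU, the gate no longer saturates and the activation grows roughly quadratically for large inputs. The toy sketch below stacks the activation alone (no convolutions or weights, so it is not the network in this repository) to show how quickly that can exceed the float32 range.

```python
def relu(x):
    """Plain ReLU: unbounded above."""
    return max(0.0, x)


def relu6(x):
    """ReLU6: ReLU clipped at 6."""
    return min(max(0.0, x), 6.0)


def hswish(x, gate=relu6):
    """h-swish(x) = x * gate(x + 3) / 6, with ReLU6 as the standard gate."""
    return x * gate(x + 3.0) / 6.0


def stack(x, depth, gate):
    """Apply h-swish `depth` times in a row (toy model: activation only)."""
    for _ in range(depth):
        x = hswish(x, gate)
    return x


FLOAT32_MAX = 3.4028235e38

# With ReLU6 the gate saturates, so a large input passes through unchanged.
bounded = stack(10.0, 8, relu6)
# With plain ReLU the gate keeps growing, so the value blows up layer by layer.
unbounded = stack(10.0, 8, relu)
print(bounded, unbounded > FLOAT32_MAX)
```

If this is the cause, using a real ReLU6 layer (see the Installation section) instead of a plain ReLU would be the first thing to try.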

DataLayer data processing error

Traceback (most recent call last):
  File "train.py", line 27, in <module>
    train(args.initmodel, args.gpu)
  File "train.py", line 15, in train
    solver = caffe.AdamSolver('solver.prototxt')
  File "/home/csy/TextField/examples/EAST-caffe/pylayerUtils.py", line 49, in reshape
    self.data, self.score_map, self.geo_map = self.load()
ValueError: too many values to unpack

I looked at the get_whole_data() function, and it returns 5 values:

    if len(images) == batch_size:
        return images, image_fns, score_maps, geo_maps, training_masks
        images = []
        image_fns = []
        score_maps = []
        geo_maps = []
        training_masks = []

I removed training_masks and either one of image_fns and images from the return values, and then got the error below. Are the returned values wrong?
Traceback (most recent call last):
  File "train.py", line 27, in <module>
    train(args.initmodel, args.gpu)
  File "train.py", line 15, in train
    solver = caffe.AdamSolver('solver.prototxt')
  File "/home/csy/TextField/examples/EAST-caffe/pylayerUtils.py", line 45, in reshape
    top[0].reshape(self.data.shape)
AttributeError: 'list' object has no attribute 'shape'
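The first error is what Python raises when a five-value return is unpacked into three names. The sketch below reproduces that with a hypothetical stand-in, `get_whole_data_stub`, and then unpacks all five values while ignoring the unused ones. It assumes that `self.load()` ends up returning the same five values as get_whole_data(), which the report suggests but does not confirm.

```python
def get_whole_data_stub():
    """Hypothetical stand-in that returns five values, like the reported get_whole_data()."""
    images = [[0.0, 0.0]]
    image_fns = ["img_1.jpg"]
    score_maps = [[1.0]]
    geo_maps = [[0.5]]
    training_masks = [[1.0]]
    return images, image_fns, score_maps, geo_maps, training_masks


# Unpacking five values into three names reproduces the reported ValueError.
try:
    data, score_map, geo_map = get_whole_data_stub()
    unpack_error = None
except ValueError as err:
    unpack_error = str(err)

# Unpacking all five values and ignoring the unused ones avoids the error.
data, _image_fns, score_map, geo_map, _training_masks = get_whole_data_stub()
print(unpack_error, len(data))
```

The second error says that `self.data` is a plain Python list. A list has no `.shape`, so it would need to be converted to an array (for example with `numpy.asarray`) before `top[0].reshape(self.data.shape)` can work.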

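Unable to start training

Hello, I am training on the ICDAR15 dataset. Every time, it gets stuck at conv1_1 and, after a long while, exits on its own. Have you run into this problem?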

English download?

Hi there,

I'm afraid I only speak English. I would like to download your trained models but I can't understand the download page. How can I do that (apart from learning a new language)?

Thanks.

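Environment for running training

This project is great; it gives people learning text localization excellent source code to study. On Ubuntu 16.04 LTS with Python 2.7 and chuanqi305's SSD Caffe, I built geo_map_cython_lib, caffe, and pycaffe successfully, but running the training fails with the following error: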

$ python train.py --dataset ic15 -gpu 0
 File "/home/link/EAST-caffe/icdar.py", line 13, in <module>
    from geo_map_cython_lib import gen_geo_map
ImportError: No module named geo_map_cython_lib

Is my runtime environment different, or did I make a mistake somewhere?
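An ImportError like this usually means the built module is not on Python's import path. The sketch below checks whether a module name can be resolved, optionally after adding a directory to `sys.path`. The helper name `module_available` is hypothetical, the code assumes Python 3.4 or later (the report uses Python 2.7, where `importlib.util.find_spec` is not available), and where build_ext.sh places its output is not stated in this document.

```python
import importlib
import importlib.util
import sys


def module_available(name, extra_dir=None):
    """Return True if top-level module `name` can be found on sys.path (hypothetical helper)."""
    if extra_dir is not None and extra_dir not in sys.path:
        # Put the candidate build directory first so it is searched before other paths.
        sys.path.insert(0, extra_dir)
        # Make sure newly added or newly built files are picked up by the finders.
        importlib.invalidate_caches()
    return importlib.util.find_spec(name) is not None


print(module_available("geo_map_cython_lib"))
```

If this prints False from the directory where train.py is run, the build output is probably somewhere else; passing that directory as `extra_dir` is a quick way to confirm it.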
