psenet's People

Contributors

rosesakurai, whai362, xieenze, yeshenglong1

psenet's Issues

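Starting vertex of the output detection coordinates

The detection coordinates are written to a txt file, but the starting vertex of each detected text box is not fixed: some boxes start from the bottom-left vertex, others from the bottom-right. How can I make every line of the txt file start from the same vertex?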
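
One possible way to get a fixed starting vertex is to reorder the four points of each box before writing them out. The sketch below is not taken from the repository; `order_quad` is a hypothetical helper that assumes each box is a convex quadrilateral given as four (x, y) points in image coordinates (y pointing down). It sorts the points clockwise and rotates the list so that the point with the smallest x + y comes first.

```python
import math


def order_quad(points):
    """Return the 4 points clockwise on screen, starting near the top-left.

    Hypothetical helper: `points` is a list of four (x, y) tuples.
    """
    cx = sum(x for x, _ in points) / 4.0
    cy = sum(y for _, y in points) / 4.0
    # With y pointing down, increasing atan2 angle means clockwise on screen.
    ordered = sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    # Use the point with the smallest x + y as the starting vertex.
    start = min(range(4), key=lambda i: ordered[i][0] + ordered[i][1])
    return ordered[start:] + ordered[:start]


# The same box listed from two different starting vertices.
box_a = [(0, 5), (0, 0), (10, 0), (10, 5)]
box_b = [(10, 5), (0, 5), (0, 0), (10, 0)]
print(order_quad(box_a))
print(order_quad(box_b))
```

For boxes rotated by about 45 degrees two vertices can tie on x + y, so a different tie-break rule may be needed there.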

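Detection on curved-text datasets

Hello, the current code only seems to include loading, evaluation, and testing for the ICDAR2015 dataset. How should I test on curved-text datasets (CTW-1500 or Total-Text)?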

Where does gt.zip come from, and any suggestions about this result?

1. I trained on ICDAR2015 Ch4 for 600 epochs with batch_size changed to 32. The log looks like this:
    0.000010 0.394331 0.889751 0.852450
    0.000010 0.416218 0.898702 0.863312
    0.000010 0.406228 0.874345 0.837213
    0.000010 0.378316 0.900977 0.864393

2. Testing produced some interesting labels like this:
image

3. The eval result is
    Calculated!{"recall": 0.0, "precision": 0.0, "hmean": 0, "AP": 0}
    What is wrong? Everything follows the README; I only changed batch_size from 16 to 32, and the training result matches your log record.

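Great work

Is there any code to reproduce this? I am very interested in this work and look forward to the further release of the code and the paper.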

About OHEM

Dear author,
It is an honor to read your work, Shape Robust Text Detection with Progressive Scale Expansion Network, which is excellent. However, I am a little confused about how to apply OHEM to a segmentation task, since it was originally designed for detection.
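
One common way to carry OHEM over to segmentation is to treat every pixel as a sample: keep all text pixels and only the hardest background pixels, then compute the loss on that subset. The sketch below illustrates that general idea and is not the author's code. `ohem_mask` and the default `neg_pos_ratio=3` are assumptions, and the inputs are flattened lists rather than image tensors.

```python
def ohem_mask(scores, gt_text, neg_pos_ratio=3):
    """Return a 0/1 list marking the pixels that take part in the loss.

    scores:  predicted text probability per pixel (flattened).
    gt_text: 1 for text pixels, 0 for background (flattened).
    neg_pos_ratio is an assumed hyperparameter, not taken from the source.
    """
    pos_idx = [i for i, g in enumerate(gt_text) if g == 1]
    neg_idx = [i for i, g in enumerate(gt_text) if g == 0]
    num_neg = min(len(neg_idx), neg_pos_ratio * len(pos_idx))
    # Hardest negatives: background pixels with the highest text score.
    hardest = sorted(neg_idx, key=lambda i: scores[i], reverse=True)[:num_neg]
    selected = set(pos_idx) | set(hardest)
    return [1 if i in selected else 0 for i in range(len(gt_text))]


scores = [0.9, 0.8, 0.1, 0.7, 0.2, 0.05]
gt_text = [1, 0, 0, 0, 0, 0]
print(ohem_mask(scores, gt_text))
```

The resulting mask would then multiply both the prediction and the ground truth before the segmentation loss is computed.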

High resolution is needed to get good results

Hi, we have implemented your method in TensorFlow. We find that to get good results we have to resize the image to a very large size, so it is not very efficient in practice. We now use your method to detect long text at large angles; for normal text and horizontal long text lines, we have a much faster method.

Loading CTW1500

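import numpy as np
import util  # helper library used by the repository, as in the original snippet


def get_bboxes(img, gt_path):
    """Load CTW1500 polygons, normalized to [0, 1] by the image size."""
    h, w = img.shape[0:2]
    lines = util.io.read_lines(gt_path)
    bboxes = []
    tags = []
    for line in lines:
        # Strip the UTF-8 byte order mark if present.
        line = util.str.remove_all(line, '\xef\xbb\xbf')
        gt = util.str.split(line, ',')

        # The first two values are used as an (x, y) origin.
        # np.int was removed from recent NumPy releases, so use int.
        x1 = int(gt[0])
        y1 = int(gt[1])

        # Values 4..31 are 14 (x, y) offsets relative to that origin.
        bbox = [int(gt[i]) for i in range(4, 32)]
        bbox = np.asarray(bbox) + ([x1 * 1.0, y1 * 1.0] * 14)
        bbox = np.asarray(bbox) / ([w * 1.0, h * 1.0] * 14)

        bboxes.append(bbox)
        tags.append(True)
    return np.array(bboxes), tags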

image
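What is the CTW-1500 annotation file format? I could not find a detailed explanation. Each line has 32 values; 14 points account for 28 coordinate values, so what are the other 4 values?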
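
Going by the loader above, the first two values act as an (x, y) origin and the last 28 values are 14 point offsets added to that origin. The sketch below parses one line on that basis. Treating the four leading values as an enclosing box (xmin, ymin, xmax, ymax) is an assumption about the dataset, not something the snippet confirms, and `parse_ctw1500_line` is a hypothetical name.

```python
def parse_ctw1500_line(line):
    """Split one 32-value annotation line into a leading box and 14 points.

    Assumption: values[0:4] describe an enclosing box whose first two
    entries are the origin; values[4:32] are (dx, dy) offsets from it.
    """
    values = [int(v) for v in line.lstrip('\ufeff').strip().split(',')]
    if len(values) != 32:
        raise ValueError('expected 32 values, got %d' % len(values))
    x0, y0 = values[0], values[1]
    offsets = values[4:]
    points = [(x0 + offsets[i], y0 + offsets[i + 1]) for i in range(0, 28, 2)]
    return values[:4], points


# Synthetic line for illustration only, not real dataset content.
sample = '10,20,110,60,' + ','.join(str(v) for v in range(28))
box, points = parse_ctw1500_line(sample)
print(box)
print(points[0], points[-1])
```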

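Question about ls_loss

"ignore the pixels of non-text region in the segmentation result Sn to avoid a certain redundancy."

Only the pixels where Sn > 0.5 take part in this computation, but early in training Sn probably predicts nothing, so wouldn't ls_loss be 0? Are there more details?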
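
To make the question concrete, here is a small sketch of a masked dice loss. It is an illustration under assumptions, not the repository's implementation: the dice form with a smoothing constant `eps`, the function names, and the flattened list inputs are all hypothetical. The mask is 1 where the full-text prediction Sn reaches the threshold and 0 elsewhere.

```python
def dice_coefficient(pred, gt, mask, eps=1e-3):
    """Dice coefficient over masked pixels; eps is an assumed smoothing term."""
    inter = sum(p * g * m for p, g, m in zip(pred, gt, mask))
    pred_sq = sum((p * m) ** 2 for p, m in zip(pred, mask))
    gt_sq = sum((g * m) ** 2 for g, m in zip(gt, mask))
    return 2.0 * inter / (pred_sq + gt_sq + eps)


def kernel_loss(kernel_pred, kernel_gt, full_text_pred, threshold=0.5):
    """1 - dice, restricted to pixels where the full-text map fires."""
    mask = [1.0 if s >= threshold else 0.0 for s in full_text_pred]
    return 1.0 - dice_coefficient(kernel_pred, kernel_gt, mask)


kernel_pred = [1.0, 0.0, 1.0, 0.0]
kernel_gt = [1.0, 0.0, 0.0, 0.0]
# Early in training: Sn is below the threshold everywhere.
print(kernel_loss(kernel_pred, kernel_gt, [0.1, 0.1, 0.1, 0.1]))
# Later: Sn fires on the first two pixels.
print(kernel_loss(kernel_pred, kernel_gt, [0.9, 0.9, 0.1, 0.1]))
```

In this sketch an all-zero mask gives a constant loss rather than 0, and that constant carries no gradient for the kernel maps, so they would only start learning from this term once Sn exceeds the threshold somewhere.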

FPS

Running the author's test_ic15.py, I get only 0.65 fps. My GPU is a K40c with 12 GB. How can I improve the speed?

Network output

Why does the final network output not use sigmoid, and instead use outputs = (torch.sign(outputs - args.binary_th) + 1) / 2?
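
For reference, the expression is a hard threshold: values above binary_th map to 1, values below map to 0, and values exactly at the threshold map to 0.5. Because sigmoid is monotonic, thresholding the raw output at th selects the same pixels as thresholding sigmoid(output) at sigmoid(th). The plain-Python sketch below mirrors the expression without torch; `sign`, `binarize`, and the sample values are illustrative.

```python
import math


def sign(x):
    """Same convention as torch.sign: -1, 0 or 1."""
    return (x > 0) - (x < 0)


def binarize(value, th):
    """Scalar version of (sign(outputs - th) + 1) / 2."""
    return (sign(value - th) + 1) / 2


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


th = 1.0
for logit in [-2.0, 0.5, 1.0, 3.0]:
    hard = binarize(logit, th)
    same_side = (hard == 1.0) == (sigmoid(logit) > sigmoid(th))
    print(logit, hard, same_side)
```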

About ICDAR2015

Hi, I would like to know your ICDAR2015 result when fine-tuning from another dataset, because my own result is only 76% when training on ICDAR2015 alone.

Training Data used

Hi,

Thanks for sharing your work. Was the posted model trained only on ICDAR2015, or was it pre-trained on ImageNet/SynthText?

Thanks in advance.

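Model

Hello, the model links (both Baidu Cloud and OneDrive) open and download fine, but the downloaded file is corrupted and fails to extract. I tried on my own computer and on other people's, with the same result. @whai362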

ERROR: PyUnicodeUCS2_AsUTF8String

Is there any solution other than recompiling Python?
Traceback (most recent call last):
  File "/PSENet/test_ic15.py", line 19, in <module>
    from pse import pse
  File "/PSENet/pse/__init__.py", line 11, in <module>
    from .adaptor import pse as cpse
ImportError: */PSENet/pse/adaptor.so: undefined symbol: PyUnicodeUCS2_AsUTF8String
Environment:
  conda
  Python 2.7.13
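
The undefined symbol suggests that adaptor.so was compiled against a Python 2 built with 2-byte Unicode (UCS2), while the interpreter importing it is a 4-byte (UCS4) build. A quick way to see which kind of build an interpreter is, using only the standard library (`unicode_build` is a hypothetical helper name):

```python
import sys


def unicode_build(maxunicode=None):
    """Return 'UCS2' for narrow Unicode builds and 'UCS4' for wide builds."""
    if maxunicode is None:
        maxunicode = sys.maxunicode
    return 'UCS2' if maxunicode == 0xFFFF else 'UCS4'


print(unicode_build())
```

If the compile-time and run-time interpreters turn out to differ, rebuilding the pse extension with the same interpreter that runs test_ic15.py may be enough, without recompiling Python itself.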

MLT2017 data used for pretraining

The model trained on ICDAR2015 from the MLT2017-pretrained model cannot detect Chinese text. Did the MLT2017 pretraining leave out the Chinese data? Which MLT2017 datasets were used for pretraining? Thanks.

About the settings of your experiment

I use ResNet-152 as the backbone with a batch of 16x3x640x640, as advised by the paper. My GPU is an NVIDIA K40 with 12 GB of memory, but training raised "Out of Memory". I see that your experiment is based on the NVIDIA 1080 Ti, which has only 11 GB. Can you provide some details about the settings of the experiment?

About pretraining on ICDAR2017

Dear Author:
Thanks for the release of code!
However, I am a little confused about the pretraining on ICDAR2017 that is mentioned in the list of results on GitHub. It differs from the setting in the paper, in which ICDAR2017 and ICDAR2015 are mixed into a single training set. Can you provide more training details (epochs, lr, lr_scheduler, and so on) about the pretraining process on ICDAR2017? Thanks!
Best Wishes!

Is my result right?

ResNet50, 1s, no pre-training, using the ic_2015 weights provided by the author:
Calculated!{"recall": 0.7881559942224362, "precision": 0.8309644670050761, "hmean": 0.80899431677786, "AP": 0}
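
As a sanity check on the numbers themselves, hmean should be the harmonic mean of precision and recall. The short sketch below recomputes it from the two reported values; it only checks that the three numbers are consistent with one another and says nothing about whether they match the author's reported accuracy.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (the F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


recall = 0.7881559942224362
precision = 0.8309644670050761
reported = 0.80899431677786

value = hmean(precision, recall)
print(value, abs(value - reported) < 1e-9)
```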

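Data augmentation

The paper mentions:
image
Is a 640*640 patch cropped directly from the image here, or, as in EAST, is a region cropped at random and then its width and height resized directly to 640?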

Pretrained model "resnet101"

When training with the option --arch='resnet101', this error is raised:
  File "/root/data/workspace/PSENet/models/fpn_resnet.py", line 477, in resnet101
    pretrained_model = model_zoo.load_url(model_urls['resnet101'])
  File "/usr/local/lib/python2.7/dist-packages/torch/utils/model_zoo.py", line 65, in load_url
    hash_prefix = HASH_REGEX.search(filename).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
Is the link invalid?
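
The traceback fails while extracting a hash from the file name, before any download result is examined, so the link may be reachable and simply not follow the `name-<hash>.ext` naming that `load_url` expects. The sketch below reproduces the failure mode with the standard `re` module. The pattern is my recollection of the one in model_zoo.py and should be treated as an assumption, and the file names are hypothetical.

```python
import re

# Assumed shape of the pattern; check torch/utils/model_zoo.py in your install.
HASH_REGEX = re.compile(r'-([a-f0-9]*)\.')


def hash_prefix(filename):
    """Return the hex hash embedded in the file name, or None if absent."""
    match = HASH_REGEX.search(filename)
    return match.group(1) if match else None


print(hash_prefix('resnet101-5d3b4d8f.pth'))   # name with a hex hash suffix
print(hash_prefix('resnet101-imagenet.pth'))   # no hex hash before the dot
```

Calling `.group(1)` directly on a `None` match gives exactly the AttributeError shown in the traceback.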

license?

Please add a proper license file so the code can be reused legally. MIT, Apache 2, or 3-clause BSD seem to be the most popular choices.

(also you added code from wkentaro/pytorch-fcn which is MIT - so maybe MIT it is?)

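About training speed

I have two 1080 GPUs, batchsize=10, num_work=0, and I train on a dataset of about 30,000 images. The speed is painfully slow: according to the time shown above, training 400 epochs would take about 166 days. Is this normal? Is there any way to speed it up?
image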
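
A back-of-the-envelope breakdown of those numbers, assuming "about 30,000 images" means exactly 30,000 and that one epoch is one full pass over them:

```python
total_days = 166
epochs = 400
images = 30000      # assumed exact dataset size
batch_size = 10

seconds_total = total_days * 24 * 3600
seconds_per_epoch = seconds_total / epochs
iters_per_epoch = images / batch_size
seconds_per_iter = seconds_per_epoch / iters_per_epoch

print('hours per epoch:', seconds_per_epoch / 3600.0)
print('seconds per iteration:', seconds_per_iter)
```

That works out to roughly 10 hours per epoch and about 12 seconds per batch of 10 images. If num_work maps to the DataLoader's num_workers, a value of 0 means all data loading and label generation run in the main process, so that may be worth ruling out as the bottleneck before looking at the GPUs.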

About ICDAR2017

Could you share the details of your results on ICDAR2017?

Label Generation

Hello author, could you provide the code for Label Generation?
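
Until the author answers, here is a rough sketch of the numeric part of label generation as I understand it from the paper: each of the n kernels is the original polygon shrunk by an offset d_i = Area * (1 - r_i^2) / Perimeter, with scale r_i = 1 - (1 - m) * (n - i) / (n - 1). The formulas are quoted from memory, the defaults n=6 and m=0.5 are illustrative, and the function names are hypothetical. The shrinking itself needs a polygon clipping library and is not shown.

```python
import math


def polygon_area(points):
    """Absolute polygon area via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    return sum(math.hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]))


def shrink_offsets(points, n=6, m=0.5):
    """Offsets d_1..d_n; d_n is 0, so the last kernel is the full polygon."""
    area = polygon_area(points)
    perimeter = polygon_perimeter(points)
    offsets = []
    for i in range(1, n + 1):
        r_i = 1.0 - (1.0 - m) * (n - i) / (n - 1)
        offsets.append(area * (1.0 - r_i ** 2) / perimeter)
    return offsets


rect = [(0, 0), (10, 0), (10, 4), (0, 4)]
print(shrink_offsets(rect))
```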

Code

When will the code be uploaded?

Results of the first five layers are wrong

Hello, I use n=6, and during training I found that the results of the first five layers are worse than those of the last layer. I have trained for 20 epochs so far. Do I need to keep training, or is there a problem with how the Ls loss is written? Is the W in Ls required?
(seven screenshots attached)

A mistake about scale

In test_ic15.py, line 143:
scale = (org_img.shape[0] * 1.0 / pred.shape[0], org_img.shape[1] * 1.0 / pred.shape[1])
This may be wrong. shape[0] is the height and shape[1] is the width, so I think it should be:
scale = (org_img.shape[1] * 1.0 / pred.shape[1], org_img.shape[0] * 1.0 / pred.shape[0]) @whai362
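
A small worked example of why the order matters, assuming the scale tuple is later multiplied onto (x, y) points, which is how the issue reads. Shapes are given as plain (height, width) tuples so the sketch runs without NumPy, and `point_scale` is a hypothetical helper.

```python
def point_scale(org_shape, pred_shape):
    """Scale factors in (x, y) order for shapes given as (height, width, ...)."""
    org_h, org_w = org_shape[:2]
    pred_h, pred_w = pred_shape[:2]
    return (org_w * 1.0 / pred_w, org_h * 1.0 / pred_h)


org_shape = (720, 1280, 3)   # original image: height 720, width 1280
pred_shape = (360, 320)      # prediction map: height 360, width 320

sx, sy = point_scale(org_shape, pred_shape)
# The bottom-right corner of the prediction map should land on the
# bottom-right corner of the original image.
x, y = 320, 360
print((x * sx, y * sy))
```

When the height and width ratios are equal the two orders give the same result, which may be why the difference is easy to miss.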

test_ic15

Running test_ic15.py fails with the following error:

ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory
But my CUDA version is 9.2, so why is it looking for CUDA 8? Any help would be appreciated. @whai362

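A problem about pse

When compiling the pse-related files I ran into this error:
adaptor.cpp:35:86: error: ‘connectedComponents’ was not declared in this scope
The OpenCV in my environments is opencv-python installed via pip, versions 4.0.0 and 4.1.0.
The gcc and g++ version is 5.4.
What is causing this? Do I need to install an older version of OpenCV, or build OpenCV from source?
@whai362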
