whai362 / psenet
Official PyTorch implementations of PSENet.
License: Apache License 2.0
Can you open source the progressive scale expansion algorithm alone?
How should the coordinates be handled when skewed-text and curved-text datasets are trained together? There are many types of irregular text. Should the CTW1500 annotations be converted to 4 pairs of coordinates? And what about the superfluous background at inference?
By the way, can you explain the CTW1500 coordinates in detail? Thanks a lot.
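Looking at the author's icdar2015_loader.py, I found that images are resized via the random_crop function. What is the principle behind this, and is resizing this way reasonable?
Could anyone explain? Many thanks.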
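The output coordinates are written to a txt file, but the starting vertex of each detection is not consistent: some results start at the bottom-left vertex, others at the bottom-right. How can the starting vertex of each line in the txt file be made consistent?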
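One possible way to get a consistent starting vertex is to post-process each detected quadrilateral before writing it out. The sketch below is not part of the repository: the function name `order_quad` is hypothetical, and "top-left" is assumed to mean the vertex with the smallest x + y in image coordinates. It keeps the cyclic order of the points and only changes which one comes first; it does not normalize clockwise versus counter-clockwise direction.

```python
import numpy as np

def order_quad(points):
    """Rotate a 4-point polygon so that it starts at the top-left vertex.

    Assumption: "top-left" is the vertex with the smallest x + y
    (image coordinates, y grows downward). The cyclic order of the
    vertices is preserved; only the starting index changes.
    """
    pts = np.asarray(points, dtype=np.float64).reshape(4, 2)
    start = int(np.argmin(pts.sum(axis=1)))  # index of the top-left vertex
    return np.roll(pts, -start, axis=0)      # shift that vertex to position 0

# Hypothetical detection that starts at the bottom-left vertex.
quad = [[10, 50], [10, 10], [60, 10], [60, 50]]
ordered = order_quad(quad)
```

For nearly square boxes rotated by about 45 degrees the smallest x + y can be ambiguous, so a different tie-breaking rule may be needed for such cases.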
Hello, the current code only covers data loading, evaluation, and testing for the ICDAR2015 dataset. How should one test on curved-text datasets (CTW-1500 or Total-Text)?
Hi,
I cannot access Baidu Yun. It would be much appreciated if anyone could share the models on Dropbox, Google Drive, or OneDrive.
Thanks.
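2. During testing, I found some interesting labels in the results, like this:
Is there any code to reproduce this? I am very interested in this work and look forward to the further release of the code and paper.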
Dear author,
It is an honor to read your work, Shape Robust Text Detection with Progressive Scale Expansion Network, which is excellent. However, I am a little confused about how to apply OHEM to a segmentation task, as it was originally designed for detection.
@whai362 Where is the code for PSENet?
Please also include documentation on how to train.
Hi, we have implemented your method in TensorFlow. We find that to get good results we have to resize the image to a very large size, so it is not very efficient in practice. We now use your method to detect long text at large angles; for normal text and horizontal long text lines, we have a much faster method.
Waiting for the open-source release.
Hello, I am confused about your formula for calculating d, which uses r*r. May I ask about the relationship between r and d, and its proof? Thank you!
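import numpy as np
import util  # helper module shipped with the PSENet repository

def get_bboxes(img, gt_path):
    h, w = img.shape[0:2]
    lines = util.io.read_lines(gt_path)
    bboxes = []
    tags = []
    for line in lines:
        # Strip the UTF-8 byte-order mark (written as raw bytes, Python 2 style).
        line = util.str.remove_all(line, '\xef\xbb\xbf')
        gt = util.str.split(line, ',')
        # Fields 0 and 1 are the reference corner (x1, y1).
        x1 = int(gt[0])  # np.int was removed in recent NumPy; the builtin int is equivalent here
        y1 = int(gt[1])
        # Fields 4..31 hold 14 (x, y) pairs given relative to (x1, y1).
        bbox = [int(gt[i]) for i in range(4, 32)]
        # Shift the pairs to absolute coordinates, then normalize by image width and height.
        bbox = np.asarray(bbox) + ([x1 * 1.0, y1 * 1.0] * 14)
        bbox = np.asarray(bbox) / ([w * 1.0, h * 1.0] * 14)
        bboxes.append(bbox)
        tags.append(True)
    return np.array(bboxes), tags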
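To make the arithmetic in get_bboxes easier to follow, here is a minimal self-contained sketch that repeats it for a single label line without the repository's `util` module. The function name `parse_ctw1500_line` and the sample values are hypothetical. It assumes, as the code above implies, that fields 0 and 1 are a reference corner and fields 4 to 31 are 14 (x, y) offsets from it, and it targets Python 3, where a byte-order mark in decoded text appears as '\ufeff'.

```python
import numpy as np

def parse_ctw1500_line(line, img_w, img_h):
    """Return a (14, 2) array of polygon points normalized to [0, 1].

    Assumption: fields 0-1 are the reference corner (x1, y1) and
    fields 4-31 are 14 (dx, dy) offsets relative to that corner.
    """
    gt = line.replace('\ufeff', '').strip().split(',')
    x1, y1 = int(gt[0]), int(gt[1])
    offsets = np.array([int(v) for v in gt[4:32]], dtype=np.float64).reshape(14, 2)
    absolute = offsets + [x1, y1]      # shift offsets to absolute pixel coordinates
    return absolute / [img_w, img_h]   # normalize x by width and y by height

# Hypothetical label line: 4 leading fields followed by 28 made-up offsets.
fields = [100, 200, 300, 260] + list(range(28))
sample_line = ",".join(str(v) for v in fields)
points = parse_ctw1500_line(sample_line, img_w=1000, img_h=500)
```

Unlike get_bboxes, this sketch returns the points as a (14, 2) array rather than a flat vector of 28 values; flattening it with `points.reshape(-1)` gives the same layout.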
ignore the pixels of non-text region in the segmentation result Sn to avoid a certain
redundancy.
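Only the pixels where Sn > 0.5 take part in this computation, but early in training Sn presumably predicts nothing, so wouldn't ls_loss be 0?
Are there more details?
Running the author's test_ic15.py, I get only 0.65 fps on a K40c GPU with 12 GB of memory. How can I improve the speed?
Hello, after compiling adaptor.so with the makefile, calling it raises the error "undefined symbol: _Py_ZeroStruct". What is going on?
Has anyone else run into this?
Thanks.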
Why does the network output not go through a sigmoid at the end, and instead use outputs = (torch.sign(outputs - args.binary_th) + 1) / 2?
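As a side note on what the quoted expression computes: it is a hard threshold on the raw outputs rather than a squashing function. The sketch below reproduces the same arithmetic with NumPy, using `np.sign` as a stand-in for `torch.sign`; the function name `binarize` and the sample scores are illustrative only, and the threshold value is an arbitrary choice, not the repository's setting.

```python
import numpy as np

def binarize(outputs, binary_th):
    # Same arithmetic as the quoted line: sign gives -1, 0, or +1,
    # and (sign + 1) / 2 maps these to 0, 0.5, or 1.
    return (np.sign(outputs - binary_th) + 1) / 2

scores = np.array([-2.0, 0.5, 1.0, 3.0])   # made-up raw outputs
mask = binarize(scores, binary_th=1.0)     # values: 0, 0, 0.5, 1
```

Values above the threshold become 1, values below become 0, and a value exactly equal to the threshold becomes 0.5. This only describes the expression; it does not explain why the author chose it over a sigmoid.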
Hi, I would like to know whether your ICDAR2015 result used fine-tuning from another dataset, because my own result is only 76% when training on ICDAR2015 alone.
Hi
Thanks for sharing your work. Was the posted model trained only on ICDAR2015, or was it pre-trained on ImageNet/SynthText?
Thanks in advance.
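I find that the FPN does not improve the results. Is that right?
What is the F-score on RCTW2017? Can you share some details?
Hello, the model links on both Baidu Yun and OneDrive can be opened and downloaded, but the downloaded files are corrupted and fail to decompress. I tried on my own computer and on other people's, with the same result. @whai362
Other than recompiling Python, is there any other solution?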
Traceback (most recent call last):
  File "/PSENet/test_ic15.py", line 19, in <module>
    from pse import pse
  File "/PSENet/pse/__init__.py", line 11, in <module>
    from .adaptor import pse as cpse
ImportError: */PSENet/pse/adaptor.so: undefined symbol: PyUnicodeUCS2_AsUTF8String
environment:
conda
Python 2.7.13
The model trained on ICDAR2015 from the MLT2017-pretrained model cannot detect Chinese words. Did the MLT2017 pretraining not use the Chinese data? Which MLT2017 datasets were used in pretraining? Thanks.
I use ResNet152 as the backbone with an input batch of 16x3x640x640, as advised by the paper, on an NVIDIA K40 with 12 GB of memory, but it raises "Out of Memory". I see that your experiments are based on the NVIDIA 1080 Ti, which has only 11 GB. Can you provide some details about the experimental settings?
Hi, can you tell me what the training_mask is?
Dear Author:
Thanks for the release of code!
However, I'm a little confused about the pretraining on ICDAR2017, which is mentioned in the list of results on GitHub. It differs from the setting in the paper, in which ICDAR2017 and ICDAR2015 are mixed into a single training set. Can you provide more training details (epochs, lr, lr_scheduler, and so on) about the pretraining process on ICDAR2017? Thanks!
Best Wishes!
I don't quite understand this and would appreciate some guidance: there are clearly no images this large during training.
ResNet50, 1s, no pre-training, author-provided ic_2015 weights:
Calculated!{"recall": 0.7881559942224362, "precision": 0.8309644670050761, "hmean": 0.80899431677786, "AP": 0}
In the ICPR MTWI 2018 Challenge 2, your result is F = 75.2. Which backbone did you select: resnet50, resnet101, or resnet152?
Hi, I want to know whether the output score map is 1x1x1 repeated n times, or 1x1xn. I look forward to your reply.
When training with the option --arch='resnet101', the following error is raised:
File "/root/data/workspace/PSENet/models/fpn_resnet.py", line 477, in resnet101
pretrained_model = model_zoo.load_url(model_urls['resnet101'])
File "/usr/local/lib/python2.7/dist-packages/torch/utils/model_zoo.py", line 65, in load_url
hash_prefix = HASH_REGEX.search(filename).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
Is the link invalid?
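The AttributeError in this traceback means that `HASH_REGEX.search(filename)` returned None, that is, the file name taken from the URL did not match the pattern the loader expects, which is a different condition from a dead link. The sketch below illustrates this with the standard `re` module. The pattern is an assumption based on recollection of older `torch.utils.model_zoo` sources, and the helper name `hash_prefix` and the file names are examples only.

```python
import re

# Assumption: older torch.utils.model_zoo versions extract a hash prefix
# from the file name with a pattern like this one.
HASH_REGEX = re.compile(r'-([a-f0-9]*)\.')

def hash_prefix(filename):
    """Return the hex prefix embedded in the file name, or None if absent."""
    match = HASH_REGEX.search(filename)
    return match.group(1) if match is not None else None

with_hash = hash_prefix('resnet101-5d3b4d8f.pth')  # name carries a "-<hex>." part
without_hash = hash_prefix('resnet101.pth')        # no such part, so None
```

Under this assumption, a URL whose file name lacks the "-<hex>." part would trigger the error shown above in loader versions that call `.group(1)` without checking for None, even if the file itself can be downloaded.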
Please add a proper license file so the code can be reused legally. MIT, Apache 2, or 3-clause BSD seem to be the most popular choices.
(also you added code from wkentaro/pytorch-fcn which is MIT - so maybe MIT it is?)
Could you tell me the details of your result on ICDAR2017?
Hello author, could you provide the code for Label Generation?
When will the code be uploaded?
Hello sir @whai362
This repository is 5 months old and, even without any source code in it, it still gets tons of stars.
So when will you upload the source code?
In test_ic15.py, line 143:
scale = (org_img.shape[0] * 1.0 / pred.shape[0], org_img.shape[1] * 1.0 / pred.shape[1])
Might this be wrong? Since shape[0] is h and shape[1] is w, I think it should be:
scale = (org_img.shape[1] * 1.0 / pred.shape[1], org_img.shape[0] * 1.0 / pred.shape[0]) @whai362
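The point raised here can be checked with a small sketch. For a NumPy image array, shape[0] is the height and shape[1] is the width, so points stored as (x, y) need the width ratio first and the height ratio second. The function name `scale_points` and the sample sizes below are hypothetical. Whether the original line is actually a bug depends on whether the points are held as (x, y) or as (row, col) at that place in test_ic15.py, which the issue does not show.

```python
import numpy as np

def scale_points(points_xy, org_shape, pred_shape):
    """Map (x, y) points from prediction-map to original-image coordinates.

    shape[0] is height and shape[1] is width, so x is scaled by the
    width ratio and y by the height ratio.
    """
    scale = (org_shape[1] * 1.0 / pred_shape[1],   # x: width ratio
             org_shape[0] * 1.0 / pred_shape[0])   # y: height ratio
    return np.asarray(points_xy, dtype=np.float64) * scale

# Hypothetical sizes: a 720x1280 (h x w) image and a 360x320 prediction map.
pts = scale_points([[10, 30]], org_shape=(720, 1280), pred_shape=(360, 320))
```

When the image and the prediction map have the same aspect ratio the two orderings give identical results, so the difference only shows up when the resize changes the aspect ratio.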
The model links posted for PSENet-1s (ResNet50) and 4s are the same. How do I choose the output kernel size?
Running test_ic15.py gives the following error:
ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory
But my CUDA version is 9.2, so why is it looking for CUDA 8? I would appreciate any help. @whai362
I ran into this error when compiling the pse files:
adaptor.cpp:35:86: error: ‘connectedComponents’ was not declared in this scope
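The OpenCV in my environments is opencv-python installed via pip, versions 4.0.0 and 4.1.0.
The gcc and g++ version is 5.4.
What is causing this problem?
Do I need to install an older version of OpenCV, or build OpenCV from source?
@whai362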