Comments (38)
Yes, please check the draw_result function in inference.py.
from pan-pytorch.
@liuch37 Thanks for replying. I checked the draw_result function and found that it only supports drawing the detection boxes on the image, but I still can't see what the text actually says. I thought it was an end-to-end model.
No, PAN is only for text detection, not including recognition.
@liuch37 Got it, thanks a lot for replying!
@liuch37 If I want to do recognition as well (a separate second stage is acceptable too), could you recommend a model? lol
You can also check my second-stage text recognition repository: https://github.com/liuch37/sar-pytorch. Or you can check this end-to-end single detection+recognition model: https://github.com/Yuliang-Liu/bezier_curve_text_spotting.
@liuch37 Your reply is a great help to me. Thanks a lot for your patience!
@liuch37 Hi! Sorry to bother you again. I ran into a problem when using PANnet to train on my own dataset. I converted my data to the IC15 format, but not all of my labels are rectangles; some of them are arbitrary polygons. How should I train with those labels, and is PANnet suitable for arbitrary polygons? Thanks again!
Yes, PAN supports arbitrary polygon training. You can modify the data generator in dataset/ic15.py, for example something like:
def get_ann(img, gt_path):
    ...
    for line in lines:
        ...
        word = gt[-1].replace('\r', '').replace('\n', '')
        ...
        # every field except the last one (the transcription) is a coordinate
        bbox = [int(gt[i]) for i in range(len(gt) - 1)]
        # divide x by the image width and y by the image height, for any number of points
        bbox = np.array(bbox) / ([w * 1.0, h * 1.0] * (len(bbox) // 2))
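To make the normalization in the reply above concrete, here is a minimal, self-contained sketch. It assumes an IC15-style label line of comma-separated integer coordinates followed by the transcription. The function name parse_polygon_line is hypothetical (it is not part of the repository), and plain Python lists are used instead of NumPy so the sketch runs on its own.

```python
def parse_polygon_line(line, img_w, img_h):
    """Parse 'x1,y1,...,xn,yn,word' into (normalized flat coords, word).

    Hypothetical helper for illustration only. Like the snippet in the
    thread, it treats the last comma-separated field as the transcription,
    so a transcription that itself contains a comma is not handled.
    """
    # Label files may carry a UTF-8 BOM and Windows line endings.
    gt = line.replace('\ufeff', '').rstrip('\r\n').split(',')
    word = gt[-1]
    coords = [int(v) for v in gt[:-1]]
    # A polygon needs at least 3 points and an even number of values.
    if len(coords) < 6 or len(coords) % 2 != 0:
        raise ValueError('expected x,y pairs for at least 3 points')
    # Repeat (width, height) once per point so x is divided by the width
    # and y by the height, whatever the number of points is.
    scale = [float(img_w), float(img_h)] * (len(coords) // 2)
    bbox = [c / s for c, s in zip(coords, scale)]
    return bbox, word
```

The same function accepts a 4-point box (8 values) and a 7-point polygon (14 values), because the scale list is built from the number of coordinates on each line rather than a fixed 4.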
@liuch37 That's really great news for me. Thanks again. Sorry to bother you so much, it's a bit hard for a newcomer, lol
@liuch37 Sometimes I run into problems with reshape() in dataset/ic15.py (line 347). I replaced the code with the Total-Text version from totext.py, but it doesn't work. Sorry to bother you again; if you are busy, just ignore the question, haha. I will try my best to figure it out. Thanks a lot.
I think there may be a problem when the labels have 8 points, 14 points, or some other number of points at the same time.
Yes, you are right. You need to handle a mixed number of points for each polygon. Since we cannot process them in one batch, you can try something like the following:
if bboxes.shape[0] > 0:
    # original fixed 4-point version, kept for reference:
    #bboxes = np.reshape(bboxes * ([img.shape[1], img.shape[0]] * 4),
    #                    (bboxes.shape[0], -1, 2)).astype('int32')
    for i in range(bboxes.shape[0]):
        # scale each polygon back to pixel coordinates on its own, then reshape to (n_points, 2)
        bbox = ((bboxes[i] * ((img.shape[1], img.shape[0]) * (len(bboxes[i]) // 2))).astype('int32')).reshape(-1, 2)
        # fill polygon i with the instance label i + 1
        cv2.drawContours(gt_instance, [bbox], -1, i + 1, -1)
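The per-polygon scaling in the loop above can be sketched in isolation. This is only an illustration under stated assumptions: to_pixel_polygons is a hypothetical name, the polygons are kept in a plain Python list because rows of different lengths cannot be stacked into one rectangular array, and int() stands in for astype('int32') (both truncate, which is equivalent for the non-negative coordinates used here).

```python
def to_pixel_polygons(bboxes, img_w, img_h):
    """Scale normalized flat polygons to pixel (x, y) points, one by one.

    bboxes is a list of flat lists [x1, y1, ..., xn, yn] with values in
    [0, 1]; each polygon may have a different number of points.
    """
    polygons = []
    for flat in bboxes:
        if len(flat) % 2 != 0:
            raise ValueError('each polygon needs an even number of values')
        # Pair up (x, y) and scale x by the width and y by the height.
        points = [
            (int(flat[i] * img_w), int(flat[i + 1] * img_h))
            for i in range(0, len(flat), 2)
        ]
        polygons.append(points)
    return polygons


# Hypothetical input: one 4-point box and one 7-point polygon.
mixed = [
    [0.0, 0.0, 0.5, 0.0, 0.5, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.25, 0.0, 0.5, 0.25, 0.5, 0.75, 0.25, 1.0, 0.0, 1.0, 0.0, 0.5],
]
pixel_polygons = to_pixel_polygons(mixed, img_w=200, img_h=100)
```

Each entry of pixel_polygons can then be converted to an int32 array of shape (n_points, 2) and passed to cv2.drawContours individually, as in the reply.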
@liuch37 Thanks again, I will try it. Thanks a lot for your patience.
@liuch37 I think I just need to match the output shape of the original code, e.g. (xx, 8) -> (xx, 4, 2).
I may have changed the code in the wrong way, lol
I now use a new list to save the bboxes one by one. Because they have different shapes, we cannot stack them; we can only keep them in a list and enumerate it afterwards.
I tried running ic15.py to visualize the outputs. Is it normal that nothing shows up in training_masks? The others seem normal.
Yes. Training masks are usually all 1s, unless a word is masked out as below, in which case the masked text region is set to 0.
if words[i] == '###':
    cv2.drawContours(training_mask, [bboxes[i]], -1, 0, -1)
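The behavior described above (all 1s, with ignored regions set to 0) can be checked on a tiny grid. This is a dependency-free sketch, not the repository's code: build_training_mask and point_in_polygon are hypothetical names, and a simple ray-casting test on pixel centers stands in for cv2.drawContours, so pixels exactly on a polygon edge may be treated differently than OpenCV would treat them.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross it.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def build_training_mask(height, width, polygons, words, ignore_tag='###'):
    """Return a mask of 1s with every ignore-tagged polygon filled with 0."""
    mask = [[1] * width for _ in range(height)]
    for polygon, word in zip(polygons, words):
        if word != ignore_tag:
            continue  # readable words stay at 1 and contribute to the loss
        for row in range(height):
            for col in range(width):
                # Test the pixel center against the polygon.
                if point_in_polygon(col + 0.5, row + 0.5, polygon):
                    mask[row][col] = 0
    return mask


# Hypothetical 4x6 image: one readable word and one '###' region.
polygons = [[(0, 0), (2, 0), (2, 2), (0, 2)], [(3, 1), (6, 1), (6, 3), (3, 3)]]
words = ['hello', '###']
mask = build_training_mask(4, 6, polygons, words)
```

With no '###' entries the mask stays entirely 1, which is why a visualization of training_masks can look empty or uniform even when everything is working.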
@liuch37 Seems like it works, lol. Thanks again for your patience!
Is this a warning that we can ignore? Maybe it means the area is overloaded?
You can check ic15.py: if shrinking a polygon fails, the exception handler prints the polygon's area and perimeter and uses the original polygon as the shrunk bbox. In general it will still work, but if the shrink function fails for most polygons, I guess the performance will not be that good.
except Exception as e:
    print('area:', area, 'peri:', peri)
    shrinked_bboxes.append(bbox)
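The fallback pattern described above can be sketched on its own, with a failure counter added so you can tell whether the fallback is rare or is hitting most polygons. Everything here is an assumption for illustration: shrink_with_fallback, polygon_area, polygon_perimeter, and toy_shrink are hypothetical names, and toy_shrink (which scales points toward the centroid and rejects zero-area input) is not the shrink function used in the repository.

```python
import math


def polygon_area(points):
    """Shoelace formula for the area of a simple polygon."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths, including the closing edge."""
    n = len(points)
    return sum(
        math.hypot(points[(i + 1) % n][0] - points[i][0],
                   points[(i + 1) % n][1] - points[i][1])
        for i in range(n)
    )


def toy_shrink(bbox, rate):
    """Stand-in shrink: scale toward the centroid, fail on zero area."""
    if polygon_area(bbox) == 0:
        raise ValueError('degenerate polygon')
    cx = sum(x for x, _ in bbox) / len(bbox)
    cy = sum(y for _, y in bbox) / len(bbox)
    return [(cx + (x - cx) * rate, cy + (y - cy) * rate) for x, y in bbox]


def shrink_with_fallback(bboxes, shrink_fn, rate):
    """Shrink each polygon; on failure keep the original and count it."""
    shrunk, failures = [], 0
    for bbox in bboxes:
        try:
            shrunk.append(shrink_fn(bbox, rate))
        except Exception:
            # Same diagnostics as in the thread: area and perimeter.
            print('area:', polygon_area(bbox), 'peri:', polygon_perimeter(bbox))
            shrunk.append(bbox)
            failures += 1
    return shrunk, failures
```

If failures is close to len(bboxes), the printed messages are more than an ignorable warning, since the kernels being trained on are then mostly unshrunk polygons.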
@liuch37 Thanks again! Sorry to bother you so much. You are so nice!
@liuch37 Sorry to bother you again! I found that you used ResNet18 as PANnet's backbone. If I want better results, should I change it to ResNet101?
Yes, that is certainly one way.
@liuch37 Thanks. Maybe I should rewrite the input channels: they fit ResNet18 but not ResNet101. Thanks again!
Set neck_channel = (256, 512, 1024, 2048) in your train.py. This setting works for both ResNet50 and ResNet101.
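One way to avoid a channel mismatch when switching backbones is to look the tuple up by backbone name. This is a sketch under assumptions: NECK_CHANNELS and neck_channel_for are hypothetical names, not part of the repository. The ResNet50/101 values come from the reply above; the ResNet18/34 values are the standard stage output widths of basic-block ResNets, assuming the backbone feeds all four stages to the neck.

```python
# Output channels of the four ResNet stages, keyed by backbone name.
# Basic-block variants (18, 34) end at 512 channels; bottleneck variants
# (50, 101) expand each stage by 4x and end at 2048 channels.
NECK_CHANNELS = {
    'resnet18': (64, 128, 256, 512),
    'resnet34': (64, 128, 256, 512),
    'resnet50': (256, 512, 1024, 2048),
    'resnet101': (256, 512, 1024, 2048),
}


def neck_channel_for(backbone):
    """Return the neck input channels for a backbone name (hypothetical helper)."""
    try:
        return NECK_CHANNELS[backbone.lower()]
    except KeyError:
        raise ValueError('unknown backbone: %r' % (backbone,)) from None


neck_channel = neck_channel_for('resnet101')
```

Keeping the backbone name and neck_channel in sync this way means a later backbone change cannot silently leave the ResNet18 channel sizes in place.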
@liuch37 Thank you so much!
@liuch37 It's me again, haha. Sorry to bother you. I have run into some problems recently: I tried many tricks on both the detection part and the recognition part, but none of them worked. I think the detection part is good enough, because I can see that the bounding boxes look right. So I wonder if you can recommend some good text recognition models. Thank you so much for your help recently.
You can try the popular CRNN (https://github.com/meijieru/crnn.pytorch).
@liuch37 Thanks, I will give it a try.
@liuch37 Hi, have you seen the model called "PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text"? I think it would help if I used the detection head from PAN++ to replace the one in your original code, but I am not sure whether my idea is going in the right direction. I hope you can give me some advice. Thanks a lot.
@liuch37 But I think PAN++ just adds a recognition head without changing anything in the detection part.
Yes you are right. You can either use their official github or try to integrate their recognition head into this repo.
@liuch37 Haha, thank you so much!
@liuch37 Hi, it's me again, haha. Lately I have had some ideas, but I don't know whether they will work, so I want to ask you for some advice if you are not busy. I read about a kind of network called a GAN, which is said to be able to create new data based on the original data, so that we have more data to train our network. Have you heard of something like that?
Yes, using GANs to generate synthetic data has been shown to be a promising approach. I previously attended a talk by Ian Goodfellow, the inventor of GANs, who is working at Apple now. He mentioned that their team can substantially boost computer vision task performance by using this kind of GAN-generated synthetic data. But before you go down this route, I would recommend synthesizing from real images (or something close to your target dataset distribution) for your OCR task, if possible.
@liuch37 Thank you for giving me such good advice. I think you are my best teacher on visual tasks. Thank you for your recent advice. Hope you have a nice day.