Comments (38)

liuch37 avatar liuch37 commented on June 29, 2024

Yes, please check the draw_result function in inference.py.

from pan-pytorch.

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 Thanks for replying. I checked the draw_result function and found that it only supports drawing the detection boxes on the image, but I still can't see what the text actually says. I thought it was an end-to-end model.

liuch37 avatar liuch37 commented on June 29, 2024

No, PAN is only for text detection; it does not include recognition.

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 Got it, thanks a lot for replying!

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 If I want to do recognition as well (a second stage is acceptable too), could you recommend a model? lol

liuch37 avatar liuch37 commented on June 29, 2024

You can check my second-stage text recognition repository: https://github.com/liuch37/sar-pytorch. Or you can check this end-to-end detection + recognition model: https://github.com/Yuliang-Liu/bezier_curve_text_spotting.

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 Your reply is a great help to me. Thanks a lot for your patience!

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 Hi! Sorry to bother you again. I ran into a problem when using PAN to train on my own dataset. I converted my data to the IC15 format, but not all of my labels are rectangles; some of them are arbitrary polygons. I wonder how to train with those labels, and whether PAN is suitable for arbitrary polygons. Thanks again!

liuch37 avatar liuch37 commented on June 29, 2024

Yes, PAN supports arbitrary-polygon training. You can modify the data generator in dataset/ic15.py, for example something like:

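def get_ann(img, gt_path):
    ...
    for line in lines:
        ...
        # The last field of the annotation line is the transcription.
        word = gt[-1].replace('\r', '').replace('\n', '')
        ...
        # Every other field is a coordinate, however many points the polygon has.
        bbox = [int(gt[i]) for i in range(len(gt) - 1)]
        # Normalize each (x, y) pair by the image width and height.
        bbox = np.array(bbox) / ([w * 1.0, h * 1.0] * (len(bbox) // 2))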
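
For readers who want to try the parsing step in isolation, here is a minimal standalone sketch in plain Python. It assumes each annotation line has the layout x1,y1,...,xn,yn,transcription; the function name parse_polygon_line and its arguments are illustrative and do not come from the repository.

```python
def parse_polygon_line(line, width, height):
    """Split one IC15-style annotation line into (normalized polygon, word).

    Assumes the layout x1,y1,...,xn,yn,transcription with any even number of
    coordinates; the function name and arguments are illustrative only.
    """
    # Drop a UTF-8 byte-order mark and line endings before splitting.
    fields = line.replace('\ufeff', '').strip().split(',')
    word = fields[-1]
    coords = [int(v) for v in fields[:-1]]
    if len(coords) < 6 or len(coords) % 2 != 0:
        raise ValueError('expected at least 3 (x, y) pairs, got %r' % (coords,))
    # Divide x values by the width and y values by the height.
    scale = [float(width), float(height)] * (len(coords) // 2)
    polygon = [c / s for c, s in zip(coords, scale)]
    return polygon, word


# A 4-point box and a 6-point polygon go through the same code path.
quad, word = parse_polygon_line('0,0,100,0,100,50,0,50,hello', 200, 100)
hexagon, masked = parse_polygon_line(
    '10,10,50,5,90,10,90,40,50,45,10,40,###', 200, 100)
```

Like the snippet above, this takes only the last comma-separated field as the transcription, so a word that itself contains a comma would need extra handling.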

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 That's really great news for me. Thanks again. Sorry to bother you so much; it's a bit hard for a newcomer, lol.

summer-1010 avatar summer-1010 commented on June 29, 2024

image

@liuch37 Sometimes I run into problems with reshape() in dataset/ic15.py (line 347). I replaced the code with the Total-Text version from totext.py, but it doesn't work. Sorry to bother you again; if you are busy, just ignore the question, haha. I will try my best to figure it out. Thanks a lot.

summer-1010 avatar summer-1010 commented on June 29, 2024

I think there may be problems when the labels have 8 points, 14 points, or some other number of points at the same time.

liuch37 avatar liuch37 commented on June 29, 2024

Yes, you are right. You need to handle a mixed number of points per polygon. Since they cannot be processed in one batch, you can try something like the following:

        if bboxes.shape[0] > 0:
            #bboxes = np.reshape(bboxes * ([img.shape[1], img.shape[0]] * 4),
            #                    (bboxes.shape[0], -1, 2)).astype('int32')
            for i in range(bboxes.shape[0]):
                bbox = ((bboxes[i] * ((img.shape[1], img.shape[0]) * (len(bboxes[i]) // 2))).astype('int32')).reshape(-1, 2)
                cv2.drawContours(gt_instance, [bbox], -1, i + 1, -1)
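
The per-polygon idea can also be sketched without NumPy or OpenCV. The version below assumes each polygon is a flat list of normalized x, y values; the name to_pixel_polygons is hypothetical. Each resulting point list could then be converted with np.array(points, dtype='int32') before being passed to cv2.drawContours.

```python
def to_pixel_polygons(norm_polygons, width, height):
    """Rescale normalized flat polygons to integer (x, y) pixel points.

    Each polygon is handled on its own, so a 4-point box and a 5-point
    polygon can sit in the same list. Names here are illustrative only.
    """
    pixel_polygons = []
    for flat in norm_polygons:
        if len(flat) % 2 != 0:
            raise ValueError('polygon needs an even number of values')
        # Pair up x and y, then scale x by width and y by height.
        points = [(int(x * width), int(y * height))
                  for x, y in zip(flat[0::2], flat[1::2])]
        pixel_polygons.append(points)
    return pixel_polygons


mixed = [
    [0.0, 0.0, 0.5, 0.0, 0.5, 0.5, 0.0, 0.5],             # 4 points
    [0.1, 0.1, 0.3, 0.05, 0.5, 0.1, 0.5, 0.4, 0.1, 0.4],  # 5 points
]
pixels = to_pixel_polygons(mixed, width=200, height=100)
```

Keeping the polygons in a plain list sidesteps the stacking problem, because nothing requires the entries to share a shape.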

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 Thanks again, I will try it. Thanks a lot for your patience.

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 I think it just needs to match the output shape of the original code, e.g. (xx, 8) -> (xx, 4, 2).

summer-1010 avatar summer-1010 commented on June 29, 2024

image
I may have changed the code the wrong way, lol.

summer-1010 avatar summer-1010 commented on June 29, 2024

image
I use a new list to save the bboxes one by one. Because they have different shapes, we cannot stack them; we can only save them in a list and enumerate it afterwards.

summer-1010 avatar summer-1010 commented on June 29, 2024

I tried running ic15.py to visualize the data. Is it normal that nothing shows up in training_masks? The others seem normal.

liuch37 avatar liuch37 commented on June 29, 2024

Yes. Training masks are usually all 1s, unless a word is masked out as shown below; the masked text region is then set to 0.

if words[i] == '###':
    cv2.drawContours(training_mask, [bboxes[i]], -1, 0, -1)
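
To see why a training mask is mostly 1s, here is a small pure-Python sketch. It swaps cv2.drawContours for a simple even-odd point-in-polygon fill, so pixels on the boundary may differ from OpenCV's result; the function names are illustrative, and '###' is taken from the snippet above as the ignore tag.

```python
def point_in_polygon(px, py, polygon):
    """Even-odd ray casting test for a point against a list of (x, y) vertices."""
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Toggle when a horizontal ray from the point crosses edge (j, i).
        if (yi > py) != (yj > py):
            x_cross = xi + (py - yi) * (xj - xi) / (yj - yi)
            if px < x_cross:
                inside = not inside
        j = i
    return inside


def build_training_mask(height, width, polygons, words, ignore_tag='###'):
    """Return a mask of 1s with ignored text regions set to 0."""
    mask = [[1] * width for _ in range(height)]
    for polygon, word in zip(polygons, words):
        if word != ignore_tag:
            continue
        for y in range(height):
            for x in range(width):
                # Test the pixel centre so edges behave predictably.
                if point_in_polygon(x + 0.5, y + 0.5, polygon):
                    mask[y][x] = 0
    return mask


polygons = [[(0, 0), (3, 0), (3, 2), (0, 2)], [(4, 4), (6, 4), (6, 6), (4, 6)]]
words = ['###', 'hello']
mask = build_training_mask(6, 6, polygons, words)
```

Only the region labelled '###' is zeroed; the region labelled 'hello' stays at 1, which is why a dataset without ignored words yields a uniform mask.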

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 Seems like it works, lol. Thanks again for your patience!

summer-1010 avatar summer-1010 commented on June 29, 2024

image
Is this a warning that we can ignore? Maybe it means the area is overloaded?

liuch37 avatar liuch37 commented on June 29, 2024

You can check ic15.py: if a polygon shrink fails, the exception handler prints the polygon's area and perimeter and uses the original polygon as the shrunk bbox. In general it will still work, but if the shrink function fails for most polygons, I guess the performance will not be that good.

except Exception as e:
    print('area:', area, 'peri:', peri)
    shrinked_bboxes.append(bbox)
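
The fallback pattern, plus a failure counter for checking how often it triggers, can be sketched as follows. This is a sketch under assumptions: offset_polygon is a hypothetical callable standing in for the real clipping routine, the offset rule area * (1 - rate^2) / perimeter is assumed rather than taken from this thread, and inset_box is a toy that only handles axis-aligned boxes.

```python
import math


def polygon_area(points):
    """Absolute area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of edge lengths, closing the polygon back to its first vertex."""
    return sum(math.hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]))


def shrink_all(polygons, offset_polygon, rate=0.5):
    """Shrink each polygon, keeping the original whenever shrinking fails.

    offset_polygon(points, distance) is a hypothetical callable standing in
    for whatever clipping routine is actually used.
    """
    shrunk, failed = [], 0
    for points in polygons:
        area = polygon_area(points)
        peri = polygon_perimeter(points)
        try:
            # Assumed offset rule: area * (1 - rate^2) / perimeter.
            distance = area * (1 - rate * rate) / peri
            shrunk.append(offset_polygon(points, distance))
        except Exception:
            print('area:', area, 'peri:', peri)
            shrunk.append(points)
            failed += 1
    return shrunk, failed


def inset_box(points, distance):
    """Toy offset for axis-aligned boxes only; raises if the box collapses."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x0, x1 = min(xs) + distance, max(xs) - distance
    y0, y1 = min(ys) + distance, max(ys) - distance
    if x0 >= x1 or y0 >= y1:
        raise ValueError('box collapsed')
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]


boxes = [[(0, 0), (8, 0), (8, 4), (0, 4)], [(0, 0), (0, 0), (0, 0), (0, 0)]]
shrunk, failed = shrink_all(boxes, inset_box, rate=0.5)
```

Comparing failed with len(boxes) gives a quick sense of whether the printed messages are rare exceptions or the common case.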

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 Thanks again! Sorry to bother you so much! You are so nice!

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 Sorry to bother you again! I found that you used ResNet18 as PAN's backbone. If I want a better result, should I change it to ResNet101?

liuch37 avatar liuch37 commented on June 29, 2024

Yes, that is certainly one way.

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 Thanks. Maybe I should rewrite the input channels; they fit ResNet18 but not ResNet101. Thanks again!

liuch37 avatar liuch37 commented on June 29, 2024

Set neck_channel = (256, 512, 1024, 2048) in your train.py. This setting works for both ResNet50 and ResNet101.
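
The reason the tuple changes is that ResNet18 and ResNet34 use basic blocks, whose four stages output 64, 128, 256 and 512 channels, while ResNet50 and ResNet101 use bottleneck blocks with a 4x expansion. A small lookup makes the choice explicit; the dictionary and helper names below are illustrative and not part of the repository.

```python
# Output channels of the four residual stages for standard ResNet variants.
# The dictionary and helper names are illustrative, not part of the repository.
NECK_CHANNELS = {
    'resnet18': (64, 128, 256, 512),
    'resnet34': (64, 128, 256, 512),
    'resnet50': (256, 512, 1024, 2048),
    'resnet101': (256, 512, 1024, 2048),
}


def neck_channel_for(backbone):
    """Look up the per-stage channel widths the neck should expect."""
    try:
        return NECK_CHANNELS[backbone]
    except KeyError:
        raise ValueError('unknown backbone: %s' % backbone)


neck_channel = neck_channel_for('resnet101')
```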

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 Thank you so much!

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 It's me again, haha. Sorry to bother you. I have run into some problems recently: I tried many tricks on both the detection part and the recognition part, but none of them worked. I think the detection part is good enough, because I can see that the bounding boxes work nicely. So I wonder if you can recommend some good text recognition models. Thank you so much for your help recently.

liuch37 avatar liuch37 commented on June 29, 2024

You can try the popular CRNN (https://github.com/meijieru/crnn.pytorch).
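
In a two-stage setup, each detected region is usually cropped out and passed to the recognizer on its own. A minimal sketch of that cropping step is below, assuming the image is a list of pixel rows and the polygon is a list of (x, y) pixel points; both conventions and the function name are assumptions for illustration, and no particular recognizer's input format is implied.

```python
def crop_polygon_region(image, polygon):
    """Crop the axis-aligned bounding rectangle of a polygon from an image.

    image is a list of rows (any pixel type); polygon is a list of (x, y)
    pixel points. Both conventions are assumptions made for this sketch.
    """
    height, width = len(image), len(image[0])
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    # Clamp to the image so boxes touching the border stay valid.
    x0, x1 = max(min(xs), 0), min(max(xs) + 1, width)
    y0, y1 = max(min(ys), 0), min(max(ys) + 1, height)
    return [row[x0:x1] for row in image[y0:y1]]


# A 4 x 6 test image whose pixel value encodes its row and column.
image = [[10 * r + c for c in range(6)] for r in range(4)]
crop = crop_polygon_region(image, [(1, 1), (3, 0), (4, 2), (2, 3)])
```

For strongly curved or rotated text, a rectangular crop includes background, so a rectification step before recognition may be worth considering.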

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 Thanks, I will give it a try.

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 Hi, have you seen the model called "PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text"? I think it would help to replace the detection head in your original code with the one from PAN++, but I am not sure whether my idea is headed in the right direction. I hope you can give me some advice. Thanks a lot.

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 But I think PAN++ just adds a recognition head without changing anything in the detection part.

liuch37 avatar liuch37 commented on June 29, 2024

Yes, you are right. You can either use their official GitHub repository or try to integrate their recognition head into this repo.

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 Haha, thank you so much!

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 Hi, it's me again, haha. Lately I have had some ideas, but I don't know whether they will work, so I want to ask you for some advice if you are not busy. I saw a kind of network called a GAN; it is said that we can use it to create new data based on the original data, so that we have more data to train our network. Have you heard of something like that?

liuch37 avatar liuch37 commented on June 29, 2024

Yes, using GANs to generate synthetic data has proven to be a promising approach. I previously attended a talk by Ian Goodfellow, the inventor of GANs, who is working at Apple now. He mentioned that their team can largely boost computer vision task performance by using this kind of GAN-generated synthetic data. But before you go down this route, I would recommend synthesizing from real images (or something close to your target dataset distribution) for your OCR task, if possible.

summer-1010 avatar summer-1010 commented on June 29, 2024

@liuch37 Thank you for giving me such good advice. I think you are my best teacher on visual tasks. Thank you for your recent advice. Hope you have a nice day.
