vibertgrid-pytorch's People

Contributors

dependabot[bot], zeninglin

vibertgrid-pytorch's Issues

Validation in CRF mode

Hi, thanks for your effort.

I noticed a problem in pipeline.train_val_utils.validate when running in CRF classifier mode: the inference function in crf.py returns the predicted target sequence (which is actually the predicted class ids), not per-class probabilities. The pipeline.train_val_utils.validate function then obtains predicted class ids by running torch.argmax. However, as I said, we do not have probabilities at that point; we already have the predicted class ids.

I think we can solve this by adding an is_crf argument to the pipeline.train_val_utils.validate function and an if block that decides whether to apply the argmax.
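To illustrate, here is a plain-Python sketch of the control flow I have in mind; the is_crf flag and the list-based argmax are illustrative only, not the repo's actual code:

```python
def get_pred_ids(pred, is_crf):
    """Return predicted class ids for validation."""
    if is_crf:
        # CRF decode already yields class ids, so no argmax is needed
        return pred
    # non-CRF mode: pred holds per-class scores, take the per-token argmax
    return [max(range(len(scores)), key=scores.__getitem__) for scores in pred]
```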

Also, unrelated, but there is a typo here.

Thanks, sincerely.

I need help customizing entities of the SROIE dataset

Hello, and thank you for your support in advance.

I would like to expand the SROIE entities using my own dataset. Is that possible? For example, I would like to change the following array

SROIE_CLASS_LIST = ["others", "company", "date", "address", "total"]

to something like

SROIE_CLASS_LIST = ["others", "company", "date", "time", "address", "total", "tax", "sub_total"]

About SROIE annotations

Hi,

You mentioned that you annotated the SROIE dataset yourself so that it can be used effectively with ViBERTgrid. While annotating, how did you handle tokens that occur multiple times? For example, for the date label, there are receipts in which the same date appears several times. Did you annotate all occurrences as date, or only one? Thanks.

Model Training.

Sir, I was trying to run your code and found that weights were not being saved. I saw that you used a very high threshold of F1_score = 0.95. Can you please explain the reason for keeping such a high value? During our testing on the SROIE dataset, the maximum F1 we got during validation was 0.4328.
Also, since the maximum number of epochs was 33, why did you use the condition (epoch % 400 == 0) in train_SROIE.py, line 364?
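For what it is worth, a save rule keyed to improvement rather than a fixed threshold would avoid both problems; a minimal sketch, not the repo's code:

```python
def make_checkpointer():
    """Track the best validation F1 seen so far."""
    best = {"f1": 0.0}

    def maybe_save(f1):
        # save only when validation F1 improves on the best seen so far
        if f1 > best["f1"]:
            best["f1"] = f1
            return True
        return False

    return maybe_save
```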
Thanks

No predictions in inference.

I trained on the CORD dataset as per the "example.yaml" file. The F1 scores seem excellent (with the CRF network).
But when I tried to create predictions, the model did not predict anything.
Can you provide an example of an OCR API? Currently I am using a custom PaddleOCR Flask server to get the OCR results, and I convert the outputs to the required format you mentioned in the script.

If possible, please share the OCR script, or the exact format the module needs.
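For reference, here is a rough sketch of the conversion I am doing, assuming a PaddleOCR-style result list of (quad, (text, score)) pairs and the comma-separated box format shown in other issues here; the field order is my assumption, not confirmed by the repo:

```python
def paddle_to_box_lines(ocr_results):
    """Flatten (quad, (text, score)) pairs into x1,y1,...,x4,y4,text lines."""
    lines = []
    for quad, (text, _score) in ocr_results:
        # flatten the four corner points into eight integer coordinates
        coords = ",".join(str(int(round(v))) for pt in quad for v in pt)
        lines.append(f"{coords},{text}")
    return lines
```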

FUNSD dataset - empty key_dict

Hi,

first of all thanks for your great effort here!

I am currently struggling to start training on the FUNSD dataset. During evaluation, I get the following error message:
Traceback (most recent call last):
  File "/home/rrutmann/pycharm_projects/ViGPTgrid/train_FUNSD.py", line 422, in <module>
    train(args)
  File "/home/rrutmann/pycharm_projects/ViGPTgrid/train_FUNSD.py", line 306, in train
    F1 = validate(
  File "/home/rrutmann/miniconda3/envs/vigptgrid/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/rrutmann/pycharm_projects/ViGPTgrid/pipeline/train_val_utils.py", line 506, in validate
    curr_gt_str = key_dict[0][curr_class_name]
TypeError: 'NoneType' object is not subscriptable

Indeed, key_dict is None. In the EPHOIE_DATASET class the variable full_key_dict is defined, which I assume is later used as key_dict. But this part is missing in the FUNSD_DATASET class.

Could you please tell me how to handle this error?
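A guard like the following might avoid the crash in the meantime; this is a sketch based only on the names in the traceback, not an actual fix in the repo:

```python
def lookup_gt_str(key_dict, curr_class_name):
    """Look up the ground-truth string for a class, tolerating a missing key_dict."""
    if key_dict is None or key_dict[0] is None:
        # dataset provides no key_dict (as FUNSD currently does): skip lookup
        return None
    return key_dict[0].get(curr_class_name)
```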

SROIE dataset issues.

I am trying to reproduce the original paper's results on the SROIE dataset and have some doubts:

  • Please share your method for re-labeling the dataset with coordinates, or your relabeled dataset.
  • Please also describe the input format of the train and test labels (file type, order).
  • In pipeline\sroie_data_preprocessing.py, what are dir_test_root and dir_processed? If I have to run the code on my device, which one should I replace with the path to my data?

Please help with these issues.

Configure sroie_data_preprocessing.py to expand CLASS_LIST

Hello again. I need help expanding CLASS_LIST. Thank you for your support in advance.

I have configured
SROIE_CLASS_LIST = ["others", "company", "address", "document_number", "date_time", "total", "tax"]

A sample box file is as follows:

182,70,435,70,435,110,182,110,BURGER KING,company
97,112,512,112,512,155,97,155,EKUR İNŞAAT SANAYİ VE TİCARET A.Ş.,other
42,152,570,152,570,200,42,200,MEVLANA MH.Ç.MEHMET CD. NO:33/A MARMARAPARK,address
70,194,544,194,544,242,70,242,AVM. 3F02 ESENLER/İST. TİC. SİC. NO:300241,address
95,238,522,242,522,291,94,287,BOĞAZİÇİ KURUMLAR V.D.330 005 3911,other
44,312,177,312,177,360,44,360,13/05/2024,date_time
390,315,570,315,570,362,390,362,FİŞ NO: 000132,document_number
47,360,192,360,192,407,47,407,SAAT: 21:17,date_time
60,435,265,435,265,482,60,482,1 2TB+K.IC+K.PAT,other
307,440,350,440,350,482,307,482,%10,other
447,437,542,437,542,482,447,482,*119,99,other
87,482,242,482,242,527,87,527,2 TAVUKBRGER,other
472,485,540,485,540,527,472,527,*0,00,other
307,487,350,487,350,527,307,527,%10,other
115,530,347,530,347,575,115,575,1 + Peynir Ekle %10,other
460,530,537,530,537,572,460,572,*10,00,other
470,575,535,575,535,617,470,617,*0,00,other
142,577,347,577,347,620,142,620,+ DomatesEkle %10,other
120,618,278,623,277,667,118,662,1 + TursuEkle,other
305,620,347,620,347,662,305,662,%10,other
467,620,532,620,532,660,467,660,*0,00,other
467,662,530,662,530,702,467,702,*0,00,other
120,665,347,665,347,707,120,707,1 + Sogan Ekle %10,other
97,705,245,705,245,747,97,747,1 KUCUKAYRAN,other
465,705,530,705,530,745,465,745,*0,00,other
305,707,347,707,347,745,305,745,%10,other
465,745,527,745,527,787,465,787,*9,00,other
97,747,232,747,232,790,97,790,1 O.PATATES,other
305,750,345,750,345,787,305,787,%10,other
462,787,527,787,527,827,462,827,*0,00,other
100,790,200,790,200,832,100,832,1 KETCAP,other
305,790,345,790,345,827,305,827,%10,other
462,827,525,827,525,867,462,867,*0,00,other
100,832,210,832,210,872,100,872,1 MAYONEZ,other
305,832,345,832,345,870,305,870,%10,other
102,870,245,870,245,912,102,912,1 ISTENMIYOR,other
460,870,522,870,522,910,460,910,*0,00,other
305,872,345,872,345,910,305,910,%10,other
460,910,522,910,522,947,460,947,*0,00,other
102,912,245,912,245,952,102,952,1 ISTENMIYOR,other
302,912,345,912,345,950,302,950,%10,other
447,987,520,987,520,1027,447,1027,*12,64,tax
72,990,210,990,210,1030,72,1030,TOPKDV,other
435,1025,517,1025,517,1065,435,1065,*138,99,total
75,1027,209,1027,209,1070,75,1070,TOPLAM,other
432,1102,517,1102,517,1142,432,1142,*138,99,other
75,1107,137,1107,137,1145,75,1145,NAKİT,other
75,1145,290,1145,290,1182,75,1182,POS:3 RefNum:30122,other
119,1204,487,1200,487,1242,120,1246,Sipariş Numarası:,other
250,1242,342,1242,342,1282,250,1282,3122,other
74,1307,234,1304,235,1346,75,1349,Kasiyer: 25620,other
74,1352,537,1347,537,1379,75,1385,*********************************************************************************************************************************,other
77,1387,504,1385,505,1427,77,1430,Asagidaki web sitesinde anket doldurun.,other
75,1427,375,1427,375,1467,75,1467,King boy secim bedava alin.,other
77,1470,367,1470,367,1507,77,1507,www.burgerkingdeneyimi.com,other
77,1508,334,1505,335,1547,77,1550,Sifre: 2182851100391240,other
77,1550,245,1550,245,1590,77,1590,Doğrulama Kodu:,other
77,1589,477,1589,477,1629,77,1629,Sifre ve dogrulama kodu alindigindan,other
79,1631,504,1626,505,1668,80,1673,itibaren 15 gun icinde kullanilmalidir,other
77,1672,477,1668,477,1709,77,1712,Sartlar ve icerik web sayfasindadir.,other
79,1717,544,1709,545,1739,80,1747,*********************************************************************************************************************************,other
80,1770,272,1770,272,1807,80,1807,KASİYER:KASİYER 2,other
417,1820,542,1820,542,1852,417,1852,EKÜ NO:0003,other
84,1826,209,1823,210,1857,85,1860,Z NO:001532,other
209,1879,389,1879,389,1904,209,1904,NF 3E 20040058,other

A sample key file is as follows:

{
    "company": "BURGER KING",
    "address": "MEVLANA MH.Ç.MEHMET CD. NO:33/A MARMARAPARK AVM. 3F02 ESENLER/İST. TİC. SİC. NO:300241",
    "document_number": "FİŞ NO: 000132",
    "date_time": "13/05/2024 SAAT: 21:17",
    "total": "*138,99",
    "tax": "*12,64"
}

Given the data above, how can I modify the following code?

I also do not want to use a regex to match fixed data patterns; I would like to match on the raw text instead.

    total_float = re.search(r"([-+]?[0-9]*\.?[0-9]+)", key_info["total"])
    for index, row in gt_dataframe.iterrows():
        # default value
        gt_dataframe.loc[index, "pos_neg"] = 2

        # retrieve 'company' in gt_dataframe
        if (
            cosine_simularity(
                count_vectorizer[0].reshape(1, -1),
                count_vectorizer[index + len(data_classes)].reshape(1, -1),
            )
            > cosine_sim_treshold
        ):
            gt_dataframe.loc[index, "data_class"] = 1
            gt_dataframe.loc[index, "pos_neg"] = 1

        # retrieve 'address' in gt_dataframe
        if (
            cosine_simularity(
                count_vectorizer[2].reshape(1, -1),
                count_vectorizer[index + len(data_classes)].reshape(1, -1),
            )
            > cosine_sim_treshold
        ):
            gt_dataframe.loc[index, "data_class"] = 3
            gt_dataframe.loc[index, "pos_neg"] = 1

        # retrieve 'date' in gt_dataframe
        tab_date = re.findall(
            date_regex,
            row["text"],
        )
        for date in tab_date:
            if date[0] == key_info["date"]:
                gt_dataframe.loc[index, "data_class"] = 2
                gt_dataframe.loc[index, "pos_neg"] = 1

        # retrieve 'total' in gt_dataframe
        tab_floats = re.findall(r"([-+]?[0-9]*\.?[0-9]+)", row["text"])
        if total_float:
            for float_ in tab_floats:
                if float(total_float.group(0)) == float(float_):
                    gt_dataframe.loc[index, "data_class"] = 4
                    gt_dataframe.loc[index, "pos_neg"] = 1

    return gt_dataframe, image_shape
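For example, raw-text matching against the key file could replace the regex-based retrieval; the class ids below follow my expanded SROIE_CLASS_LIST and the key names from my key file, so they are assumptions, not the repo's actual mapping:

```python
EXPANDED_CLASS_LIST = ["others", "company", "address", "document_number",
                       "date_time", "total", "tax"]

def label_rows(rows, key_info):
    """rows: list of dicts with a 'text' field; key_info: the key file dict.
    Returns one class id per row, 0 ("others") by default."""
    labels = []
    for row in rows:
        label = 0
        for class_id, class_name in enumerate(EXPANDED_CLASS_LIST[1:], start=1):
            gt_str = key_info.get(class_name, "")
            # raw-text match: the row text must appear inside the ground-truth
            # string (handles multi-line fields such as address and date_time)
            if gt_str and row["text"] in gt_str:
                label = class_id
                break
        labels.append(label)
    return labels
```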

Training on custom dataset

Hi, thanks for your work and contribution. I have studied Chargrid and especially BERTgrid, and now I am trying to understand the ViBERTgrid paper. Could you explain how I can use your implementation on my own custom dataset? Are the train scripts generalizable to custom datasets?
Thanks.
