anisha2102 / docvqa

Document Visual Question Answering

License: MIT License

Python 93.13% Jupyter Notebook 6.87%

computer-vision deep-learning document-analysis visual-question-answering

docvqa's Issues

How to run it on the test sample image?

Using document chunks with no answer for training

Hi!

In the utils_docvqa.py script, the convert_examples_to_features function generates the features for training. I see you use a sliding-window approach to split documents that are longer than the maximum sequence length, so you end up with document chunks that contain no answer.

Why do you include those doc spans with no answer in the final feature set?
At utils_docvqa.py lines 308-327, the comment states that you throw those chunks out:

      if is_training and not example.is_impossible:
        # For training, if our document chunk does not contain an annotation
        # we throw it out, since there is nothing to predict.

but then you just set the start and end positions to 0 and include the feature in the final feature set. Is there a reason for that?
Why not just skip those examples and leave them out of the feature set?
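
To make the question concrete, here is a minimal sketch of the two options. It is not the code from utils_docvqa.py: the helper names (make_doc_spans, label_span, skip_unanswerable) are made up for illustration, and the offset that the real features add for the question tokens is left out.

```python
from collections import namedtuple

# Hypothetical helper type: a window over the document tokens.
DocSpan = namedtuple("DocSpan", ["start", "length"])


def make_doc_spans(num_tokens, max_len, stride):
    """Cut a document of num_tokens tokens into overlapping windows."""
    spans = []
    start = 0
    while start < num_tokens:
        length = min(max_len, num_tokens - start)
        spans.append(DocSpan(start, length))
        if start + length == num_tokens:
            break
        start += min(length, stride)
    return spans


def label_span(span, answer_start, answer_end, skip_unanswerable=False):
    """Return answer positions relative to the span.

    If the answer is not fully inside the span, either return (0, 0)
    (keep the chunk, pointing at position 0) or None (drop the chunk).
    """
    span_end = span.start + span.length - 1
    if answer_start >= span.start and answer_end <= span_end:
        return (answer_start - span.start, answer_end - span.start)
    if skip_unanswerable:
        return None
    return (0, 0)


spans = make_doc_spans(num_tokens=10, max_len=4, stride=2)
kept = [label_span(s, 5, 6) for s in spans]
skipped = [label_span(s, 5, 6, skip_unanswerable=True) for s in spans]
```

With the first option every window becomes a training feature and the answerless ones get positions (0, 0); with the second option those windows would simply be filtered out.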

Thanks for the answer!

Question in blogpost

Hi, thank you for open-sourcing this great code.
I have a question about your blog post.

As you mention in the blog post, before fine-tuning the language model (LayoutLM) on the DocVQA dataset, you pre-trained LayoutLM on the SQuAD dataset.

In that training phase, how did you set up the 2D positional encoding? As far as I know, there is no 2D positional information in the SQuAD dataset.
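
For illustration only, one way text-only data could be fed to a layout-aware model is to give every token the same placeholder box. Whether the author did this is not stated in the blog post; the function name placeholder_boxes and the all-zero box are assumptions. LayoutLM takes box coordinates as integers in the range 0 to 1000, so an all-zero box is at least a valid input.

```python
def placeholder_boxes(num_tokens, box=(0, 0, 0, 0)):
    """Build one dummy [x0, y0, x1, y1] box per token (assumed workaround)."""
    return [list(box) for _ in range(num_tokens)]


# A SQuAD passage tokenized into 5 tokens would get 5 identical boxes.
boxes = placeholder_boxes(5)
```

If this guess is right, the model would learn nothing about layout during the SQuAD stage and would only start using the 2D embeddings once real DocVQA boxes are supplied.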

Thank you :)

Details of install requirements

Hi,
Thank you for open-sourcing your code.
I would like to know the exact versions of the packages in the install requirements.
I am running into version conflicts between some of the modules.

Could you share the output of 'conda list' from your environment?
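
If 'conda list' is not convenient, a short script like the one below would also do. It is only a sketch: the function name report_versions is made up, and the package names passed to it are guesses at what the repository depends on, not a list taken from it.

```python
from importlib import metadata


def report_versions(package_names):
    """Map each package name to its installed version, or 'not installed'."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


# Package names below are assumptions for illustration.
for name, version in report_versions(["torch", "transformers"]).items():
    print(f"{name}=={version}")
```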

Thank you.

AttributeError: 'DocvqaExample' object has no attribute 'answers'

I encountered an error.

Traceback (most recent call last):
  File "run_docvqa.py", line 850, in <module>
    main()
  File "run_docvqa.py", line 744, in main
    args, train_dataset, model, tokenizer, labels, pad_token_label_id
  File "run_docvqa.py", line 251, in train
    mode="dev",
  File "run_docvqa.py", line 366, in evaluate
    results = squad_evaluate(examples, predictions)
  File "my_path/squad_metrics.py", line 212, in squad_evaluate
    qas_id_to_has_answer = {example.qas_id: bool(example.answers) for example in examples}
  File "my_path/squad_metrics.py", line 212, in <dictcomp>
    qas_id_to_has_answer = {example.qas_id: bool(example.answers) for example in examples}
AttributeError: 'DocvqaExample' object has no attribute 'answers'

class DocvqaExample(object):
    """A single training/test example for token classification."""

    def __init__(self,
                 qas_id,
                 question_text,
                 doc_tokens,
                 orig_answer_text=None,
                 start_position=None,
                 end_position=None,
                 is_impossible=False,
                 boxes=[]):
        self.qas_id = qas_id
        self.question_text = question_text
        self.doc_tokens = doc_tokens
        self.orig_answer_text = orig_answer_text
        self.start_position = start_position
        self.end_position = end_position
        self.is_impossible = is_impossible
        self.boxes = boxes

I checked that class DocvqaExample has no attribute 'answers'.

def squad_evaluate(examples, preds, no_answer_probs=None, no_answer_probability_threshold=1.0):
    qas_id_to_has_answer = {example.qas_id: bool(example.answers) for example in examples}
    has_answer_qids = [qas_id for qas_id, has_answer in qas_id_to_has_answer.items() if has_answer]
    no_answer_qids = [qas_id for qas_id, has_answer in qas_id_to_has_answer.items() if not has_answer]

    if no_answer_probs is None:
        no_answer_probs = {k: 0.0 for k in preds}

    exact, f1 = get_raw_scores(examples, preds)

    exact_threshold = apply_no_ans_threshold(
        exact, no_answer_probs, qas_id_to_has_answer, no_answer_probability_threshold
    )
    f1_threshold = apply_no_ans_threshold(f1, no_answer_probs, qas_id_to_has_answer, no_answer_probability_threshold)

    evaluation = make_eval_dict(exact_threshold, f1_threshold)

    if has_answer_qids:
        has_ans_eval = make_eval_dict(exact_threshold, f1_threshold, qid_list=has_answer_qids)
        merge_eval(evaluation, has_ans_eval, "HasAns")

    if no_answer_qids:
        no_ans_eval = make_eval_dict(exact_threshold, f1_threshold, qid_list=no_answer_qids)
        merge_eval(evaluation, no_ans_eval, "NoAns")

    if no_answer_probs:
        find_all_best_thresh(evaluation, preds, exact, f1, no_answer_probs, qas_id_to_has_answer)

    return evaluation

My transformers version is 2.8.0.
What should I do? Any help would be appreciated.
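
One possible workaround, offered as a sketch and not as a confirmed fix, is to attach an answers attribute to each example before calling squad_evaluate. It assumes that squad_metrics.py only reads the "text" key of each answer; the helper name attach_answers is made up, and the class below is a trimmed stand-in for the real DocvqaExample so the snippet runs on its own.

```python
class DocvqaExample(object):
    """Trimmed stand-in for the repository class (illustration only)."""

    def __init__(self, qas_id, orig_answer_text=None, is_impossible=False):
        self.qas_id = qas_id
        self.orig_answer_text = orig_answer_text
        self.is_impossible = is_impossible


def attach_answers(examples):
    """Add a SQuAD-style `answers` list to every example, in place."""
    for example in examples:
        if example.is_impossible or not example.orig_answer_text:
            example.answers = []  # treated as "no answer" by bool(example.answers)
        else:
            example.answers = [{"text": example.orig_answer_text}]
    return examples


examples = attach_answers([
    DocvqaExample("q1", orig_answer_text="12 March 1998"),
    DocvqaExample("q2", is_impossible=True),
])
```

Calling attach_answers(examples) right before squad_evaluate(examples, predictions) would avoid the AttributeError, provided the assumption about the "text" key holds for the installed version.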

How to create sample_data.json?

First, thanks for making this open source! I was looking through the example and was wondering how you produced the sample_data.json file, as the OCR result .json files for DocVQA Task 1 look very different. Thanks!

The model from the Google Drive link seems corrupted

Could you please re-upload the model for the demo? Loading it fails with:
ValueError: The state dictionary of the model you are trying to load is corrupted. Are you sure it was properly saved?

I guess the model file is corrupted.

F1 and EM scores too low during evaluation

Hi, thanks for your code. When I run my trained model on the validation set, the exact match score is below 3 and the F1 score is below 10. Is this normal, and how can I improve the scores?
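
For reference when sanity-checking such numbers by hand, here is a minimal sketch of SQuAD-style exact match and token-level F1 for a single prediction. It follows the usual SQuAD normalization (lowercase, strip punctuation and articles, collapse whitespace) but is written from scratch here, so it may differ in detail from the squad_metrics.py used by the repository. Scores are on a 0 to 1 scale; multiply by 100 to compare with the figures above.

```python
import collections
import re
import string


def normalize_answer(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    return int(normalize_answer(prediction) == normalize_answer(gold))


def f1_score(prediction, gold):
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    common = collections.Counter(pred_tokens) & collections.Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Running these on a handful of predictions next to their gold answers can show whether the model output is genuinely wrong or whether something upstream, such as the answer alignment, is off.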

Usage

I notice the code is in TensorFlow, but the Google Drive link points to a PyTorch model. Will there be a TF model in the future?
Is the PyTorch model a trained model?

Question about performance on DocVQA

I used your code, but the performance on the DocVQA dataset only reaches 49. Can you provide your trained model?

Unable to post-process the model output

I am not able to post-process the model output. Please refer to the error log below.

Traceback (most recent call last)
Input In [12], in <cell line: 7>()
     23 eval_feature = features[example_index.item()]
     24 unique_id = int(eval_feature.unique_id)
---> 26 output = [to_list(output[i]) for output in outputs]
     27 print("type: ",output)
     28 start_logits, end_logits = output

Input In [12], in <listcomp>(.0)
     23 eval_feature = features[example_index.item()]
     24 unique_id = int(eval_feature.unique_id)
---> 26 output = [to_list(output[i]) for output in outputs]
     27 print("type: ",output)
     28 start_logits, end_logits = output

Input In [12], in to_list(tensor)
      4 def to_list(tensor):
----> 5     return tensor.detach().cpu().tolist()

AttributeError: 'str' object has no attribute 'detach'
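
A likely cause, though the log alone does not confirm it, is that outputs is a dict-like object rather than a tuple, as returned by newer transformers versions. Iterating over such an object yields its string keys, so output[i] is a character and has no detach method. The sketch below handles both shapes; the helper name select_example is made up, and the key names start_logits and end_logits are the ones used by the transformers question-answering outputs. Plain lists stand in for tensors so the snippet runs without PyTorch.

```python
from collections.abc import Mapping


def select_example(outputs, i):
    """Return [start_logits[i], end_logits[i]] for tuple or dict-like outputs."""
    if isinstance(outputs, Mapping):
        # Dict-like output: pick the two heads by name instead of iterating keys.
        return [outputs["start_logits"][i], outputs["end_logits"][i]]
    # Tuple-style output, as the original loop expects.
    return [output[i] for output in outputs]


dict_outputs = {"start_logits": [[1, 2], [3, 4]], "end_logits": [[5, 6], [7, 8]]}
tuple_outputs = ([[1, 2], [3, 4]], [[5, 6], [7, 8]])

start_logits, end_logits = select_example(dict_outputs, 1)
```

In the notebook, to_list would then be applied to each element returned by select_example(outputs, i) in place of iterating over outputs directly.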
