Hi! I ran the mGENRE example in the readme

NameError: name 'batched_hypos' is not defined (mGENRE)

mrpeerat commented on July 18, 2024

NameError: name 'batched_hypos' is not defined (mGENRE)

Comments (5)

nicola-decao commented on July 18, 2024

Can you post the full error stack?

mrpeerat commented on July 18, 2024

Sure.

2023-02-03 09:13:50 | INFO | fairseq.tasks.fairseq_task | can_reuse_epoch_itr = False
2023-02-03 09:13:50 | INFO | fairseq.tasks.fairseq_task | reuse_dataloader = True
2023-02-03 09:13:50 | INFO | fairseq.tasks.fairseq_task | rebuild_batches = False
2023-02-03 09:13:50 | INFO | fairseq.tasks.fairseq_task | creating new batches for epoch 1
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In [137], line 1
----> 1 model.sample(
      2     sentences=["[START] Einstein [END] era un fisico tedesco."],
      3     # Italian for "[START] Einstein [END] was a German physicist."
      4     prefix_allowed_tokens_fn=lambda batch_id, sent: [
      5         e for e in trie.get(sent.tolist()) if e < len(model.task.target_dictionary)
      6     ],
      7     text_to_id=lambda x: max(lang_title2wikidataID[
      8         tuple(reversed(x.split(" >> ")))
      9     ], key=lambda y: int(y[1:])),
     10     marginalize=True,
     11 )

File ~/GENRE/genre/fairseq_model.py:53, in _GENREHubInterface.sample(self, sentences, beam, verbose, text_to_id, marginalize, marginalize_lenpen, max_len_a, max_len_b, **kwargs)
     36 batched_hypos = self.generate(
     37     tokenized_sentences,
     38     beam,
   (...)
     42     **kwargs,
     43 )
     45 outputs = [
     46     [
     47         {"text": self.decode(hypo["tokens"]), "score": hypo["score"]}
   (...)
     50     for hypos in batched_hypos
     51 ]
---> 53 outputs = post_process_wikidata(
     54     outputs, text_to_id=text_to_id, marginalize=marginalize
     55 )
     57 return outputs

File ~/GENRE/genre/utils.py:492, in post_process_wikidata(outputs, text_to_id, marginalize)
    486 outputs = [
    487     [{**hypo, "id": text_to_id(hypo["text"])} for hypo in hypos]
    488     for hypos in outputs
    489 ]
    491 if marginalize:
--> 492     for (i, hypos), hypos_tok in zip(enumerate(outputs), batched_hypos):
    493         outputs_dict = defaultdict(list)
    494         for hypo, hypo_tok in zip(hypos, hypos_tok):

NameError: name 'batched_hypos' is not defined

wanyanbin1998y commented on July 18, 2024

Has the problem been solved? How did you solve it?

highly0 commented on July 18, 2024

Same issue. Any update?

EmanuelaBoros commented on July 18, 2024

The fix is to make post_process_wikidata (in genre/utils.py) accept batched_hypos as an argument:

# Imports this function needs.
from collections import defaultdict

import torch


def post_process_wikidata(outputs, text_to_id=False, marginalize=False,
                          batched_hypos=None, marginalize_lenpen=0.5):

    if text_to_id:
        # Attach an ID to every decoded hypothesis.
        outputs = [
            [{**hypo, "id": text_to_id(hypo["text"])} for hypo in hypos]
            for hypos in outputs
        ]

        if marginalize:
            # batched_hypos must be passed in by the caller when marginalize is set;
            # it holds the token tensors used to compute hypothesis lengths.
            for (i, hypos), hypos_tok in zip(enumerate(outputs), batched_hypos):
                # Group hypotheses that resolve to the same ID.
                outputs_dict = defaultdict(list)
                for hypo, hypo_tok in zip(hypos, hypos_tok):
                    outputs_dict[hypo["id"]].append(
                        {**hypo, "len": len(hypo_tok["tokens"])}
                    )

                # One entry per ID, sorted by the combined score (best first).
                outputs[i] = sorted(
                    [
                        {
                            "id": _id,
                            "texts": [hypo["text"] for hypo in hypos],
                            "scores": torch.stack([hypo["score"] for hypo in hypos]),
                            "score": torch.stack(
                                [
                                    hypo["score"]
                                    * hypo["len"]
                                    / (hypo["len"] ** marginalize_lenpen)
                                    for hypo in hypos
                                ]
                            ).logsumexp(-1),
                        }
                        for _id, hypos in outputs_dict.items()
                    ],
                    key=lambda x: x["score"],
                    reverse=True,
                )

    return outputs
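
To make the combined "score" above easier to follow, here is a rough pure-Python sketch of the same arithmetic without torch. The helper name marginal_score and the sample hypotheses are made up for illustration; only the expression score * len / len ** lenpen followed by a log-sum-exp comes from the function above.

```python
import math


def marginal_score(hypos, lenpen=0.5):
    """Combine the scores of hypotheses that map to the same ID (hypothetical helper)."""
    # Each term mirrors score * len / (len ** marginalize_lenpen) from the patch.
    terms = [h["score"] * h["len"] / (h["len"] ** lenpen) for h in hypos]
    # Numerically stable log-sum-exp over the rescaled scores.
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Invented values, for illustration only.
hypos = [
    {"text": "Albert Einstein >> it", "score": -0.5, "len": 4},
    {"text": "Albert Einstein >> en", "score": -1.0, "len": 4},
]
print(marginal_score(hypos))
```

With a single hypothesis per ID the result is just that hypothesis's rescaled score; with several, the log-sum-exp adds their contributions together.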

Then call post_process_wikidata from _GENREHubInterface.sample with:

outputs = post_process_wikidata(
    outputs,
    text_to_id=text_to_id,
    marginalize=marginalize,
    batched_hypos=batched_hypos,
    marginalize_lenpen=marginalize_lenpen,
)
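
For anyone wondering why the original code fails: as far as I can tell from the traceback, batched_hypos is a local variable of sample(), so post_process_wikidata cannot see it unless it is passed in. A stripped-down sketch of the same situation (the function names and values here are placeholders, not GENRE code):

```python
def post_process_broken(outputs):
    # batched_hypos is neither a parameter, a local, nor a global here,
    # so Python raises NameError when this line runs.
    return list(zip(outputs, batched_hypos))


def post_process_fixed(outputs, batched_hypos=None):
    # The caller now hands the value over explicitly.
    return list(zip(outputs, batched_hypos))


def sample():
    batched_hypos = ["tokens-0"]  # local to sample(), invisible to callees
    outputs = ["text-0"]
    try:
        post_process_broken(outputs)
    except NameError as err:
        print("broken:", err)
    return post_process_fixed(outputs, batched_hypos=batched_hypos)


print("fixed:", sample())
```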
