get_entity_spans and model.sample output in entity linking

facebookresearch commented on May 24, 2024

get_entity_spans and model.sample output in entity linking

Comments (4)

nicola-decao commented on May 24, 2024

Try:

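from genre.utils import get_entity_spans_finalize


def _get_entity_spans(
    model,
    input_sentences,
    prefix_allowed_tokens_fn,
    redirections=None,
):
    # Generate directly with model.sample, without the pre- and
    # post-processing steps that get_entity_spans applies.
    output_sentences = model.sample(
        input_sentences,
        prefix_allowed_tokens_fn=prefix_allowed_tokens_fn,
    )

    # Keep only the top hypothesis for each input sentence.
    output_sentences = [e[0]["text"] for e in output_sentences]

    # Turn the generated text into entity spans over the inputs.
    return get_entity_spans_finalize(
        input_sentences, output_sentences, redirections=redirections
    )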
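The list comprehension in the snippet above keeps only the first hypothesis for each input. Here is a minimal sketch of that step on hand-written data. It assumes model.sample returns one list of hypothesis dicts per input sentence, best first, each with a "text" key (the shape the snippet relies on). The sample strings, the "score" field, and the helper name top_texts are made up for illustration.

```python
# Hypothetical stand-in for the return value of model.sample:
# one list of hypotheses per input sentence, best hypothesis first.
# The strings and the "score" values are invented, not real model output.
fake_output = [
    [
        {"text": "first sentence, best hypothesis", "score": -0.1},
        {"text": "first sentence, second hypothesis", "score": -0.7},
    ],
    [
        {"text": "second sentence, best hypothesis", "score": -0.3},
    ],
]


def top_texts(sample_output):
    """Return the text of the first hypothesis for every input sentence."""
    return [hypotheses[0]["text"] for hypotheses in sample_output]


print(top_texts(fake_output))
```

The result has exactly one string per input sentence, which is the form the snippet passes to get_entity_spans_finalize.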

loretoparisi commented on May 24, 2024

It works perfectly, thank you. I just added one import: from genre.utils import get_entity_spans_finalize.

nicola-decao commented on May 24, 2024

get_entity_spans first calls a pre-processing step on the inputs:

GENRE/genre/utils.py

Line 97 in 3ac9343

def get_entity_spans_pre_processing(sentences):

and then a post-processing step of the outputs:

GENRE/genre/utils.py

Line 111 in 3ac9343

def get_entity_spans_post_processing(sentences):

To get the same output from model.sample, apply the same pre-processing to its inputs and the same post-processing to its outputs.
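The wrapping described above can be sketched as a small helper. This is only an illustration: sample_with_processing is a hypothetical name, and the pre- and post-processing callables are passed in as arguments so the sketch does not depend on anything beyond what the thread shows. It assumes each callable takes a list of sentences and returns a list of sentences, as the two signatures quoted above suggest, and that model.sample returns a list of hypothesis dicts with a "text" key per input.

```python
def sample_with_processing(model, sentences, pre_process, post_process, **sample_kwargs):
    """Call model.sample with a pre-processing step before and a post-processing step after.

    pre_process and post_process are callables taking and returning a list of
    strings. With GENRE, the intended candidates are
    get_entity_spans_pre_processing and get_entity_spans_post_processing
    from genre/utils.py (assumed here to behave that way).
    """
    prepared = pre_process(sentences)
    raw = model.sample(prepared, **sample_kwargs)
    # Assumed output shape: one list of hypotheses per input, best first,
    # each hypothesis a dict with a "text" key.
    best = [hypotheses[0]["text"] for hypotheses in raw]
    return post_process(best)


class EchoModel:
    """Toy stand-in for a GENRE model, used only to exercise the helper."""

    def sample(self, sentences, **kwargs):
        return [[{"text": s.upper()}] for s in sentences]


demo = sample_with_processing(
    EchoModel(),
    ["  some text  "],
    pre_process=lambda ss: [s.strip() for s in ss],
    post_process=lambda ss: [s + "." for s in ss],
)
print(demo)
```

With a real model, prefix_allowed_tokens_fn can be forwarded through sample_kwargs. Note that the constraint function may need to be built from the pre-processed sentences rather than the raw ones; that detail is not confirmed in this thread.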

loretoparisi commented on May 24, 2024

@nicola-decao thank you. I would like get_entity_spans to produce the same output as model.sample. Currently, the output of model.sample seems more accurate to me than that of get_entity_spans. How can I "force" get_entity_spans to return the same result as model.sample?

Examples

sentences = ["Tired of the lies? Tired of the spin? Are you ready to hear the hard-hitting truth in comprehensive, conservative, principled fashion? The Ben Shapiro Show brings you all the news you need to know in the most fast moving daily program in America. Ben brutally breaks down the culture and never gives an inch! Monday thru Friday."]

# model.sample
prefix_allowed_tokens_fn = get_prefix_allowed_tokens_fn(model, sentences)
out = model.sample(
    sentences,
    prefix_allowed_tokens_fn=prefix_allowed_tokens_fn,
)
print(out)

# get_entity_spans
entity_spans = get_entity_spans(model, sentences)
print(get_markdown(sentences, entity_spans))