Comments (8)
I'm using T5ForConditionalGeneration as the model.
Code I used to create the dict with constraints:
import copy

# Map each token ID to the list of token IDs observed right after it.
possible_dict = {}
for sen in df['output']:
    toks = tokenizer.encode(sen)
    for i in range(len(toks) - 1):
        if toks[i] not in possible_dict:
            possible_dict[toks[i]] = [toks[i + 1]]
        elif toks[i + 1] not in possible_dict[toks[i]]:
            possible_dict[toks[i]].append(toks[i + 1])
# Token 0 (the decoder start token) may be followed by any key.
all_keys = copy.deepcopy(list(possible_dict.keys()))
possible_dict[0] = all_keys
With this, given a tokenized sentence like [0, 4, 7, 5, 8, 1], the loop produces {0: [4], 4: [7], 5: [8], 7: [5], 8: [1]}; the last two lines then replace the entry for 0 with the list of all keys.
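The example above can be checked by running the same construction on the toy token list. This is only a minimal sketch: `build_possible_dict` is a hypothetical helper name and the token IDs are the made-up ones from the post, not real T5 tokens. It shows two details that are easy to miss: the final assignment overwrites the entry for 0, and the last token (1) never becomes a key.

```python
def build_possible_dict(token_sequences):
    """Map each token ID to the list of token IDs seen right after it."""
    possible = {}
    for toks in token_sequences:
        for cur, nxt in zip(toks, toks[1:]):
            followers = possible.setdefault(cur, [])
            if nxt not in followers:
                followers.append(nxt)
    # Same final step as the snippet above: 0 may be followed by any key.
    possible[0] = list(possible.keys())
    return possible


toy = build_possible_dict([[0, 4, 7, 5, 8, 1]])
print(toy)         # {0: [0, 4, 7, 5, 8], 4: [7], 7: [5], 5: [8], 8: [1]}
print(1 in toy)    # False: looking up possible_dict[1] would raise KeyError
```

Any token that only ever appears at the end of a sentence has no entry, so a lookup on it fails unless the caller handles the missing key.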
def prefix_allowed_tokens_fn(batch_id, input_ids):
    return possible_dict[int(input_ids[-1])]
for i, (_input, _output) in enumerate(zip(df['input'], df['output'])):
    toks = tokenizer(_input, return_tensors="pt")
    input_ids = toks['input_ids'].cuda()
    attention_mask = toks['attention_mask'].cuda()
    _predict = model.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,
        num_beams=10,
        num_return_sequences=10,
        prefix_allowed_tokens_fn=prefix_allowed_tokens_fn
    )
    _predict = tokenizer.batch_decode(_predict, skip_special_tokens=True)
    _predict = list(set(_predict))
When I run this code, I get an error that occurs because 4 is not among the keys of possible_dict.
However, 4 was not in the allowed list for the token generated before it: when the generated sequence is (..., 53, 4), I can see that 4 is not in possible_dict[53]. Do you have any idea why the model would generate tokens that are not in the constrained list?
Thanks!
Hello, can you post a piece of code where you observe a bug?
@nicola-decao I think I'm getting the issue when num_beams is larger than 1. It sometimes generates tokens that are not in the pre-constructed constraint dict. I haven't run into the same issue with num_beams=1. Is this intended?
Thanks!
@amy-hyunji you are defining a Trie with only one possible output. This is why it fails when num_beams is larger than 1. The model cannot possibly predict any other sequence.
@nicola-decao Hi, I have one additional question. I tried increasing the beam size for the case where the Trie has only one possible path, as in the issue at the top. I noticed that in the _get_from_trie function in genre/trie.py, if the corresponding input_ids is not in the trie_dict, the code returns an empty list. (Line 79 in 1e24c13)
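The behaviour described in this comment, a prefix tree that returns an empty list for a prefix it does not contain, can be illustrated with a small stand-alone sketch. `ToyTrie` and `next_tokens` are hypothetical names chosen for this example; this is not the code in genre/trie.py, only an assumption about how such a lookup typically works.

```python
class ToyTrie:
    """A tiny prefix tree over token-ID sequences, stored as nested dicts."""

    def __init__(self, sequences=()):
        self.root = {}
        for seq in sequences:
            self.add(seq)

    def add(self, seq):
        node = self.root
        for tok in seq:
            node = node.setdefault(tok, {})

    def next_tokens(self, prefix):
        """Return the tokens that may follow `prefix`, or [] if it is unknown."""
        node = self.root
        for tok in prefix:
            if tok not in node:
                return []  # prefix left the trie: nothing is allowed
            node = node[tok]
        return list(node.keys())


trie = ToyTrie([[0, 4, 7, 1], [0, 4, 9, 1]])
print(trie.next_tokens([0, 4]))        # [7, 9]
print(trie.next_tokens([0, 5]))        # []
print(trie.next_tokens([0, 4, 7, 1]))  # []
```

An empty list is returned both for a prefix outside the trie and for a complete sequence, so the caller cannot tell the two cases apart without extra bookkeeping.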
genre/trie.py is not responsible for generation: it is just an implementation of a prefix tree. If you are using the Hugging Face T5 and have issues with generation, this is not the right repository for these questions; please open an issue on the transformers repository instead.
What I believe is happening is that the beam search function selects the top-k tokens to predict, and if all tokens have zero probability (i.e. when the code returns an empty list), it still has to select something, so it selects the first k tokens in the vocabulary.
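The hypothesis above can be illustrated with a toy top-k selection in plain Python. This is a sketch under stated assumptions, not the transformers implementation: `constrained_top_k` is a hypothetical function, disallowed tokens are masked to negative infinity, and ties are broken by vocabulary order because Python's `sorted` is stable. The tie-breaking of the real beam search may differ.

```python
import math


def constrained_top_k(scores, allowed, k):
    """Pick the k best token IDs after masking disallowed ones to -inf."""
    masked = [s if i in allowed else -math.inf for i, s in enumerate(scores)]
    # sorted() is stable, so equal scores (including all -inf) keep index order.
    order = sorted(range(len(masked)), key=lambda i: -masked[i])
    return order[:k]


scores = [0.1, 2.0, -1.0, 0.5, 3.0]

print(constrained_top_k(scores, {3, 4}, 2))  # [4, 3]: both picks are allowed
print(constrained_top_k(scores, {4}, 3))     # [4, 0, 1]: only one allowed token, the rest are filler
print(constrained_top_k(scores, set(), 2))   # [0, 1]: nothing allowed, first k IDs win
```

Whenever k exceeds the number of allowed tokens, the selection is padded with tokens outside the constraint, which matches the symptom reported in this thread for num_beams larger than 1.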
Hi @amy-hyunji, I am facing the same issue (the model generates a token outside of the constraint when num_beams > 1). Were you able to find a workaround for this problem? Any ideas are greatly appreciated, thank you very much!
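One possible way to avoid the KeyError from the first post is to give the constraint function a fallback for tokens that have no entry in the dict. The sketch below is an assumption, not something proposed in this thread: `make_prefix_fn` is a hypothetical helper, and the fallback of 1 assumes the T5 end-of-sequence token ID. It only keeps the lookup from crashing; it does not stop beam search from carrying low-scoring hypotheses that left the constraint.

```python
def make_prefix_fn(possible_dict, fallback_token_id=1):
    """Build a prefix_allowed_tokens_fn that tolerates unknown last tokens.

    fallback_token_id=1 is an assumption (the EOS ID of T5 tokenizers).
    """

    def prefix_allowed_tokens_fn(batch_id, input_ids):
        last = int(input_ids[-1])
        # Unknown token: only allow the fallback so the hypothesis can end.
        return possible_dict.get(last, [fallback_token_id])

    return prefix_allowed_tokens_fn


fn = make_prefix_fn({0: [4], 4: [7], 7: [1]})
print(fn(0, [0, 4]))   # [7]
print(fn(0, [0, 53]))  # [1]: 53 has no entry, so fall back to EOS
```

The returned function has the same (batch_id, input_ids) signature as the one in the first post, so it could be passed to model.generate in the same way.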
@bryanzhou008 There is no problem here. Please read the discussion.