
neulab / knn-transformers

262.0 4.0 23.0 1.66 MB

PyTorch + HuggingFace code for RetoMaton: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022), including an implementation of kNN-LM and kNN-MT

License: MIT License

Python 100.00%
Topics: huggingface, knn-lm, language-models, nearest-neighbor, neuro-symbolic, pytorch

knn-transformers's People

Contributors

urialon


knn-transformers's Issues

saving-a-datastore-for-knn-mt section in README is missing a proper dstore_size parameter

The saving-a-datastore-for-knn-mt section in the README is missing the proper --dstore_size parameter. The flag to add:

--dstore_size 26565876 \

current version

MODEL=t5-small

python -u run_translation.py  \
  --model_name_or_path ${MODEL} \
  --dataset_name wmt16 --dataset_config_name ro-en \
  --per_device_train_batch_size 4 --per_device_eval_batch_size=4 \
  --output_dir checkpoints-translation/${MODEL} \
  --source_lang en --target_lang ro \
  --dstore_dir checkpoints-translation/${MODEL} \
  --save_knnlm_dstore --do_eval --eval_subset train \
  --source_prefix "translate English to Romanian: "

correct version (maybe)

MODEL=t5-small

python -u run_translation.py  \
  --model_name_or_path ${MODEL} \
  --dataset_name wmt16 --dataset_config_name ro-en \
  --per_device_train_batch_size 4 --per_device_eval_batch_size=4 \
  --output_dir checkpoints-translation/${MODEL} \
  --source_lang en --target_lang ro \
  --dstore_dir checkpoints-translation/${MODEL} \
  --save_knnlm_dstore --do_eval --eval_subset train \
  --dstore_size 26565876 \
  --source_prefix "translate English to Romanian: "

automaton: modify database

Is it possible to modify the database for the automaton at inference time, i.e., add new data on the fly? Or do we need to reconstruct the automaton whenever the database changes?

I am thinking of a scenario where additional information is collected while interacting with the LM.
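To illustrate what on-the-fly extension would involve, here is a minimal, hypothetical sketch (not the repository's actual implementation): a flat datastore of (hidden-state, next-token) pairs can accept new entries at inference time with no rebuild, whereas RetoMaton's precomputed automaton pointers would need to be recomputed for the new entries.

```python
# Toy flat datastore: parallel lists of key vectors and target tokens.
# A flat store can be extended at inference time; RetoMaton's automaton
# pointers, by contrast, would have to be recomputed for new entries.
keys, values = [], []

def add_entry(key, token):
    """Append a new (hidden-state, next-token) pair on the fly."""
    keys.append(key)
    values.append(token)

def knn_lookup(query, k=2):
    """Return the k nearest stored tokens by squared L2 distance."""
    dists = [sum((q - x) ** 2 for q, x in zip(query, key)) for key in keys]
    order = sorted(range(len(keys)), key=lambda i: dists[i])[:k]
    return [values[i] for i in order]

add_entry([0.0, 0.0], "the")
add_entry([1.0, 1.0], "cat")
add_entry([0.9, 1.1], "cats")  # added "during inference" -- no rebuild needed
```

A real kNN-MT datastore would use a FAISS index over model hidden states rather than Python lists, but the append-vs-rebuild distinction is the same.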

How to apply knn-transformers to a custom pretrained machine translation model?

Hi @urialon ,

Thank you very much for releasing the source code that applies kNN to the machine translation task. However, only pre-trained models available on the Hugging Face hub seem to be supported.
Recently, I designed and trained a custom NMT model for low-resource machine translation using the PyTorch and Transformers libraries. At inference time, I call my_pretrained_model.generate(input_ids), as in standard translation, to translate a test sentence.
I would like to use your source code to apply kNN to my_pretrained_model at inference time, but since my model is custom, I am not sure how.
Could you guide me in applying your source code to my custom pre-trained NMT model?
Many thanks for your help!

NonMatchingSplitsSizesError

Hi, I encountered a NonMatchingSplitsSizesError when evaluating the fine-tuned model gpt2-finetuned-wikitext103. The same error also appeared when saving a datastore and building the FAISS index myself. Could you indicate how to solve this issue? Thank you very much!


Could KNNSaver support Multi-GPU strategies like DDP?

Hi, @urialon
I am trying to evaluate a model with the DDP strategy but hit an error, because such strategies try to write to the datastore asynchronously.

With a single GPU everything works well, but it is quite slow.

Any idea? Thanks!

The size of the dstore

Hi, I'm wondering how to set knn_args.dstore_size when I use my own data to construct the datastore.
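As a rough guide (an assumption based on how the datastore is described, not an official recipe): dstore_size should equal the total number of (context, next-token) pairs saved, i.e., the token count of the data used to build the datastore. A toy estimate, with whitespace splitting standing in for the model's real tokenizer:

```python
# Hypothetical sketch: estimate dstore_size as the total token count of
# the corpus used to build the datastore. Whitespace split is only a
# stand-in for the model's actual tokenizer, which should be used in
# practice so the count matches what is written to the datastore.
def estimate_dstore_size(corpus_lines, tokenize=str.split):
    return sum(len(tokenize(line)) for line in corpus_lines)

corpus = ["the cat sat on the mat", "dogs chase cats"]
size = estimate_dstore_size(corpus)
print(size)  # 9 tokens -> pass this as --dstore_size
```

With the real tokenizer, run the same count over the split used for --save_knnlm_dstore and pass the result as --dstore_size.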

Unsupported operand type(s) in LM evaluation

Hi, may I ask a question about evaluating kNN-LM and RetoMaton? I used the preprocessed Wikitext-103 datastores and FAISS indexes for gpt2 and distilgpt2 (downloaded from the link) and encountered an "unsupported operand type(s)" error in both cases. Could you suggest possible solutions?

Thank you very much for your kind help!


Performance on neulab/gpt2-large-finetuned-wikitext103

Hi, @urialon
Thanks for your great work.
I just tested neulab/gpt2-large-finetuned-wikitext103 with and without --knn but could not observe an improvement:
ppl 10.5565 vs. ppl 10.6538
Any idea why? Should I tune hyperparameters such as the temperature?
Thanks.
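For intuition on why the temperature and lambda matter, here is a toy sketch of the kNN-LM interpolation (the general formula from the kNN-LM line of work, not this repository's exact code): the final distribution is lmbda * p_kNN + (1 - lmbda) * p_LM, where p_kNN is a softmax over negative neighbor distances scaled by the temperature. A poorly tuned temperature flattens or over-sharpens p_kNN, which can erase the retrieval benefit.

```python
import math

def knn_interpolate(p_lm, neighbors, vocab, lmbda=0.25, temp=1.0):
    """Interpolate LM probabilities with a kNN distribution.

    neighbors: list of (distance, token) pairs retrieved from the datastore.
    p_kNN(w) is a softmax over -distance / temp, summed per token.
    """
    weights = [math.exp(-d / temp) for d, _ in neighbors]
    z = sum(weights)
    p_knn = {w: 0.0 for w in vocab}
    for (d, tok), wgt in zip(neighbors, weights):
        p_knn[tok] += wgt / z
    return {w: lmbda * p_knn[w] + (1 - lmbda) * p_lm[w] for w in vocab}

vocab = ["cat", "dog"]
p_lm = {"cat": 0.5, "dog": 0.5}
# Two retrieved neighbors at equal distance, both voting for "cat":
p = knn_interpolate(p_lm, [(1.0, "cat"), (1.0, "cat")], vocab, lmbda=0.25)
```

Here p["cat"] becomes 0.25 * 1.0 + 0.75 * 0.5 = 0.625, so sweeping --knn_temp and --lmbda directly trades off how much the retrieved neighbors override the base LM.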

Mismatch on KNN-MT result on README

Hi.
Thanks for your awesome project! For t5-small, I got the following MT results on the validation set:

"eval_bleu": 26.1472,
"eval_gen_len": 42.1916,
"eval_loss": 1.4190454483032227,
"eval_runtime": 216.1581,
"eval_samples": 1999,
"eval_samples_per_second": 9.248,
"eval_steps_per_second": 2.313

However, for kNN-MT, I got a different result:

"eval_bleu": 32.0026,
"eval_gen_len": 42.1126,
"eval_loss": 0.40791189670562744,
"eval_runtime": 4053.3114,
"eval_samples": 1999,
"eval_samples_per_second": 0.493,
"eval_steps_per_second": 0.123

Also, the evaluation is so slow that I wonder whether something is wrong in my shell script.
The kNN-MT shell script is:

meta_path=path_to_project
model_name=t5-small
model_path=path_to_all_model/${model_name}
python -u ${meta_path}/knn-transformers/run_translation.py \
  --model_name_or_path ${model_path} \
  --dataset_name wmt16 --dataset_config_name ro-en \
  --per_device_eval_batch_size=4 \
  --output_dir ${meta_path}/checkpoints-translation/${model_name}-datastore \
  --source_lang en --target_lang ro \
  --do_eval \
  --predict_with_generate \
  --source_prefix "translate English to Romanian: " \
  --dstore_dir ${meta_path}/checkpoints-translation/${model_name}-datastore \
  --knn_temp 50 --k 32 --lmbda 0.25 \
  --knn

The original MT shell script is:

meta_path=path_to_project
model_name=t5-small
model_path=path_to_all_model/${model_name}
python -u ${meta_path}/knn-transformers/run_translation.py \
  --model_name_or_path ${model_path} \
  --dataset_name wmt16 --dataset_config_name ro-en \
  --per_device_eval_batch_size=4 \
  --output_dir ${meta_path}/checkpoints-translation/${model_name} \
  --source_lang en --target_lang ro \
  --do_eval \
  --predict_with_generate \
  --source_prefix "translate English to Romanian: "

I noticed that if I remove --predict_with_generate from the kNN-MT script, the speed matches the original MT and the eval_loss is also the same as the original MT, but then I cannot get eval_bleu. For example:

eval_loss = 0.4079
eval_runtime = 0:02:16.85
eval_samples = 1999
eval_samples_per_second = 14.606
eval_steps_per_second = 3.653.

However, setting --predict_with_generate does not affect the speed of the original MT. Could you please give some guidance on solving this problem?

Thanks!

Got Error when running kNN-MT with T5-base

Hi @urialon ,

Thanks for releasing your interesting source code. Following your guidance, I tried to run kNN-MT to evaluate the T5-base model.
The error is "OverflowError: out of range integral type conversion attempted". I tried increasing max_target_length (from 128 to 512 and 1024) but still got the error.
What should I do to fix it? Thank you for your help!
