Hi,
Thanks for your awesome project! With t5-small, I got the following baseline MT result on the validation set:
"eval_bleu": 26.1472,
"eval_gen_len": 42.1916,
"eval_loss": 1.4190454483032227,
"eval_runtime": 216.1581,
"eval_samples": 1999,
"eval_samples_per_second": 9.248,
"eval_steps_per_second": 2.313
However, with KNN-MT I got a different result:
"eval_bleu": 32.0026,
"eval_gen_len": 42.1126,
"eval_loss": 0.40791189670562744,
"eval_runtime": 4053.3114,
"eval_samples": 1999,
"eval_samples_per_second": 0.493,
"eval_steps_per_second": 0.123
The evaluation is so slow (an eval_runtime of about 4053 s versus 216 s) that I wonder whether something is wrong in my shell script.
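To put the gap in numbers, here is a small sketch that compares the two reports above. The dictionary and function names are only my own illustration and are not part of the project; the values are copied from the two evaluation outputs.

```python
# Values copied from the two evaluation reports above.
baseline = {"eval_bleu": 26.1472, "eval_runtime": 216.1581, "eval_samples": 1999}
knn_mt = {"eval_bleu": 32.0026, "eval_runtime": 4053.3114, "eval_samples": 1999}


def compare(base, other):
    """Return (slowdown factor, BLEU gain, extra seconds per sample)."""
    slowdown = other["eval_runtime"] / base["eval_runtime"]
    bleu_gain = other["eval_bleu"] - base["eval_bleu"]
    extra_per_sample = (
        other["eval_runtime"] - base["eval_runtime"]
    ) / other["eval_samples"]
    return slowdown, bleu_gain, extra_per_sample


slowdown, bleu_gain, extra = compare(baseline, knn_mt)
print(
    f"slowdown: {slowdown:.1f}x, "
    f"BLEU gain: {bleu_gain:+.2f}, "
    f"extra time per sample: {extra:.2f} s"
)
```

So the KNN-MT run is roughly an order of magnitude slower per sample, which is what makes me suspect my setup.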
The KNN-MT shell script is:
meta_path=path_to_project
model_name=t5-small
model_path=path_to_all_model/${model_name}
python -u $meta_path/knn-transformers/run_translation.py \
--model_name_or_path ${model_path} \
--dataset_name wmt16 --dataset_config_name ro-en \
--per_device_eval_batch_size=4 \
--output_dir $meta_path/checkpoints-translation/$model_name-datastore \
--source_lang en --target_lang ro \
--do_eval \
--predict_with_generate \
--source_prefix "translate English to Romanian: " \
--dstore_dir $meta_path/checkpoints-translation/$model_name-datastore \
--knn_temp 50 --k 32 --lmbda 0.25 \
--knn
The original MT shell script is:
meta_path=path_to_project
model_name=t5-small
model_path=path_to_all_model/${model_name}
python -u ${meta_path}/knn-transformers/run_translation.py \
--model_name_or_path ${model_path} \
--dataset_name wmt16 --dataset_config_name ro-en \
--per_device_eval_batch_size=4 \
--output_dir $meta_path/checkpoints-translation/$model_name \
--source_lang en --target_lang ro \
--do_eval \
--predict_with_generate \
--source_prefix "translate English to Romanian: "
I noticed that if I remove --predict_with_generate from the KNN-MT script, the speed becomes the same as the original MT, and the eval_loss is also the same as the original MT. But then I cannot get eval_bleu. For example:
eval_loss = 0.4079
eval_runtime = 0:02:16.85
eval_samples = 1999
eval_samples_per_second = 14.606
eval_steps_per_second = 3.653
However, setting --predict_with_generate does not affect the speed of the original MT. Could you please give me some guidance on how to solve this problem?
Thanks!