Got strange result while training translation from zh to en. (seq2seq-con)

sachin19 commented on July 21, 2024

Got strange result while training translation from zh to en.

Comments (3)

EuphoriaYan commented on July 21, 2024

Well, I found that during training, -logcmk(kappa) is always around -420 and never changes, while torch.log(1 + kappa) * (self.lambda_vmf - (output_emb_unitnorm * target_emb_unitnorm).sum(dim=-1)) decreases from about 0.5. Is that abnormal?
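For reference, here is a minimal sketch of the textbook von Mises-Fisher log-normalizer, log C_m(kappa) = (m/2 - 1) log(kappa) - (m/2) log(2*pi) - log I_{m/2-1}(kappa), evaluated with a plain power series for the Bessel function. The names log_bessel_i and log_cmk below are written for illustration only and are not taken from the seq2seq-con code, whose logcmk may be implemented differently. Under this formula, with emb_size = 300 the negated term sits at roughly -428 for small kappa and moves by less than 1 for kappa up to 20, so a nearly flat value in that range is plausibly expected rather than a bug.

```python
import math


def log_bessel_i(order, x, terms=500):
    """log of the modified Bessel function I_order(x), via its power series.

    I_v(x) = sum_k (x/2)^(2k+v) / (k! * Gamma(k+v+1)), summed in log space
    so that large orders (e.g. v = 149) do not overflow or underflow.
    """
    log_half_x = math.log(x / 2.0)
    log_terms = [
        (2 * k + order) * log_half_x
        - math.lgamma(k + 1)
        - math.lgamma(k + order + 1)
        for k in range(terms)
    ]
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def log_cmk(kappa, emb_size):
    """Textbook vMF log-normalizer log C_m(kappa) for m = emb_size (illustrative)."""
    order = emb_size / 2.0 - 1.0
    return (
        order * math.log(kappa)
        - (emb_size / 2.0) * math.log(2.0 * math.pi)
        - log_bessel_i(order, kappa)
    )


if __name__ == "__main__":
    # Print the negated normalizer for a few concentrations at dimension 300.
    for kappa in (1.0, 5.0, 20.0):
        print(kappa, -log_cmk(kappa, 300))
```

The series is adequate for the moderate kappa values shown here; for very large kappa a scaled Bessel routine from a numerical library would be a safer choice.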

EuphoriaYan commented on July 21, 2024

I also tried passing -approximate_vmf in the args, and found that logcmkappox(kappa, emb_size) is always around -690 and never changes.

Sachin19 commented on July 21, 2024

Hi EuphoriaYan,

Apologies for such a long delay in my reply.

As you can see, the acc is decreasing and the perplexity is always zero.

Sorry, the statistics are not named correctly; the names are carried over from the softmax-based models. "acc" here means cosine distance, and "x-ent" means the vMF loss. Perplexity is computed from the reported vMF loss and shows up as 0 because the vMF values are highly negative, so it is essentially meaningless. The only two values worth monitoring are "acc" and "x-ent", and their trend looks fine, since both should be decreasing. Also, if you could let me know your final validation loss on this training set, I can judge whether the model trained well. With good token embeddings, a cosine ("acc") value below roughly 0.25 usually results in decent MT performance (for English).
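To make the two monitored quantities concrete, here is a minimal pure-Python sketch. It assumes that "cosine distance" means 1 minus the cosine similarity between the predicted and target embeddings, and that kappa is the norm of the predicted vector; both are assumptions for illustration, not confirmed details of the repository. The function names and the lambda_vmf value of 0.2 are hypothetical. The second function mirrors the expression quoted earlier in the thread, log(1 + kappa) * (lambda_vmf - cosine).

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(pred, target):
    """1 - cosine similarity; 0 when the vectors point the same way (assumed meaning of "acc")."""
    cosine = _dot(pred, target) / (math.sqrt(_dot(pred, pred)) * math.sqrt(_dot(target, target)))
    return 1.0 - cosine


def vmf_reg_term(pred, target, lambda_vmf=0.2):
    """log(1 + kappa) * (lambda_vmf - cosine), with kappa assumed to be the norm of pred."""
    kappa = math.sqrt(_dot(pred, pred))
    cosine = 1.0 - cosine_distance(pred, target)
    return math.log(1.0 + kappa) * (lambda_vmf - cosine)


if __name__ == "__main__":
    target = [0.0, 1.0, 0.0]
    far = [1.0, 0.2, 0.0]    # poorly aligned prediction
    near = [0.1, 1.0, 0.0]   # well aligned prediction
    print(cosine_distance(far, target), vmf_reg_term(far, target))
    print(cosine_distance(near, target), vmf_reg_term(near, target))
```

As the prediction aligns with the target, the cosine distance falls toward 0 and the regularized term falls as well, which is the downward trend described above.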

./fasttext skipgram -input valid.en.bpetok -output emb/en -dim 300 -thread 8

You should train the embeddings on a larger corpus (the training set), not the validation set. This method needs good-quality embeddings to work. If you switch the input to train.en.bpetok, you should be able to get better results. The English token embeddings (without BPE) that I used are provided here

/path/to/moses/scripts/tokenizer/tokenizer.perl -l zh -a -no-escape -threads 20 < train.zh > train.tok.zh

I'm not 100% sure that Moses supports Chinese tokenization, so this could be an issue.

Hope these suggestions resolve your issues :)

Sachin
