Comments (9)
I have the same problem, @glample. Could you help me out?
from unsupervisedmt.
Hi,
I'm not sure about your issue, since I have not tried en-zh. However, the approach should work for en-zh; I know that this paper (https://arxiv.org/pdf/1804.09057.pdf) did it. Maybe you could try using the same setup, datasets, and preprocessing?
How big are your monolingual corpora?
Also, what is ch in wiki.ch.300.vec.20w? Isn't that the language code for Chamorro rather than Chinese, which is zh? You can use your own corpus, and it's probably better, because that way the embeddings match your tokenization and text preprocessing. But if your corpora are small, or if you don't get good P@1 accuracy, then the fastText ones are probably better.
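For reference, P@1 here means the fraction of dictionary entries whose top-ranked retrieved translation is correct. Below is a minimal sketch of that check using plain cosine nearest-neighbor retrieval on toy vectors. The function names, the toy embeddings, and the gold dictionary are all made up for illustration; this is not MUSE's evaluation code, and other retrieval criteria (such as CSLS) are not shown.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def precision_at_1(src_emb, tgt_emb, gold):
    """src_emb / tgt_emb: word -> vector, assumed already mapped into a shared space.
    gold: source word -> set of acceptable target words (hypothetical test dictionary)."""
    hits = 0
    total = 0
    for word, translations in gold.items():
        if word not in src_emb:
            continue  # skip entries without a source embedding
        total += 1
        # Retrieve the single nearest target word by cosine similarity.
        best = max(tgt_emb, key=lambda t: cosine(src_emb[word], tgt_emb[t]))
        if best in translations:
            hits += 1
    return hits / total if total else 0.0

# Toy data, for illustration only.
src_emb = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
tgt_emb = {"猫": [0.9, 0.1], "狗": [0.2, 0.8], "鸟": [0.7, 0.7]}
gold = {"cat": {"猫"}, "dog": {"狗"}}

score = precision_at_1(src_emb, tgt_emb, gold)
print(score)
```

With real embeddings the target vocabulary is large, so a vectorized implementation would be needed in practice; the loop above only shows what is being measured.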
Also, can you try MUSE with the script supervised.py --dico_train identical_char instead of unsupervised.py? It aligns the embeddings using words that are identical in both languages as anchor points. This sometimes works better than the adversarial approach, even for distant languages.
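To make the anchor-point idea concrete, here is a rough sketch of how a seed dictionary of identical strings could be built from two vocabularies. The helper name and the toy vocabularies are hypothetical, and this is not MUSE's actual implementation; it only illustrates which pairs would serve as anchors.

```python
def identical_string_pairs(src_vocab, tgt_vocab):
    """Return (word, word) pairs for strings present in both vocabularies.
    Order follows the source vocabulary."""
    tgt = set(tgt_vocab)
    return [(w, w) for w in src_vocab if w in tgt]

# Toy vocabularies, for illustration only.
en_vocab = ["the", "2018", "GPU", "cat", "iPhone"]
zh_vocab = ["的", "2018", "GPU", "猫", "iPhone"]

pairs = identical_string_pairs(en_vocab, zh_vocab)
print(pairs)
```

For a pair like en-zh the shared strings are likely to be mostly numbers, Latin-script acronyms, and names, so it is worth checking how many anchors you actually get before relying on them.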
OK, thanks.
Hi,
I ran the same task on zh-en, but the trained models cannot be used to translate the test sets. Why? It looks like this:
The MUSE train.log is as follows:
train.log
This problem has been bothering me for a long time. Can you give me some guidance? @glample
The MUSE train.log seems reasonable. I'm not sure about the message returned by Moses; it's a Moses-specific issue. Is it just a warning? What is at the end of the log? Did you have a look at the phrase table you generated? Does it look good?
Thank you for your reply.
The end of the log is like this:
And I think the phrase table looks good.
Hi,
@socaty @cocaer
Thanks very much!!!
When I generated the phrase table and tried translating the test sentences, the following error occurred.
Can you give me some advice?
Linking phrase-table path...
Translating test sentences...
Defined parameters (per moses.ini or switch):
config: /data/home/super/mt/dataset/muti-domain/unmt/moses_train_en-zh/model/moses.ini
distortion-limit: 6
feature: UnknownWordPenalty WordPenalty PhrasePenalty PhraseDictionaryMemory name=TranslationModel0 num-features=2 path=/data/home/super/mt/dataset/muti-domain/unmt/moses_train_en-zh/model/phrase-table.gz input-factor=0 output-factor=0 Distortion KENLM name=LM0 factor=0 path=/data/home/super/mt/dataset/muti-domain/unmt/data/zh.lm.blm order=5
input-factors: 0
mapping: 0 T 0
threads: 48
weight: UnknownWordPenalty0= 1 WordPenalty0= -1 PhrasePenalty0= 0.2 TranslationModel0= 0.2 0.2 Distortion0= 0.3 LM0= 0.5
line=UnknownWordPenalty
FeatureFunction: UnknownWordPenalty0 start: 0 end: 0
line=WordPenalty
FeatureFunction: WordPenalty0 start: 1 end: 1
line=PhrasePenalty
FeatureFunction: PhrasePenalty0 start: 2 end: 2
line=PhraseDictionaryMemory name=TranslationModel0 num-features=2 path=/data/home/super/mt/dataset/muti-domain/unmt/moses_train_en-zh/model/phrase-table.gz input-factor=0 output-factor=0
FeatureFunction: TranslationModel0 start: 3 end: 4
line=Distortion
FeatureFunction: Distortion0 start: 5 end: 5
line=KENLM name=LM0 factor=0 path=/data/home/super/mt/dataset/muti-domain/unmt/data/zh.lm.blm order=5
FeatureFunction: LM0 start: 6 end: 6
Loading UnknownWordPenalty0
Loading WordPenalty0
Loading PhrasePenalty0
Loading Distortion0
Loading LM0
Loading TranslationModel0
Start loading text phrase table. Moses format : [0.502] seconds
Reading /data/home/super/mt/dataset/muti-domain/unmt/moses_train_en-zh/model/phrase-table.gz
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
Exception: moses/TranslationModel/RuleTable/LoaderStandard.cpp:202 in bool Moses::RuleTableLoaderStandard::Load(const Moses::AllOptions&, Moses::FormatType, const std::vector&, const std::vector&, const string&, size_t, Moses::RuleTableTrie&) threw util::Exception because `isnan(score)'.
Bad score -- on line 119910
@wingsyuan
Sorry, I am not sure what is causing your problem.
Could something have gone wrong when training the phrase table? Please check your training process, and I will send my training script to your email soon.
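Since Moses complains about `isnan(score)` and a bad score on a specific line, one way to check the phrase table is to scan it for entries whose scores are not finite numbers. The sketch below assumes the usual text format with fields separated by " ||| " and the scores in the third field; the function names are made up for illustration, and the 1-based line numbering is an assumption that may or may not match how Moses counts lines.

```python
import gzip
import math

def find_bad_scores(lines):
    """Yield (line_number, line) for entries whose score field is missing
    or contains a value that is NaN, infinite, or not a number at all."""
    for n, line in enumerate(lines, start=1):  # 1-based numbering (assumption)
        fields = line.rstrip("\n").split(" ||| ")
        if len(fields) < 3:
            yield n, line  # malformed entry: no score field
            continue
        for tok in fields[2].split():
            try:
                ok = math.isfinite(float(tok))
            except ValueError:
                ok = False
            if not ok:
                yield n, line
                break

def scan_phrase_table(path):
    """Scan a gzipped phrase table, e.g. the phrase-table.gz from moses.ini."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return list(find_bad_scores(f))

# Toy entries, for illustration only.
sample = [
    "the cat ||| 猫 ||| 0.5 0.25\n",
    "dog ||| 狗 ||| nan 0.1\n",
    "bird ||| 鸟 ||| 0.3 inf\n",
]
bad = [n for n, _ in find_bad_scores(sample)]
print(bad)
```

If the scan reports bad entries, the phrases on those lines may point to where the scoring step went wrong, for example a word with a missing or all-zero embedding. That is only one possible cause, not a diagnosis.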
@wingsyuan I have the same problem here. Have you solved this?