yxuansu / NAG-BERT
[EACL'21] Non-Autoregressive Text Generation with Pre-trained Language Models
Home Page: https://arxiv.org/abs/2102.08220
License: Apache License 2.0
Hi, thanks for the (coming soon) source code.
I have a few questions about the dynamic sequence length adjustment.

1. The model predicts [eos]s to indicate the end of the sequence. But in the middle of the sequence it is still possible to generate a single [eos], e.g., "I ate an [eos] apple [eos] [eos]", and you need to remove all these intermediate [eos]s. Is this correct?
2. Why predict multiple [eos]s instead of a single [eos]? You mentioned "Once the decoded trajectory enters the [eos] state, the state transition term in S(X, Y_0) will be dominated by the transition score term t([eos], [eos])", so the point here is to make [eos] a black hole: once the decoding trajectory transits to [eos], it has no chance to get out? If this is correct, then why not simply give all [eos] -> non-[eos] transitions very negative weights and leave them fixed during training?
3. Supposing the target sequence is "I ate an apple" and the source sequence has length 9, which of the following do you use as the training target?
   - I ate an apple [eos] [eos]
   - I ate an apple [eos] [eos] [eos] [eos] [eos]

Hope I can get your reply, and thanks~
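The "black hole" alternative proposed in question 2 can be sketched directly. This is a toy illustration, not code from the repository: the vocabulary size, the [eos] index, and the transition matrix are all hypothetical. The idea is to pin the [eos] row of the transition matrix to a large negative score for every non-[eos] successor and keep that row out of gradient updates.

```python
import torch

V = 8      # toy vocabulary size (hypothetical)
EOS = 7    # toy index of the [eos] token (hypothetical)

# Hypothetical learned transition scores: trans[i, j] = score of token j following token i.
trans = torch.randn(V, V)

# Make [eos] absorbing: very negative scores for [eos] -> non-[eos] transitions,
# zero cost for staying in [eos].
NEG_INF = -1e4
with torch.no_grad():
    trans[EOS, :] = NEG_INF
    trans[EOS, EOS] = 0.0

# Keep the [eos] row fixed during training by zeroing its gradient with a hook.
trans.requires_grad_(True)

def freeze_eos_row(grad):
    grad = grad.clone()
    grad[EOS, :] = 0.0   # no update ever reaches the [eos] row
    return grad

trans.register_hook(freeze_eos_row)
```

Any Viterbi-style decoder using `trans` would then be unable to leave the [eos] state, which matches the absorbing behavior the question describes.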
In the Gigaword dataset there are some examples where the summary is longer than the source sequence; sometimes the source is a single unk word. As far as I can see in dataclass.py, such examples are dropped from the pipeline completely.
Were the ROUGE scores reported in the paper computed without those examples? If so, it is incorrect to compare the resulting scores with the baselines. For example, as far as I can see, the ROUGE scores for Concept Pointer were taken directly from its paper, where performance was measured on all test examples.
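To make the concern concrete, here is a small sketch of the kind of check being described. The function name and the length-based criterion are illustrative stand-ins, not the actual filter in dataclass.py:

```python
def count_dropped(src_lines, tgt_lines):
    """Count examples whose summary is longer than the source -- a
    hypothetical stand-in for the filter the question refers to."""
    dropped = 0
    for src, tgt in zip(src_lines, tgt_lines):
        if len(tgt.split()) > len(src.split()):
            dropped += 1
    return dropped

# Toy check: the single-<unk> source with a longer summary would be filtered out.
n_dropped = count_dropped(["a b c d", "<unk>"], ["a b", "some longer summary"])
```

If `n_dropped` is nonzero on the test split, ROUGE computed on the surviving examples is not directly comparable to baselines evaluated on the full test set.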
Hello~ Thanks for your code.
Could you release the scripts for reproducing the results of the machine translation task? Thank you very much~
I saw that the paper uses argmax as the equation to obtain the output sequence.
I understand that this would be the Viterbi algorithm, whose complexity is again O(n).
I'm confused about how this is faster than the autoregressive approach.
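For context, here is a minimal Viterbi decode over precomputed per-position emission scores (a generic sketch, not the repository's implementation). The O(n) loop runs cheap V x V tensor operations, whereas an autoregressive decoder would run a full network forward pass at each of the n steps; in a non-autoregressive model the emissions come from a single parallel forward pass.

```python
import torch

def viterbi_decode(emissions, trans):
    """emissions: (n, V) per-position token scores, computed in ONE
    parallel forward pass; trans: (V, V) transition scores.
    The loop below is O(n) over cheap tensor ops, not n model calls."""
    n, V = emissions.shape
    score = emissions[0]                     # best score ending in each token
    backptr = []
    for t in range(1, n):
        total = score.unsqueeze(1) + trans   # (V, V): score[i] + trans[i, j]
        best, idx = total.max(dim=0)         # best previous token for each j
        score = best + emissions[t]
        backptr.append(idx)
    # Backtrack the highest-scoring path.
    path = [int(score.argmax())]
    for idx in reversed(backptr):
        path.append(int(idx[path[-1]]))
    return list(reversed(path))

tokens = viterbi_decode(torch.randn(5, 4), torch.randn(4, 4))
```

So the argmax/Viterbi step is sequential, but it is sequential over a lightweight dynamic program, not over transformer forward passes.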
I couldn't find code in this repository.
Is there any alternative link to this research?
Hi, thanks for a great paper and code repository!
When looking at your code in the main.py file, I see that you used the regular negative log-likelihood loss, and I couldn't find any reference in the code to the context-aware unlikelihood loss term (with a context window of size c) that is mentioned in the paper.
Can you please point out where in the code this loss is configured? Thanks.
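For reference, a context-aware unlikelihood term in the sense of Welleck et al. (2019) can be sketched as below. The function name, window handling, and shapes are illustrative assumptions, not taken from this repository:

```python
import torch
import torch.nn.functional as F

def context_unlikelihood(logits, targets, c=4):
    """logits: (n, V) token logits; targets: (n,) gold token ids.
    For each position t, penalize probability mass placed on tokens that
    appeared among the previous c gold tokens (excluding the gold token
    at t itself) -- discouraging degenerate repetition."""
    probs = F.softmax(logits, dim=-1)
    n = targets.size(0)
    loss = torch.zeros(())
    for t in range(n):
        prev = set(targets[max(0, t - c):t].tolist()) - {int(targets[t])}
        for tok in prev:
            # -log(1 - p) grows when the model favors a recently seen token
            loss = loss - torch.log(1.0 - probs[t, tok] + 1e-12)
    return loss / n
```

In training, such a term would typically be added to the standard NLL loss with a weighting coefficient.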
Hi! Thank you for sharing your code.
Is it possible for you to share the instructions for reproducing the results for the sentence compression task as well? It would be really helpful.