Comments (9)
Same here.
from edvr.
Yes, we also found that training with DCN is unstable.
We will document the issues we ran into during the competition in this repo later; unstable training is one of them.
There is still a lot we can improve in EDVR, and we are exploring some of it.
During the competition, we trained the large model from smaller ones and used a smaller learning rate for DCN. Even with these tricks, over-large offsets still occurred occasionally. When that happened, we simply resumed training from an earlier normal checkpoint.
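As a rough illustration of the two tricks above (not the competition code), the sketch below builds optimizer parameter groups with a lower learning rate for DCN parameters and flags when training should be rolled back. The `dcn` name keyword, both learning rates, and the offset threshold are assumptions chosen for illustration. The list-of-dicts layout is the per-parameter-options format that PyTorch optimizers accept.

```python
def build_param_groups(named_params, base_lr=1e-4, dcn_lr=5e-5, dcn_keyword="dcn"):
    """Split (name, param) pairs into two groups so DCN layers get a smaller lr.

    All defaults are illustrative assumptions, not values from the EDVR authors.
    """
    dcn_params, other_params = [], []
    for name, param in named_params:
        if dcn_keyword in name.lower():
            dcn_params.append(param)
        else:
            other_params.append(param)
    return [
        {"params": other_params, "lr": base_lr},
        {"params": dcn_params, "lr": dcn_lr},
    ]


def should_roll_back(offset_abs_mean, threshold=100.0):
    """Return True when the mean absolute DCN offset looks abnormally large.

    The threshold is a hypothetical value; tune it for your own data.
    """
    return offset_abs_mean > threshold


if __name__ == "__main__":
    # Stand-in parameters; with PyTorch you would pass model.named_parameters().
    fake_params = [("pcd_align.dcn_pack.weight", "w0"), ("tsa.conv.weight", "w1")]
    print(build_param_groups(fake_params))
    print(should_roll_back(offset_abs_mean=250.0))
```

With PyTorch, the groups could be passed directly as `torch.optim.Adam(build_param_groups(model.named_parameters()))`, and a `True` result from `should_roll_back` would be the cue to reload the last good checkpoint.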
from edvr.
What do you mean when you say that you trained the large model from smaller ones? Is it what your paper describes as "We initialize deeper networks by parameters from shallower ones for faster convergence"?
For instance, we initialize all parameters with kaiming_normal, then freeze the TSA and Reconstruction modules and set requires_grad only in the PCD align and PreDeblur modules.
Thanks for your attention.
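To make the freezing scheme described in this comment concrete, here is a minimal sketch that decides, by name prefix, which parameters stay trainable. The module names (`predeblur`, `pcd_align`, `tsa`, `reconstruction`) and the `trainable_flags` helper are hypothetical; the actual attribute names in the EDVR code may differ.

```python
def trainable_flags(param_names, trainable_prefixes):
    """Map each parameter name to True (train) or False (freeze)."""
    prefixes = tuple(trainable_prefixes)
    return {name: name.startswith(prefixes) for name in param_names}


if __name__ == "__main__":
    # Hypothetical parameter names standing in for model.named_parameters().
    names = [
        "predeblur.conv.weight",
        "pcd_align.dcn.weight",
        "tsa.fusion.weight",
        "reconstruction.block0.weight",
    ]
    flags = trainable_flags(names, ["predeblur", "pcd_align"])
    for name, flag in flags.items():
        print(name, "trainable" if flag else "frozen")
    # Assumed PyTorch usage:
    # for name, p in model.named_parameters():
    #     p.requires_grad = flags[name]
```

Only parameters whose flag is `True` would then receive gradient updates.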
from edvr.
- Yes, we first train shallower ones.
- We will release some models along with the training code to train them from scratch, but their performance is not as good as that of the competition models.
from edvr.
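Thanks for the reply; this is really impressive work and research.
We are trying to first replace the deformable convolutions with regular convolutions and train that to get an initial model, then use this model to train the network.
After that, we freeze some of the model blocks and start training again.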
from edvr.
Actually, DCN is relatively important, so you can first train a small network with DCN (w/o TSA).
We are running these experiments and will release the results as soon as possible.
from edvr.
1. "We trained the large model from smaller ones and used a smaller learning rate for dcn."
Do you mean something like this, for example:
Step 1: 5 front blocks and 10 back blocks with DCN + TSA, lr = 1e-4 (model S, shallow).
Step 2: 5 front blocks and 40 back blocks with DCN + TSA, lr(DCN) = 5e-5 (e.g.), lr(others) = 1e-4. The parameters of S are copied into model D (deep), except for the 30 extra back blocks.
2. "You can first train a small network with DCN (w/o TSA)"
Do you mean that only the DCN needs to be pretrained, and the other parameters after the DCN are not needed (not useful for the deeper model)?
For example, I could train 5 front blocks with DCN, without TSA, and with a very shallow SR network after the DCN.
Then the DCN is pretrained, the parameters after the DCN can be discarded, and I can change the SR network after the DCN however I like?
3. With this pretrained-DCN trick, the final model D cannot have a deeper or wider DCN module than model S (I mean, changed feature extraction layers before the DCN), because the DCN parameters need to be copied. Is that right?
4. For the second step, there are two choices for the DCN: use a smaller lr for it, or freeze the DCN module. The second choice saves a lot of training time and GPU memory. Is it suitable?
@xinntao
from edvr.
We have updated the training code and configs. We provide training scripts for the model with Channel=128, Back RB=10.
The learning rate scheme is different from the one used in the competition, but it is more effective.
- First, train with the script train_EDVR_woTSA_M.yml
- Then, train with the script train_EDVR_M.yml
You can try this.
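The hand-off between the two stages amounts to loading the weights trained without TSA into the full model and leaving the newly added modules at their fresh initialization. The sketch below shows that idea on plain dictionaries standing in for state dicts. It is not the repo's loading code, and the key names and the `transfer_matching_weights` and `shape_of` helpers are hypothetical.

```python
def shape_of(value):
    """Return a tensor-like shape; falls back to the length for plain lists."""
    shape = getattr(value, "shape", None)
    return tuple(shape) if shape is not None else (len(value),)


def transfer_matching_weights(src_state, dst_state):
    """Copy entries from src_state into a copy of dst_state when the key exists
    in both and the shapes agree; everything else keeps its initialization.

    Returns the merged state and the list of keys that were left untouched.
    """
    merged = dict(dst_state)
    skipped = []
    for key, dst_value in dst_state.items():
        src_value = src_state.get(key)
        if src_value is not None and shape_of(src_value) == shape_of(dst_value):
            merged[key] = src_value
        else:
            skipped.append(key)
    return merged, skipped


if __name__ == "__main__":
    # Toy stand-ins: stage 1 has no TSA weights, stage 2 adds them.
    stage1 = {"pcd_align.weight": [0.1, 0.2], "recon.0.weight": [0.3]}
    stage2 = {"pcd_align.weight": [0.0, 0.0], "tsa.weight": [0.0], "recon.0.weight": [0.0]}
    merged, skipped = transfer_matching_weights(stage1, stage2)
    print(merged)
    print("left at initialization:", skipped)
```

With PyTorch, `src_state` would come from `torch.load(...)` on the first-stage checkpoint, `dst_state` from `model.state_dict()`, and the merged dictionary would be passed to `model.load_state_dict(merged)`.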
from edvr.
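"Thanks for the reply; this is really impressive work and research.
We are trying to first replace the deformable convolutions with regular convolutions and train that to get an initial model, then use this model to train the network.
After that, we freeze some of the model blocks and start training again."
Did you succeed? How well did it work?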
from edvr.