Comments (2)
Thank you, 강열!
-
This is to stay a little safer from vanishing gradients. Consider a pixel where the ground truth value is 1.0. Then, without such a range shift, the network will be trained to emit a large positive `output` value so that `tanh(output) ≈ 1.0`. The danger is that `output` can become so large that the gradient of tanh at `output` is too close to zero, hampering training. After moving the range, gradients at extreme ground truth values become adequate while the output values are still limited.

I've been using this trick since my very first experiments. If you remove it, I'm sure everything will still work. After all, many people successfully train generative nets with plain tanh. But I just haven't checked.
Moreover, I think you could even use a plain linear activation (no sigmoid, tanh, or anything) instead of this trick, and everything would still work.
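
To make the gradient argument concrete, here is a minimal numeric sketch. It is not the repository's code: the exact range shift used in latent-pose-reenactment is not shown in this thread, so the stretched activation `scale * tanh(x)` and the value `scale=1.1` are assumptions chosen only for illustration, as are all function names.

```python
import math

def tanh_grad(x):
    """Derivative of tanh at x: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2

def grad_at_target(target, scale=1.0):
    """Gradient of scale * tanh(x) w.r.t. x at the x where the output equals target.

    Requires |target| < scale, otherwise no finite pre-activation exists.
    `scale` is a hypothetical stand-in for the range shift discussed above.
    """
    x = math.atanh(target / scale)   # pre-activation that produces `target`
    return scale * tanh_grad(x)

# Plain tanh: reaching an output near 1.0 pushes x far into the flat region.
plain = grad_at_target(0.999, scale=1.0)

# Stretched range (assumed scale of 1.1): the same target sits on a steeper part.
stretched = grad_at_target(0.999, scale=1.1)

# With the stretched range, even a target of exactly 1.0 has a finite
# pre-activation and a usable gradient.
at_one = grad_at_target(1.0, scale=1.1)

print(f"plain tanh gradient at 0.999:     {plain:.6f}")
print(f"stretched tanh gradient at 0.999: {stretched:.6f}")
print(f"stretched tanh gradient at 1.0:   {at_one:.6f}")
```

Under these assumptions the gradient at a near-extreme target is roughly two orders of magnitude larger with the stretched range, while the output stays bounded by `scale`. With plain tanh, a target of exactly 1.0 has no finite pre-activation at all, which is the situation the answer above describes.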
-
It worked in Zakharov et al., so we borrowed it as-is from there to save on experiments. Today we realize that your concern is indeed quite reasonable, and that batch normalization (BN) or instance normalization (IN) layers are likely to improve the system.
from latent-pose-reenactment.
Thank you for your kind answer!