Comments (4)
Actually, if you look at the definition of SDS, you will see that it is important that you use the noised generated output. That's because you can then interpret the loss in a nice way; it simplifies to predicting the noise at timestep t.
I think the intuition is that we need the generated output to be in the distribution of the teacher model. Using the noised student output just seems to give a better objective, at least mathematically: there is a ground-truth noise to compare against.
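To make the "ground truth noise" point concrete, here is a minimal NumPy sketch of an SDS-style residual. It is not the repository's code: `sds_residual`, `teacher_eps_fn`, and `oracle_teacher` are hypothetical names, and the forward process is assumed to be the usual `x_t = sqrt(alpha_bar_t) * x + sqrt(1 - alpha_bar_t) * eps`.

```python
import numpy as np

def noise_to_timestep(x, eps, alpha_bar_t):
    # Assumed forward process: x_t = sqrt(alpha_bar_t) * x + sqrt(1 - alpha_bar_t) * eps
    return np.sqrt(alpha_bar_t) * x + np.sqrt(1.0 - alpha_bar_t) * eps

def sds_residual(student_output, teacher_eps_fn, alpha_bar_t, rng, weight=1.0):
    """Noise the student output and compare the teacher's noise prediction
    against the noise that was actually added (the ground truth)."""
    eps = rng.standard_normal(student_output.shape)       # known, sampled by us
    x_t = noise_to_timestep(student_output, eps, alpha_bar_t)
    eps_hat = teacher_eps_fn(x_t, alpha_bar_t)            # hypothetical teacher call
    return weight * (eps_hat - eps), eps

# Toy check: an oracle teacher that recovers the added noise exactly.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4))                           # stand-in for a student output

def oracle_teacher(x_t, alpha_bar_t):
    # Only possible here because the clean sample x is known in this toy setup.
    return (x_t - np.sqrt(alpha_bar_t) * x) / np.sqrt(1.0 - alpha_bar_t)

residual, eps = sds_residual(x, oracle_teacher, alpha_bar_t=0.5, rng=rng)
```

In this toy setup the residual vanishes when the teacher predicts the added noise perfectly, which is the sense in which the noised student output gives a target to compare against.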
from generative-models.
One plausible reason is that the timesteps s and t are sampled independently for the student and teacher models, respectively. However, this is not a hard obstacle.
The simpler explanation is that it inherits from score distillation sampling (SDS), which originates in the 3D generation domain (see DreamFusion), where the inputs to the teacher (image model) and student (3D model producing differentiable 2D rendering) differ vastly.
This raises the open question of whether feeding the noised original image rather than the noised student output will lead to better or worse results (whether from a final loss or loss curve perspective).
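The two candidate teacher inputs can be sketched side by side. This is only an illustration under assumptions: `make_teacher_input` and the `noise_student_output` flag are hypothetical, the linear beta schedule is a placeholder, and only the teacher side (timestep t) is modeled.

```python
import numpy as np

def make_teacher_input(original_image, student_output, alpha_bar_t, rng,
                       noise_student_output=True):
    """Return (x_t, eps) for the teacher, noising either the student output
    (SDS-style) or the original image (the alternative raised above)."""
    source = student_output if noise_student_output else original_image
    eps = rng.standard_normal(source.shape)
    x_t = np.sqrt(alpha_bar_t) * source + np.sqrt(1.0 - alpha_bar_t) * eps
    return x_t, eps

# Placeholder schedule (an assumption, not taken from the repository).
betas = np.linspace(1e-4, 0.02, 1000)
alpha_bar = np.cumprod(1.0 - betas)

data_rng = np.random.default_rng(0)
original = data_rng.standard_normal((2, 4))                   # stand-in for a dataset image
student = original + 0.1 * data_rng.standard_normal((2, 4))   # stand-in for a student output

t = 500                                                       # teacher timestep, sampled independently of s
x_t_student, eps_a = make_teacher_input(
    original, student, alpha_bar[t], np.random.default_rng(1), noise_student_output=True)
x_t_original, eps_b = make_teacher_input(
    original, student, alpha_bar[t], np.random.default_rng(1), noise_student_output=False)
```

With the same noise draw, the two inputs differ only by `sqrt(alpha_bar_t) * (student - original)`, so the gap between the variants shrinks as t moves toward the high-noise end of the schedule.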
rather than the same noise inputs of student nets
Note that your original suggestion is invalid, due to the differing choice of noise sampling.
What is valid is the question of which input to noise.
@qp-qp I'm not sure if you're privileged to share, but I think this is an interesting question that is worth shedding light on.
I have a feeling that feeding the original image may lead to degenerate results, as it simply amplifies the original dataset.
If the teacher model is perfectly faithful on the dataset, you would reproduce training on the original dataset.
Perhaps what is beneficial about distillation, i.e., feeding the generated output, is that it generates new diversity for the teacher model to provide feedback on.
However I am uncertain if any of these thoughts are valid.