diffae's People

Contributors

chenxwh, nessessence, phizaz

diffae's Issues

Some error in the evaluation stage

Thanks for your amazing work! I am trying to train on a customized dataset. After two days of training, I got an error in the LPIPS calculation; the traceback is as follows:

File "/share_graphics_ai/linminxuan/Workspace/diffusion-models/diffae/experiment.py", line 938, in train
  trainer.fit(model)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 552, in fit
  self._run(model)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 917, in _run
  self._dispatch()
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 985, in _dispatch
  self.accelerator.start_training(self)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/accelerators/accelerator.py", line 92, in start_training
  self.training_type_plugin.start_training(trainer)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 161, in start_training
  self._results = trainer.run_stage()
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 995, in run_stage
  return self._run_train()
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 1044, in _run_train
  self.fit_loop.run()
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/loops/base.py", line 111, in run
  self.advance(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/loops/fit_loop.py", line 200, in advance
  epoch_output = self.epoch_loop.run(train_dataloader)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/loops/base.py", line 111, in run
  self.advance(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 150, in advance
  "on_train_batch_end", processed_batch_end_outputs, batch, self.iteration_count, self._dataloader_idx
File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 1226, in call_hook
  output = hook_fx(*args, **kwargs)
File "/share_graphics_ai/linminxuan/Workspace/diffusion-models/diffae/experiment.py", line 431, in on_train_batch_end
  self.evaluate_scores()
File "/share_graphics_ai/linminxuan/Workspace/diffusion-models/diffae/experiment.py", line 622, in evaluate_scores
  lpips(self.model, '')
File "/share_graphics_ai/linminxuan/Workspace/diffusion-models/diffae/experiment.py", line 611, in lpips
  latent_sampler=self.eval_latent_sampler)
File "/share_graphics_ai/linminxuan/Workspace/diffusion-models/diffae/metrics.py", line 111, in evaluate_lpips
  latent_sampler=latent_sampler)
TypeError: render_condition() got an unexpected keyword argument 'latent_sampler'

There is no "latent_sampler" keyword argument in the "render_condition" function; I guess latent_sampler should only be used in the "render_uncondition" case. Should I delete this key?
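
That reading seems consistent with the traceback. A minimal workaround sketch for metrics.py (the argument names here are guesses based on the traceback, not the exact repo code) would be to pass latent_sampler only on the unconditional path:

# sketch only; the real call sites in metrics.py may differ
if cond is not None:
    pred = render_condition(conf, model, x_T, sampler=sampler, cond=cond)
else:
    pred = render_uncondition(conf, model, x_T, sampler=sampler,
                              latent_sampler=latent_sampler)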

Please provide a license

Hi, congratulations on your excellent work!
I wonder whether you can provide a license for your code.
Thanks.

Error "AssertionError: 32 != 4" while training

Epoch 1: 14% 2500/17500 [51:03<5:06:23, 1.23s/it, loss=0.0144, v_num=]Traceback (most recent call last):
File "/content/diffae/run_ffhq256.py", line 10, in
train(conf, gpus=gpus, nodes=nodes)
File "/content/diffae/experiment.py", line 937, in train
trainer.fit(model)
File "/usr/local/lib/python3.9/dist-packages/pytorch_lightning/trainer/trainer.py", line 608, in fit
call._call_and_handle_interrupt(
File "/usr/local/lib/python3.9/dist-packages/pytorch_lightning/trainer/call.py", line 38, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/usr/local/lib/python3.9/dist-packages/pytorch_lightning/trainer/trainer.py", line 650, in _fit_impl
self._run(model, ckpt_path=self.ckpt_path)
File "/usr/local/lib/python3.9/dist-packages/pytorch_lightning/trainer/trainer.py", line 1112, in _run
results = self._run_stage()
File "/usr/local/lib/python3.9/dist-packages/pytorch_lightning/trainer/trainer.py", line 1191, in _run_stage
self._run_train()
File "/usr/local/lib/python3.9/dist-packages/pytorch_lightning/trainer/trainer.py", line 1214, in _run_train
self.fit_loop.run()
File "/usr/local/lib/python3.9/dist-packages/pytorch_lightning/loops/loop.py", line 199, in run
self.advance(*args, **kwargs)
File "/usr/local/lib/python3.9/dist-packages/pytorch_lightning/loops/fit_loop.py", line 267, in advance
self._outputs = self.epoch_loop.run(self._data_fetcher)
File "/usr/local/lib/python3.9/dist-packages/pytorch_lightning/loops/loop.py", line 199, in run
self.advance(*args, **kwargs)
File "/usr/local/lib/python3.9/dist-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 230, in advance
self.trainer._call_lightning_module_hook("on_train_batch_end", batch_end_outputs, batch, batch_idx)
File "/usr/local/lib/python3.9/dist-packages/pytorch_lightning/trainer/trainer.py", line 1356, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "/content/diffae/experiment.py", line 429, in on_train_batch_end
self.log_sample(x_start=imgs)
File "/content/diffae/experiment.py", line 569, in log_sample
do(self.model, '', use_xstart=True, save_real=True)
File "/content/diffae/experiment.py", line 498, in do
gen = self.eval_sampler.sample(model=model,
File "/content/diffae/diffusion/base.py", line 208, in sample
return self.ddim_sample_loop(model,
File "/content/diffae/diffusion/base.py", line 735, in ddim_sample_loop
for sample in self.ddim_sample_loop_progressive(
File "/content/diffae/diffusion/base.py", line 795, in ddim_sample_loop_progressive
out = self.ddim_sample(
File "/content/diffae/diffusion/base.py", line 600, in ddim_sample
out = self.p_mean_variance(
File "/content/diffae/diffusion/diffusion.py", line 96, in p_mean_variance
return super().p_mean_variance(self._wrap_model(model), *args,
File "/content/diffae/diffusion/base.py", line 307, in p_mean_variance
model_forward = model.forward(x=x,
File "/content/diffae/diffusion/diffusion.py", line 153, in forward
return self.model(x=x, t=do(t), t_cond=t_cond, **kwargs)
File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/content/diffae/model/unet_autoenc.py", line 149, in forward
assert len(x) == len(x_start), f'{len(x)} != {len(x_start)}'
AssertionError: 32 != 4
Epoch 1: 14%|█▍ | 2500/17500 [51:21<5:08:09, 1.23s/it, loss=0.0144, v_num=]

I got this error while trying to resume training ffhq256. How do I fix it?

Train AutoEncoder Only

Hi,
Can we train the autoencoder only, by fixing the ddim?
I want to train an autoencoder on a feature vector of size 64x64x256 and expect to get the z_sem which can work with the pretrained ddim.
The feature vector was generated from an image using a different U-Net architecture. The feature vector contains all the information of the original image, since we can easily map it back to the original image using the decoder of that U-Net model.
Now using the original image, I got the z_sem from the pre-trained diffae autoencoder, which can be used as a ground truth.
Is there a way to train only the autoencoder with the feature vector and the ground-truth z_sem?

Question about AdaGN with conditioned Z

Hello! Nice work!
May I ask a relatively stupid question about HOW $z_{sem}$ is added to the UNet?
Let's say h is the previous layer's output. In your paper, the $z_{sem}$ conditioning looks like:

$out = Affine(z_{sem}) * (h * MLP_1(\phi _1 (t) ) + MLP_2 (\phi _2 (t)) )$

That seems quite weird! Why choose a multiplication here?

I came up with a different reading of the $z_{sem}$ conditioning:

$temp = (h * MLP_1(\phi _1 (t) ) + MLP_2 (\phi _2 (t)) )$

$Affine(z_{sem})=s, c$

$out = s * temp + c$

Is this understanding right?

I don't know what the blue "times" in appendix Figure 7 (a) means. I suspect it is not a multiplication?
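
For reference, the formula in the paper (as far as I recall) is AdaGN(h, t, z_sem) = z_s · (t_s · GroupNorm(h) + t_b), with (t_s, t_b) produced from the timestep embedding and z_s = Affine(z_sem), i.e. z_sem contributes a multiplicative, channel-wise scale only. A minimal PyTorch sketch of that reading (layer names and sizes here are illustrative assumptions, not the repo's exact code):

import torch
from torch import nn

class AdaGN(nn.Module):
    # out = z_s * (t_s * GroupNorm(h) + t_b)
    def __init__(self, channels, t_dim, z_dim, groups=32):
        super().__init__()
        self.norm = nn.GroupNorm(groups, channels)
        self.time_mlp = nn.Linear(t_dim, 2 * channels)   # -> (t_s, t_b)
        self.z_affine = nn.Linear(z_dim, channels)       # -> z_s (scale only)

    def forward(self, h, t_emb, z_sem):
        t_s, t_b = self.time_mlp(t_emb).chunk(2, dim=1)  # each [B, C]
        z_s = self.z_affine(z_sem)                       # [B, C]
        h = self.norm(h)
        h = t_s[..., None, None] * h + t_b[..., None, None]
        return z_s[..., None, None] * h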

FFHQ Custom class finetuning

Hi Konpat and Diffae team,

Thank you so much for your work! I am trying to create a new class for head pose (actually three values: pitch, yaw, roll) and fine-tune it on the FFHQ model. It's not a binary classification, so I assume the d2c method of having positives and negatives won't work (I have concrete values for each); I need to switch the training loss function from binary cross-entropy to MSE. Do I understand correctly that I need to fine-tune the ffhq256_autoenc/latent.ckpt checkpoint? Is that the checkpoint of the encoder?

Thanks,
Richard

How to get Figure 10 in the paper?

Could you give an example of how to reproduce Figure 10 in the paper?
From my understanding, the stochastic subcode is randomly sampled from the DDIM.
Could you show the code?
Looking forward to it.

EMA model lagging behind

Thanks so much for the awesome work!
After around 1 million images my model samples already look pretty good, but the EMA samples are still very noisy (FID of 450 vs. 76). Does this make sense?
I thought the EMA model is generally supposed to give better and more stable results.
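
For context, a standard EMA update looks like the sketch below (a generic formulation, not necessarily this repo's exact code); with a decay close to 1 the EMA weights need on the order of 1/(1 - decay) updates to catch up, so very early in training the EMA samples can indeed lag far behind the online model.

import torch

@torch.no_grad()
def ema_update(ema_model, model, decay=0.9999):
    # ema_param <- decay * ema_param + (1 - decay) * param
    for p_ema, p in zip(ema_model.parameters(), model.parameters()):
        p_ema.mul_(decay).add_(p, alpha=1 - decay)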

Request for Download Link

Hi,
Could you please provide the download link for DiffAE (with latent DPM, can sample): [FFHQ256]? Just like the links you provided in the DiffAE Manipulation demo for Colab (simplified version).ipynb.
Thank you!

Nice work! About "Predictive power of the semantic subcode".

Hi, Nice work!!🎉🎉
Do you have a plan to release the evaluation code for "Predictive power of the semantic subcode" (Table 8 in the paper)?
I don't know how to reproduce this result.
I would appreciate it if you could provide the code!!
Thank you!

About Stochastic encoder

Thanks for your excellent work! It is very inspiring!
I have a question about the stochastic encoder.
Equation 8 in the paper is described as the reverse of Equation 1. Equation 8 uses the U-Net ϵθ(x_t, t, z) trained during training to generate x_{t+1} from x_t. However, as far as I can see, ϵθ was trained for denoising, i.e. to go from x_t to x_{t-1}.
More specifically, ϵθ is used to predict the noise that already exists in x_t; why does the stochastic encoder use this currently-predicted noise to map the picture to the latent space?
Thanks for your answering!

How does the model ensure semantic information goes to z rather than x_t?

Dear diffae team,

Thank you for sharing this great work, I really enjoy it.

I played with the model a bit. When I fix z and randomly sample x_t, the output images are almost the same, with some small variances. How does the model ensure that semantic information goes to z rather than x_t (the starting Gaussian noise)? Maybe because the model is trained with a reconstruction loss, z is the only source of target-image information, and x_t is randomly sampled even for the same target image, so the model learns to 'ignore' x_t and x_t therefore contains little semantic information.

In this repo, Justin takes the UNet from Stable Diffusion and the image encoder from CLIP, and trains the model using an image reconstruction loss. If my intuition in the last paragraph is correct, then given the same image as input and a randomly sampled x_t, the output images should be almost the same. But this is not the case in Justin's model: changing x_t causes a large change (including semantic change) in the output image. This means the x_t in Justin's model contains more semantic information than the x_t in the DiffAE model. Could you tell me why this is happening?

Thank you for your help.

Best Wishes,

Zongze

Question about using the Latent DDIM to Generate $z_{sem}$

Hello DiffAE team, thank you for this work. It's absolutely amazing.
I've been playing around with the codebase, and seem to be lost with how to generate the latent z_sem on its own.

I've been using the third checkpoint FFHQ256:

DiffAE (with latent DPM, can sample): FFHQ256

and seem to be a bit lost with how the Lit (Lightning) module operates. If I'm not mistaken, this checkpoint should allow sampling z_sem directly, like how an image is sampled unconditionally in a vanilla DPM, but I can't seem to find the right method to call.

Also, this checkpoint should allow calling the model.render() function, but I'm hit with a NotImplementedError.

Am I missing something about the module itself? It would be great if I could just use conf = ffhq256_autoenc_latent() to build the model, sample z_sem directly, and also play around with the model.render() function.

Any type of help would be much appreciated.

Cheers,

Tom

Inference DiffAE as AutoEncoder

Hi, thank you for sharing this nice work.

Can you share some example code for how to use DiffAE as an autoencoder?

Some .ipynb file would be great.

Thanks.
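
For anyone landing here: the repo's autoencoding.ipynb (referenced elsewhere in this thread) covers this. A rough sketch of the flow, with module and method names written from memory, so treat them as assumptions and check them against the notebook:

import torch
from templates import ffhq256_autoenc
from experiment import LitModel

conf = ffhq256_autoenc()
model = LitModel(conf)
state = torch.load('checkpoints/ffhq256_autoenc/last.ckpt', map_location='cpu')  # hypothetical path
model.load_state_dict(state['state_dict'], strict=False)
model.ema_model.eval()

# img: a [B, 3, 256, 256] tensor normalized to [-1, 1]
cond = model.encode(img)                        # semantic code z_sem
xT = model.encode_stochastic(img, cond, T=250)  # stochastic subcode
rec = model.render(xT, cond, T=20)              # reconstruction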

Deterministic Reconstruction Error

Upon reading the paper, specifically the part about the deterministic reconstruction with ddim,
I cannot seem to understand why the reconstruction is not exact.
You talk about exact reconstruction and also mention reconstruction loss.
Why, if the formulas are deterministic and the output of the network
is also deterministic, is the reconstruction not exact?
Thanks in advance, Anthony Mendil.

About fine-tuning diffae

Thanks for the excellent work.
I want to train diffae on my dataset. Due to equipment and time constraints, I want to fine-tune from the model you released instead of training from scratch. Have you tried fine-tuning a model you have already trained on another dataset? Do you think this is feasible? If so, taking ffhq256 as an example, how many samples would I need for fine-tuning?

stochastic subcode xT

Dear diffae group,

Thank you for sharing this great work.

In Section 3.1, you mention that 'for training, the stochastic subcode xT is not needed.' Do you mean x_T is frozen during training? From time T to T-1, we need x_T according to Equation 6, right?

Thank you for your help.

Best Wishes,

Zongze

no definition for ModelCheckpoint

no definition for ModelCheckpoint:

checkpoint = ModelCheckpoint(dirpath=f'{conf.logdir}',
                                 save_last=True,
                                 save_top_k=1,
                                 every_n_train_steps=conf.save_every_samples //
                                 conf.batch_size_effective)
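
ModelCheckpoint is PyTorch Lightning's checkpoint callback; if it is undefined at this point, importing it at the top of experiment.py should resolve the NameError:

from pytorch_lightning.callbacks import ModelCheckpoint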

About the performance of running "sbatch run_ffhq256.py"

Thanks for the excellent work.
I have tried to run sbatch run_ffhq256.py with 1 node and 8 GPUs.
Compared to the released model last.ckpt,
I found some differences in performance.

My evaluation code was based on https://github.com/phizaz/diffae/blob/master/autoencoding.ipynb, with a change in how xT is obtained.

xT = torch.randn(len(cond),3,conf.img_size,conf.img_size,device=cond.device)

For the released model, I got:
[image]

For my trained model, I got:
[image]

The difference is even larger when I test both on other images (a different dataset).

For the released model, I got:
[image]

For my trained model, I got:
[image]

It seems the released model is very robust across different situations, but my version is not.

Did I make a mistake anywhere in the code?

reconstruct blur/noise image

Thank you for your work.
Can I use the autoencoder to remove noise/blur from images?
In the paper I can see the following example:

[screenshot from the paper]
So can I do something similar:
insert a noisy image and get a reconstructed, denoised image?
Thank you.

about datasets

Hi, thanks for your great work. When I try to train the model using python run_ffhq128.py, it shows me an error like:
[image]
My file tree is as follows:
[image]
Could you please help me find out what's wrong with my file tree?
Thanks a lot, and I hope for your reply.

Sampling without noise during training

First of all, thanks for your great work!
I am trying to understand how to properly use deterministic noising and denoising.
From your code (in particular the file diffae/diffusion/base.py) I gather that
there are functions to do this:
ddim_reverse_sample_loop for the deterministic noising and ddim_sample_loop for
the deterministic denoising.
However, it seems like these are never used during training but only after training.

For the noising during training, only the function q_sample is used, and the noise
parameter is always set to None, so that random Gaussian noise is added.
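
For reference, what q_sample computes is the standard closed-form forward noising; training only needs a random (x_t, t, noise) triple per step, which is why the deterministic loops are not used there. A generic DDPM sketch (not the repo's exact code):

import torch

def q_sample(x0, t, alphas_cumprod, noise=None):
    # x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise
    if noise is None:
        noise = torch.randn_like(x0)  # this is the non-deterministic part
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
    return a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise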

So it seems like during training the noising process is not deterministic, but then after training
you use the deterministic functions mentioned before.

Is this observation correct? I find it hard to understand the reason for this difference.
Is it desired that the generation is non-deterministic during training and only deterministic after training?

More precisely, I was expecting that the ddim_reverse_sample_loop would also be used
during training so that the latent variables of the same images are also equal during training
(due to the deterministic property of ddim).

I would very much appreciate it if you could clear up my confusion.
Thanks in advance, Anthony Mendil.

network architecture

Dear Diffae team,

Thank you for sharing this great work, I really enjoy it.

I understand that the UNet architecture you used is based on the guided diffusion model. Unfortunately, they do not provide a figure visualizing the network architecture, so it is very hard for me to understand it. Would you mind providing a figure explaining the structure of the UNet you used? From the code, the UNet seems to contain 3 blocks (input, middle, output).

Thank you for your help.

Best Wishes,

Zongze

How many epochs required for training?

When digging into your code, I found that training is based on a total number of iterations (let's call it max_steps). Based on your code in experiment.py, it is computed as max_steps = conf.total_samples // conf.batch_size_effective.

total_samples is predefined in templates.py (e.g. 130_000_000 for ffhq128) and batch_size_effective is set to 128 by default. For this example, max_steps = 1_015_625. As FFHQ128 includes 70,000 samples, the number of required epochs is 1_015_625 / (70_000 / 128) ≈ 1857 (that is a huge number of epochs to train :( )

Could you let me know whether I am correct?
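
The arithmetic checks out; as a quick sanity check:

total_samples = 130_000_000        # value from the templates for ffhq128
batch_size_effective = 128
dataset_size = 70_000              # FFHQ

max_steps = total_samples // batch_size_effective       # 1_015_625
steps_per_epoch = dataset_size / batch_size_effective    # ~546.9
print(max_steps, max_steps / steps_per_epoch)            # ~1857 epochs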

A question about adding z_sem to the conditional DDIM

Dear diffae team,

Thank you for your great work. I have a question about an implementation detail.
I am wondering how you add the encoded z_sem into the conditional DDIM.
[image]
1. Does it mean you first feed z_sem into a linear layer to change its shape and then use it to multiply (the bottleneck output + timestep embedding)?

[image]
2. Or is it concat(output, z_sem), like cGANs do?
3. If it is not case 2, what is concatenated in the middle?
I read the code and tried to find where you define the model.render function and which part of the code does this, but I didn't find it.
I would appreciate it if you could kindly tell me where this code is.

Best Regards

Invert an image to latent space and recover it

I have trained a model with my own dataset. Can I do latent inversion on my own dataset?

  1. Invert an image into the latent space.

  2. Then reconstruct the same image from the latent vector.

Can anyone help me with how to do this?

Thanks.

Best wishes.

Training head rotation classifier

Hi Konpat and team of DiffAE,

I'm still trying to train a head rotation classifier, using data from FFHQ-Aging (they labeled the head pose).
I am using:

  • a 1-layer linear regression from cond to head pose (512 → 3)
  • MSE loss with a learning rate of 0.001
  • training data: 1200 images and head poses
  • a batch size of 128 (tried everything from 16 to 512)
  • normalized cond, but not the head-pose outputs. Would you recommend normalizing the outputs as well?
  • accumulate_grad_batches=7
  • gradient_clip_val=0.5
  • 300 epochs

For some reason, the loss keeps being stuck around 65, which seems too large. Do you have any idea what I am doing wrong and how I can improve this loss? Would you recommend using more data (perhaps all 70k images)? In your examples, 1k labeled images were enough, so I'm not sure increasing the size of the training data will help.
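
For concreteness, a minimal sketch of the probe described above (hyperparameters are simply the ones listed in this issue, not a recommendation). Note that with unnormalized angle targets in degrees, an MSE around 65 corresponds to roughly an 8-degree RMSE per axis, so normalizing the targets mainly changes the scale of the number rather than the quality of the fit.

import torch
from torch import nn

probe = nn.Linear(512, 3)          # cond (z_sem) -> (pitch, yaw, roll)
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_step(cond, pose):
    # cond: [B, 512] normalized z_sem, pose: [B, 3] head-pose targets
    opt.zero_grad()
    loss = loss_fn(probe(cond), pose)
    loss.backward()
    opt.step()
    return loss.item()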

Many thanks,
Rich

Training autoenc and Training latent DDM

Might I ask a stupid question: what is the difference between training the autoenc and training the latent DDM? As far as I understand, these two are trained at the same time. Can you enlighten me a little bit?

What's the loss to train the semantic-autoenc?

Dear diffae team,

Thank you for your great work. I have a question about an implementation detail.
I'm a little lost on the loss used to train the autoencoder.
From my understanding, the encoder's input is an image and its output is a vector containing semantic information.
I see the L1 loss in the templates:
[image]
So what are the two terms used to calculate this loss? Maybe one is the z_sem produced by the encoder, but what is the other term?
Or does the encoder not have its own loss, and is it just trained jointly with the DDIM, backpropagating the DDIM's loss?
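
For what it's worth, my reading is the latter: the encoder has no separate target and is trained end-to-end through the conditional DDIM's noise-prediction loss (the two items compared are the predicted noise and the noise actually added in the forward process), so gradients reach the encoder only through its conditioning. A hedged sketch with an assumed unet(x_t, t, z_sem) call signature; whether the pixel-space loss is L1 or MSE depends on the config:

import torch
import torch.nn.functional as F

def diffae_train_loss(encoder, unet, x0, alphas_cumprod):
    z_sem = encoder(x0)                                    # semantic code from the image
    t = torch.randint(0, len(alphas_cumprod), (x0.size(0),), device=x0.device)
    noise = torch.randn_like(x0)
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise   # forward noising
    eps_pred = unet(x_t, t, z_sem)                         # conditional DDIM predicts the noise
    return F.l1_loss(eps_pred, noise)                      # predicted noise vs. actual noise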

DiffAE without sampling + other questions

Hello, thanks for the great work, it is quite interesting. I have a couple questions

  • If we are not interested in the sampling ability of DiffAE, and merely in reconstruction, it is sufficient to just train DiffAE without the latent DPM (commented as 'train the autoenc model' in some of the provided training scripts), correct?
  • I saw in another issue (although I can't find it anymore) that you performed some experiments on regularizing z_sem. Did you observe any performance issues besides a less meaningful and interpretable semantic space?
  • I'm a bit confused about how the semantic encoder is trained: is a reconstruction loss simply calculated between the input and output images, with the gradient passed through the U-Net and into the semantic encoder?
  • I don't see any problems in doing this, but I am wondering if you have any comments about training DiffAE within some feature space (image encodings, for example).

Thank you for your input. If you have time constraints, the first two questions are my largest interests, although I am curious about your thoughts on all of them :)

generative process backward deterministically to obtain the noise map xT

Thanks for the excellent work.
I am a beginner in diffusion models and have recently come into contact with them. I saw this content in your paper.
With DDIM, it is possible to run the generative process backward deterministically to obtain the noise map xT, which represents the latent variable or encoding of a given image x0. In this context, DDIM can be thought of as an image decoder that decodes the latent code xT back to the input image. This process can yield a very accurate reconstruction; however, xT still does not contain high-level semantics as would be expected from a meaningful representation.

I also saw the part about the DDIM reverse in your code. If I just want to get xT from x0, can a regular DDIM do it? Does it require special training or model adjustments?
I would very much appreciate it if you could clear up my confusion.
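
For reference, no special training is needed: DDIM inversion reuses the same pretrained noise predictor and simply runs the deterministic update forward in t. A generic sketch of one reverse step (not the repo's exact implementation):

import torch

@torch.no_grad()
def ddim_reverse_step(eps_model, x_t, t, t_next, alphas_cumprod):
    # predict x0 from the current sample, then re-noise it deterministically to t_next > t
    a_t = alphas_cumprod[t]
    a_next = alphas_cumprod[t_next]
    eps = eps_model(x_t, t)
    x0_pred = (x_t - (1 - a_t).sqrt() * eps) / a_t.sqrt()
    return a_next.sqrt() * x0_pred + (1 - a_next).sqrt() * eps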

Reconstruct an image with only z_{sem}, with x_T sampled from N(0, I)

Hi, thanks for sharing this nice work.

Could you share some example code for how to reconstruct images with DiffAE when only z_{sem} is encoded from the original images and x_T is sampled from N(0, I) for decoding?

It's probably just a small change to autoencoding.ipynb, but I ran into some problems when trying to do it.

Thanks a lot.
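
A minimal sketch of that change, reusing the helper names from autoencoding.ipynb (written from memory, so double-check them) and the torch.randn line quoted in another issue above:

import torch

cond = model.encode(img)                        # keep only the semantic code z_sem
xT = torch.randn(len(cond), 3, conf.img_size, conf.img_size,
                 device=cond.device)            # replace encode_stochastic with pure noise
pred = model.render(xT, cond, T=100)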

Missing evaluate_interpolate_fid

In experiment.py, the function intp_fid calls the evaluate_interpolate_fid method, which doesn't exist in the publicly available repo; an error is also thrown because .is_interpolate() is not part of the TrainMode class. I guess this was not intended.
Adding def is_interpolate(self): return None does fix the run error, but I am putting it out here.
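
The workaround mentioned above, written out as a stub on the TrainMode class (returning False instead of None also works and may read more clearly; either avoids the AttributeError):

def is_interpolate(self):
    # stub: interpolation evaluation is not part of the public repo
    return False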

Changing dimension of z_sem

Thanks for open-sourcing this repo!

I wanted to confirm which config parameters need to be changed in order to test different sizes for z_sem in the CelebA experiment. Am I right in assuming I need to change the following 3 lines?

  1. TrainConfig -> net_beatgans_embed_channels
    net_beatgans_embed_channels: int = 512
  2. TrainConfig -> style_ch
    style_ch: int = 512
  3. autoenc_base -> conf.net_beatgans_embed_channels
        conf.net_beatgans_embed_channels = 512

Am I missing any other locations that need to be changed (e.g., for ddpm) or am I changing too many config parameters?

When I tried changing only TrainConfig -> net_beatgans_embed_channels I ran into tensor multiplication dim errors for training the latent DDIM.
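
A hedged sketch of the idea: pick the new size once and apply it everywhere that currently hard-codes 512, since the encoder output, the UNet conditioning width, and the latent DDIM input size all have to agree (only the attributes listed above are taken from the repo; the rest is an assumption):

z_dim = 256                                  # hypothetical new z_sem size

conf.net_beatgans_embed_channels = z_dim     # conditioning / embedding width
conf.style_ch = z_dim                        # encoder output size
# any latent DDIM trained on top must also be built for z_dim-sized vectors,
# which is the likely source of the dim-mismatch errors mentioned above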

torchmetrics version

The pytorch-lightning package doesn't pin a specific version of torchmetrics.
This causes import errors, since torchmetrics moves classes around.

The lightning version specified in requirements.txt works with torchmetrics==0.6.2.

FFHQ1024

Dear DiffAE team,

Thank you for sharing this great work, I really like it.

Have you tried to train a model on the FFHQ dataset at high resolution (512, 1024)? Do you have plans to release high-resolution checkpoints?

Thank you again for your help.

Best Wishes,

Zongze

Render method is non-differentiable

Hi! Thanks for your amazing work. I find diff-ae quite interesting and am trying to apply it to my research as well. However, I found an issue that makes it impossible to wrap an outer optimization loop around the diff-ae sampler used as a simple image generator (similar to, e.g., a GAN model). In particular, I'm interested in manipulating the attributes with a beta tensor where beta.requires_grad = True, like this:

cond2 = cond * beta + cond_new * (1 - beta)

Note that cond and cond_new are calculated beforehand without gradients. The problem I faced is that when I generate an image from this cond2, which carries gradients, in this way:

gen_img = model.render(xT, cond2, T=100)

At this step, no gradients can flow back through model.render(), so the optimization loop that is supposed to update beta cannot update anything.

So, my question is: why is model.render() not differentiable, and is there any option that allows gradients to flow back through this method?

Image reconstruction problem

Hello, thanks for your code and pre-trained models!
I am trying to run the Manipulate.ipynb code to reconstruct the source image, but it seems the reconstruction is not working very well. Is there something wrong with my code?
Here's my testing code:
[image]
Here's the result:
[image]

Total training time

Hello,
Thank you for your work!
However, it was not clear from the paper how much time it took to train the model(s). Is it counted (on average) in days, or just hours (i.e. less than a day)?

Can z_sem be the latent of a VQGAN?

Hi, if I train a VQGAN on one dataset, can its latent code be thought of as a semantic code (ignoring that the shape of the VQGAN latent code and your z_sem are different)?

how to interpolate faces

Can you tell me where to find the code to interpolate faces, or show me how to do it? According to the page, I should use a weighted sum,
but I'm not very good at math and I don't know what to add here.
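
As a rough sketch of the usual recipe: a plain weighted sum (lerp) for the semantic codes and a spherical interpolation (slerp) for the Gaussian noise maps. The model.encode / model.render names follow the helpers discussed elsewhere in this thread and are assumptions here:

import torch

def lerp(a, b, alpha):
    # plain weighted sum, used for the semantic codes z_sem
    return (1 - alpha) * a + alpha * b

def slerp(a, b, alpha):
    # spherical interpolation, commonly used for the noise maps xT
    a_n = a / a.norm(dim=-1, keepdim=True)
    b_n = b / b.norm(dim=-1, keepdim=True)
    omega = torch.acos((a_n * b_n).sum(-1, keepdim=True).clamp(-1, 1))
    so = torch.sin(omega)
    return (torch.sin((1 - alpha) * omega) / so) * a + (torch.sin(alpha * omega) / so) * b

# hypothetical usage:
# cond_mix = lerp(cond1, cond2, 0.5)
# xT_mix = slerp(xT1.flatten(1), xT2.flatten(1), 0.5).view_as(xT1)
# img = model.render(xT_mix, cond_mix, T=100)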

Not compatible on A100 or RTX3090

Hi, thanks for your great work!
I ran your code on a V100 successfully; however, it seems incompatible when running on an A100 or RTX 3090. If your team has tested the code on any of the above hardware, could you please provide a corresponding list of environments, like a requirements.txt?
Thank you a lot!

How to eval the model?

Hi, do you have a plan to release the evaluation code?

Or can you explain how to evaluate the model?
