rover-xingyu / Ha-NeRF
[CVPR 2022] Ha-NeRF😆: Hallucinated Neural Radiance Fields in the Wild
Xingyu, could you provide the images of the three datasets used for the ablation experiments (A and T)? My attempt to train the ablation failed. Thank you very much!
If I use the --use_cache parameter, where do I get the .pkl file?
Hi, Xingyu!
In the calculation of the loss,

```python
ret['mode_seeking'] = hparams.weightMS * 1 / (
    torch.mean(torch.abs(inputs['rgb_fine'].detach() - inputs['rgb_fine_random'])) /
    torch.mean(torch.abs(inputs['a_embedded'].detach() - inputs['a_embedded_random'].detach()))
    + 1 * 1e-5)
```

the inner denominator here is 0 (because inputs['a_embedded'].detach() and inputs['a_embedded_random'] are the same), and the loss becomes NaN. I don't quite understand this; could you explain?
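A minimal reproduction sketch of the degenerate case (with made-up tensor shapes), plus one common guard, which is an assumption on my part rather than the repository's fix: put the epsilon inside the inner division so the ratio itself stays finite.

```python
import torch

# Hypothetical shapes: 1024 rays x RGB, and 48-dim appearance embeddings.
rgb_fine = torch.rand(1024, 3)
rgb_fine_random = torch.rand(1024, 3)
a_embedded = torch.rand(1, 48)
a_embedded_random = a_embedded.clone()  # identical embedding -> zero denominator

ratio = torch.mean(torch.abs(rgb_fine - rgb_fine_random)) / \
        torch.mean(torch.abs(a_embedded - a_embedded_random))
print(ratio)  # tensor(inf): division by an exact zero

# Guarded variant (an assumption, not the repository's code):
eps = 1e-5
safe_ratio = torch.mean(torch.abs(rgb_fine - rgb_fine_random)) / \
             (torch.mean(torch.abs(a_embedded - a_embedded_random)) + eps)
loss = 1.0 / (safe_ratio + eps)  # finite even for identical embeddings
```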
Hi!
We are trying to run this project on our own dataset. In this issue you mention that the way to do this is to use the LLFF poses_bounds.npy format, but it seems that support for this is not available. Could you perhaps provide a way to achieve this?
There are only six .tsv files. If I want to train on scenes other than these six, what should I do? Do I need to create a .tsv file manually?
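A minimal sketch of one way to build such a file (an assumption, not an official tool): judging by the dataset fields mentioned elsewhere in these issues, each row carries the image filename, an id, a split, and the dataset name, so a custom scene's .tsv can be generated from its image folder.

```python
import csv
import os

# Hypothetical paths/names: 'my_scene/dense/images' and 'my_scene.tsv'.
image_dir = 'my_scene/dense/images'
filenames = sorted(os.listdir(image_dir))

with open('my_scene.tsv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter='\t')
    writer.writerow(['filename', 'id', 'split', 'dataset'])
    for i, name in enumerate(filenames):
        split = 'test' if i % 10 == 0 else 'train'  # hold out every 10th image
        writer.writerow([name, i, split, 'my_scene'])
```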
If I use multiple GPUs on one machine to train in parallel, how do I set that up in PyTorch Lightning?
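For reference, the example command later in this thread passes --num_gpus to the training script. A minimal sketch of how such a flag typically maps onto the Trainer, assuming the PyTorch Lightning 1.x API (the exact argument names vary across Lightning versions):

```python
from pytorch_lightning import Trainer

# Assumed mapping of a --num_gpus flag onto the Lightning Trainer:
trainer = Trainer(gpus=4,             # number of GPUs on this machine
                  accelerator='ddp')  # one process per GPU via DistributedDataParallel
```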
Hi, Xingyu. How is the transient embedding obtained? Is it the following code:

```python
self.embedding_view = torch.nn.Embedding(hparams.N_vocab, 128)
```

I don't understand. Please explain it. Thank you!
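For context, a minimal sketch of how such a per-image embedding table works in general (this illustrates torch.nn.Embedding itself, not necessarily how Ha-NeRF wires it up): each image id indexes a learnable row of the table, and N_vocab is an upper bound on the number of images.

```python
import torch

N_vocab, dim = 1500, 128
embedding = torch.nn.Embedding(N_vocab, dim)  # learnable (N_vocab, dim) table

image_ids = torch.tensor([0, 3, 42])  # ids of the images the rays came from
vectors = embedding(image_ids)        # shape (3, 128), optimized with the model
```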
It's been four months and your code still has not been open-sourced.
Thanks for your excellent work.
I have a question about comparison with other methods.
Some works (for example, NeRF--) train every method for the same number of epochs (e.g., 20) and compare the checkpoints from epoch 20.
Other works save the checkpoint with the best validation performance and compare methods at their respective best.
Which protocol does Ha-NeRF use for validation?
Thanks for your time!
Hello,
I would like to ask: the mask is learned during network training, but it is not used in the test code, yet the rendered scene is still free of occlusions. Why is that?
Also, since the mask is predicted from a single 2D input image, could it be used in other image-inpainting settings? We want the loss to focus on the environment rather than on moving objects, but our output images still contain occluders; does this make it impossible to train the mask?
Thanks.
How do I get the JSON file?
I cannot find the train/val split through the link you listed. Could you upload it? Thanks.
What is the --use_cache parameter for? Where can I get the .pkl file?
Hello, I would like to know what the inputs of transient_embedding and appearance_embedding are. Thank you.
Thanks for your inspiring work.
I have a question about Figure 3, in which you render I^{r}_{i} based on the camera ray and the appearance feature of I_{i}.
I believe that the image I_{i} used to generate the appearance feature and the ground-truth image used to supervise the rendered image I^{r}_{i} should come from the same image domain, i.e., a set of images sharing the same appearance feature.
However, how can this image domain be identified in the given datasets?
I checked the brandenburg_gate dataset; the available information is the image name, id, split, and dataset name. Which of these indicates that a set of images comes from the same image domain?
Hi, as the original NeRF-W paper says, "During training, 512 points per ray are sampled from each of the coarse and fine models for a total of 1024 points per ray. We double that number during evaluation to 2048."
But the settings are 64 (train) and 256 (test) in your code. May I ask why there is a difference?
Hi, what is the purpose of the sparse and stereo files in the dataset folder?
https://github.com/rover-xingyu/Ha-NeRF/blob/main/datasets/phototourism_mask_grid_sample.py#L214
Why isn't “self.val_id = self.img_ids_test[0]” used here?
Thanks for sharing the superb code, but I wonder how I can use Ha-NeRF to train a model on my own dataset?
Hi~thanks for your excellent work.
But when I try to test Ha-NeRF on the Blender lego dataset, the PSNR only reaches 19, while NeRF-W reaches a PSNR of 28.
I'm not sure whether I'm running the code incorrectly; here is my command:

```bash
python train_mask_grid_sample.py \
  --root_dir ../datasets/nerf/nerf_synthetic/lego/ --dataset_name blender \
  --save_dir save --img_wh 400 400 --N_importance 64 --N_samples 64 \
  --num_epochs 20 --batch_size 4096 --optimizer adam --lr 5e-4 \
  --lr_scheduler cosine --exp_name exp_lego --N_emb_xyz 15 --N_vocab 100 \
  --maskrs_max 5e-2 --maskrs_min 6e-3 --maskrs_k 1e-3 --maskrd 0 \
  --encode_a --N_a 48 --weightKL 1e-5 --encode_random --weightRecA 1e-3 \
  --weightMS 1e-6 --num_gpus 4 --data_perturb color
```
and the results are shown below:
Looking forward to your reply!
Hi, regarding the three .bin files extracted from COLMAP: did you change COLMAP's default parameters?
I don't know how to generate the depth_maps_clean_300_th_0.10 files in the stereo folder; does anyone have an idea?
Thanks in advance.
I want to know how to use this code to build my own dataset. Has anyone implemented their own dataset with it?
Hello, how long does it take to train a model (such as brandenburg_gate), and what hardware configuration do you use for training?
Thanks for your great work.
I do not understand why you use 'encode_random', or what the purpose of this block in the NeRF design is.
Thanks.
Hello Xingyu, may I ask how to extract the mask of an entire image using the pre-trained model you provided?
Hi, thank you for your great work!
I want to ask whether the hyperparameters in the example script you provided were the ones used for your main experiments.
Hi, Xingyu. May I ask what the following mask parameters do? I don't understand their impact. Could you please explain them to me?
```python
parser.add_argument('--maskrs_max', type=float, default=1.5e-6,
                    help='regularize mask size')
parser.add_argument('--maskrs_min', type=float, default=1.5e-6,
                    help='regularize mask size')
parser.add_argument('--maskrs_k', type=float, default=0.9,
                    help='regularize mask size')
parser.add_argument('--maskrd', type=float, default=1e-3,
                    help='regularize mask digit')
parser.add_argument('--weightKL', type=float, default=1e-16,
                    help='regularize encA')
parser.add_argument('--weightRecA', type=float, default=1e-10,
                    help='Rec A')
parser.add_argument('--weightMS', type=float, default=1e-10,
                    help='mode seeking')
```
The parameters you provided for the example: --maskrs_max 5e-2 --maskrs_min 6e-3 --maskrs_k 1e-3 --maskrd 0 --weightKL 1e-5
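For anyone else puzzling over the max/min/k triple, here is a hedged sketch of one plausible reading (an assumption, not verified against the repo's loss code): the mask-size regularization weight decays exponentially from maskrs_max toward maskrs_min at rate maskrs_k as training progresses.

```python
import numpy as np

# Hypothetical schedule; the actual Ha-NeRF loss code may differ.
def mask_size_weight(step, rs_max=5e-2, rs_min=6e-3, rs_k=1e-3):
    """Decay the mask-size weight from rs_max toward rs_min over steps."""
    return max(rs_min, rs_max * np.exp(-rs_k * step))

for step in (0, 1000, 5000):
    print(step, mask_size_weight(step))
```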