kovenyu / uorf

Unsupervised Discovery of Object Radiance Fields

License: MIT License

Python 92.13% C++ 1.00% Cuda 5.50% Shell 1.38%

uorf's Issues

Information about dataloading

Hello,
I am trying to apply uORF to my own dataset, which currently consists only of multiview RGB images.
I noticed that the dataloading code loads many additional files besides the images. It is not clear from the paper which inputs are necessary and which are not.

Could you explain the different files loaded in the dataloading code? In particular, which are required for training the model, which are only for evaluation or debugging, and how useful the latter are in practice. This would help me decide whether to generate the corresponding metadata for my custom dataset or leave it out.

https://github.com/KovenYu/uORF/blob/main/data/multiscenes_dataset.py

The acquisition of Room Chair dataset

Hey Koven,

Thanks for your great work! May I ask how you rendered the room scene images with chairs? Thanks!

Best!

Why are the normalised pixel coordinates subtracted from 2?

uORF/models/model.py

Line 48 in d504719

 pixel_emb = torch.stack([x1_m, x2_m, y1_m, y2_m]).to(x.device).unsqueeze(0) # 1x4xHxW 

From the paper, I understand that this embedding provides information in four directions.
But which four directions are meant? Is it something like the front and back of the camera? Since we render foreground objects from the viewer's viewpoint, don't we only need information in front of the camera?
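
One possible reading of the quoted line, sketched below: the four channels are linear ramps across the image plane (one growing to the right, its mirror growing to the left, one growing downward, its mirror growing upward), so they would describe 2D pixel position rather than anything in front of or behind the camera. This is a sketch under assumptions, not the repository's code: the [-1, 1] normalisation and the definitions x2 = 2 - x1 and y2 = 2 - y1 are inferred from the issue title, the function names are hypothetical, and plain Python lists stand in for torch tensors so the example runs without dependencies.

```python
def linspace(start, stop, num):
    """Evenly spaced values from start to stop inclusive (hypothetical helper)."""
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def pixel_embedding(height, width):
    """Build a 4 x H x W positional embedding as nested lists.

    Channel order follows the quoted line: [x1, x2, y1, y2].
    Assumption: x1 and y1 are coordinates normalised to [-1, 1],
    and x2, y2 are the mirrored ramps 2 - x1 and 2 - y1.
    """
    xs = linspace(-1.0, 1.0, width)
    ys = linspace(-1.0, 1.0, height)
    x1 = [[x for x in xs] for _ in ys]            # grows left to right
    y1 = [[y for _ in xs] for y in ys]            # grows top to bottom
    x2 = [[2.0 - x for x in row] for row in x1]   # grows right to left
    y2 = [[2.0 - y for y in row] for row in y1]   # grows bottom to top
    return [x1, x2, y1, y2]


emb = pixel_embedding(3, 5)
print(len(emb), len(emb[0]), len(emb[0][0]))  # channels, rows, columns
```

Under this reading, "2 - x" only mirrors the ramp and shifts it to stay positive; the two channels of each pair carry the same position information with opposite slope.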

input size and supervision size

What do the parameters "input size" and "supervision size" mean in the code?

Download Issues of Data and Pretrained Models.

Thanks for your great work.
The download link in the README requires a Stanford SharePoint account to log in, so people outside Stanford cannot access it. I tried both a personal account and my institution account, but neither worked.

Could you please fix this or upload the files to Google Drive? Thanks.

Perceptual loss is zero

While using ./train_clevr_567.sh to train the model on the CLEVR-567 scenes, the perceptual loss is zero. Is that the usual behaviour, or is it due to something else?
Moreover, when I modify the shell script to use the uorf_gan model and train_with_gan.py, I get the following output:

The reconstruction loss is non-zero, but all other losses are zero for the first epoch. Is this fine?

Question about segmentation and manipulation

Hi, thanks for the efforts and the work is interesting. I have two questions here:

Is it correct to predict segmentations using argmax(integrate(weight * density), slot_dim)? It seems this could be wrong when occlusion occurs. I am showing a rendered example below (the segmentation on top and the image on the bottom); I am referring to the two cylinders on the right.

How can I manipulate CLEVR scenes?

FileNotFoundError: [Errno 2] No such file or directory: './clevr_567_test/00000_sc0000_az00_moved.png'

I'm asking because a background slot view contains object shadows, as shown below. How would these shadows behave during manipulation?
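
To make the segmentation rule quoted in the first question concrete, here is a minimal single-ray sketch. It assumes "weight" means the standard volume-rendering weight (transmittance times alpha) computed from the summed density of all slots, and "integrate" means a sum over samples along the ray. Both are assumptions about what the question intends, and the function name is hypothetical; this is not taken from the repository.

```python
import math


def slot_label_for_ray(slot_densities, delta):
    """Pick a slot for one ray via argmax over integrated weight * density.

    slot_densities[k][i] is the density of slot k at sample i, ordered
    from near to far. delta is the (assumed constant) sample spacing.
    """
    num_slots = len(slot_densities)
    num_samples = len(slot_densities[0])
    scores = [0.0] * num_slots
    transmittance = 1.0  # fraction of light not yet absorbed
    for i in range(num_samples):
        total = sum(slot_densities[k][i] for k in range(num_slots))
        alpha = 1.0 - math.exp(-total * delta)
        weight = transmittance * alpha  # compositing weight of sample i
        for k in range(num_slots):
            scores[k] += weight * slot_densities[k][i]
        transmittance *= 1.0 - alpha
    label = max(range(num_slots), key=lambda k: scores[k])
    return label, scores


# Slot 0 sits near the camera, slot 1 sits behind it on the same ray.
label, scores = slot_label_for_ray([[0.0, 5.0, 0.0, 0.0],
                                    [0.0, 0.0, 0.0, 5.0]], delta=1.0)
print(label, scores)
```

In this toy case the transmittance term reduces the score of the farther slot, so the nearer slot wins. Whether that holds in a trained model depends on how sharply the learned densities separate the objects.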

What is nss_scale ?

Could you please clarify what nss_scale is?
I want to generate my own data and train the model on it. How should I determine nss_scale?

Issue with Multiple GPUs

Dear authors,

I saw that your config supports multiple GPUs via "--gpu_ids", but the implementation does not seem to support it. Have you tested training with multiple GPUs as well? Thanks

Any tips for real scene data?

Thanks for your nice work. Is it possible to apply this method to real scenes captured with my phone?
For example, I have captured some photos and run COLMAP to get the camera poses and other parameters, and I would then feed these into uORF.
Can I expect acceptable object radiance fields? Have you tried this, or do you have any tips?
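
For the COLMAP route described above, one step that usually comes up is pose conversion. COLMAP's images.txt stores each pose as a world-to-camera rotation quaternion (QW, QX, QY, QZ) and translation (TX, TY, TZ). The sketch below inverts that into a 4x4 camera-to-world matrix. Whether uORF's loader expects camera-to-world matrices, and in which axis convention, is an assumption to check against data/multiscenes_dataset.py; the function names here are hypothetical.

```python
import math


def quat_to_rotmat(qw, qx, qy, qz):
    """Convert a (possibly unnormalised) quaternion to a 3x3 rotation matrix."""
    n = math.sqrt(qw * qw + qx * qx + qy * qy + qz * qz)
    qw, qx, qy, qz = qw / n, qx / n, qy / n, qz / n
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw)],
        [2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw)],
        [2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy)],
    ]


def colmap_to_cam2world(qw, qx, qy, qz, tx, ty, tz):
    """Invert a COLMAP world-to-camera pose [R | t] into a 4x4 camera-to-world matrix.

    The inverse of x_cam = R x_world + t is x_world = R^T x_cam - R^T t.
    """
    rot = quat_to_rotmat(qw, qx, qy, qz)
    rot_t = [[rot[j][i] for j in range(3)] for i in range(3)]  # R^T
    t = [tx, ty, tz]
    center = [-sum(rot_t[i][j] * t[j] for j in range(3)) for i in range(3)]
    return [
        rot_t[0] + [center[0]],
        rot_t[1] + [center[1]],
        rot_t[2] + [center[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


# Identity rotation with translation (1, 2, 3): the camera sits at (-1, -2, -3).
for row in colmap_to_cam2world(1.0, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0):
    print(row)
```

COLMAP's camera frame has x pointing right, y down, and z forward. If the target pipeline uses a different convention, an extra axis flip is needed on top of this inversion.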

Applying R1 Gradient Penalty

Do you apply the R1 gradient penalty only every 32 iterations?

Thanks
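
For context on the question above: applying a regulariser only every N iterations is commonly called lazy regularisation, and in that scheme the penalty is multiplied by N on the steps where it is applied so that its average strength is unchanged. The sketch below shows the schedule with an R1 penalty of the form (gamma / 2) * mean(||grad_x D(x)||^2) on real samples. Whether uORF does exactly this is not confirmed here; gamma=10.0, the function names, and passing gradients through a callback are all assumptions made for illustration.

```python
def r1_penalty(grads, gamma):
    """R1 penalty from per-sample gradients of D with respect to real inputs.

    grads is a list of flat gradient vectors, one per real sample.
    """
    sq_norms = [sum(g * g for g in grad) for grad in grads]
    return 0.5 * gamma * sum(sq_norms) / len(sq_norms)


def lazy_r1_term(step, grad_fn, gamma=10.0, interval=32):
    """Return the R1 term for this step under lazy regularisation.

    grad_fn is only called on regularisation steps, which is where the
    compute saving comes from. The result is scaled by interval to
    compensate for the skipped steps.
    """
    if step % interval != 0:
        return 0.0
    return r1_penalty(grad_fn(), gamma) * interval


# Toy usage: a fixed gradient stands in for autograd output.
calls = []

def fake_grad_fn():
    calls.append(1)
    return [[3.0, 4.0]]  # squared norm is 25

terms = [lazy_r1_term(step, fake_grad_fn) for step in range(64)]
print(terms[0], terms[1], terms[32], len(calls))
```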

Question about the groundtruth segmentation mask

Hi Koven, I wonder whether ground-truth segmentation masks are provided in the dataset. Also, how can I generate segmentation labels when creating my own dataset?

Setting of the Scene Design and Editing Experiments.

Hi, thanks for your great work.
I'm trying to reproduce your scene editing baseline. How do you obtain the object-moving results in the paper? Is it done by editing the slot features before sending them to the decoder, or by directly manipulating the NeRF volume before composition?
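
To illustrate the second option named in the question (manipulating the volume directly), a generic way to move a radiance field is to shift its query coordinates: the moved field at point p returns what the original field returned at p minus the offset. This is a sketch of that general idea only; it is not confirmed to be how the paper's results were produced, and the toy density function and names are hypothetical.

```python
def translate_field(field, offset):
    """Wrap a field so that its content appears shifted by offset.

    field maps (x, y, z) to a value such as density. Querying the moved
    field at p evaluates the original field at p - offset.
    """
    ox, oy, oz = offset

    def moved(x, y, z):
        return field(x - ox, y - oy, z - oz)

    return moved


def toy_sphere_density(x, y, z):
    """Toy object: density 1 inside a sphere of radius 0.5 at the origin."""
    return 1.0 if x * x + y * y + z * z <= 0.25 else 0.0


moved_sphere = translate_field(toy_sphere_density, (1.0, 0.0, 0.0))
print(moved_sphere(1.0, 0.0, 0.0), moved_sphere(0.0, 0.0, 0.0))
```

With per-object fields, only the wrapped object's field changes; the other fields are composed as before.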

Lambda Perceptual

Hi,

I am wondering why the weight for the perceptual loss is only 0.006. What happens when it is higher?

Thanks!
