
uORF's People

Contributors

kovenyu, nepfaff, snavely


uORF's Issues

Issue with Multiple GPUs

Dear authors,

I saw that your config supports specifying multiple GPUs with "--gpu_ids", but the implementation does not seem to support it. Have you tested training with multiple GPUs? Thanks.
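For reference, here is a minimal sketch of how multi-GPU support could be wired up with torch.nn.DataParallel, assuming --gpu_ids is parsed into a list of device ids (only the option name comes from the repo; the helper below is hypothetical):

```python
import torch
import torch.nn as nn

def wrap_for_multi_gpu(model: nn.Module, gpu_ids):
    """Hypothetical helper: replicate a model across the requested GPUs.

    `gpu_ids` is assumed to be a list such as [0, 1] parsed from --gpu_ids.
    """
    if torch.cuda.is_available() and len(gpu_ids) > 0:
        model = model.to(f"cuda:{gpu_ids[0]}")
        if len(gpu_ids) > 1:
            # DataParallel splits each input batch across the listed devices
            # and gathers the outputs back on the first device.
            model = nn.DataParallel(model, device_ids=gpu_ids)
    return model
```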

Information about dataloading

Hello,
I am trying to apply uORF to my own dataset, which currently consists only of multiview RGB images.
I noticed that the dataloading code loads many additional files besides the images. It is not clear from the paper which of these inputs are necessary and which are not.

Could you elaborate on the different files being loaded in the dataloading code? For example, whether they are required for training the model, whether they are only used for evaluation or debugging, and how useful they are for debugging. This would help me figure out whether I should generate the corresponding metadata for my custom dataset or skip it.

https://github.com/KovenYu/uORF/blob/main/data/multiscenes_dataset.py
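Not from the repository, just a hypothetical inspection script: one quick way to see which per-scene files the loader expects is to scan an existing dataset directory and group the filenames by pattern, which makes it easier to decide which metadata a custom multiview-RGB dataset would need.

```python
import re
from collections import Counter
from pathlib import Path

def summarize_dataset_files(root: str) -> None:
    """List the filename patterns present in a uORF-style dataset directory.

    Digits are collapsed to '#' so that files differing only in scene/view
    indices count as the same pattern. Purely a debugging aid; the actual
    loading logic lives in data/multiscenes_dataset.py.
    """
    counts = Counter()
    for path in Path(root).iterdir():
        if path.is_file():
            counts[re.sub(r"\d+", "#", path.name)] += 1
    for pattern, n in sorted(counts.items()):
        print(f"{n:6d}  {pattern}")

# Example: summarize_dataset_files("./clevr_567_test")
```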

Download Issues for Data and Pretrained Models

Thanks for your great work.
The download link provided in the README requires a Stanford SharePoint account to log in, which makes it inaccessible to outside users. I tried a personal account and my institution account, but neither worked.

Could you please fix this or upload the files to Google Drive? Thanks.

Perceptual loss is zero

While using ./train_clevr_567.sh to train the model on the CLEVR-567 scenes, the perceptual loss is zero. Is that the expected behaviour, or is it caused by something else?
Moreover, when I modify the shell script to train the uorf_gan model with train_with_gan.py, I get the following output:

[screenshot of the training log]

The reconstruction loss is non-zero, but all other losses are zero for the first epoch. Is this fine?
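For what it's worth, a standalone VGG-feature perceptual distance (a generic formulation, not necessarily the exact loss used in this repo) should be clearly non-zero for two different images, which can help separate a weighting or flag issue from a bug:

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# Generic VGG16 feature-space distance; assumes inputs are Bx3xHxW in [0, 1].
# Requires torchvision >= 0.13 for the weights enum.
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features[:16].eval()
for p in vgg.parameters():
    p.requires_grad_(False)

def perceptual_distance(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    return F.mse_loss(vgg(x), vgg(y))

a, b = torch.rand(1, 3, 128, 128), torch.rand(1, 3, 128, 128)
print(perceptual_distance(a, b))  # clearly > 0 for two different images
```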

What is nss_scale?

Could you please clarify what nss_scale means? I want to generate my own data and train the model on it. How should I determine nss_scale?
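My understanding (an assumption, not confirmed by the authors) is that nss_scale maps world coordinates into a normalized scene space by a simple division, so it should be chosen as roughly the half-extent of the region containing your objects:

```python
import torch

def world_to_nss(points_world: torch.Tensor, nss_scale: float) -> torch.Tensor:
    """Illustrative only: map world-space points into normalized scene space.

    If the content of interest fits in a cube of half-extent `nss_scale`
    (world units, centered at the origin), the normalized coordinates land
    roughly in [-1, 1], which is the range positional encodings expect.
    """
    return points_world / nss_scale

# E.g. if your scene content spans roughly +/-7 world units, try nss_scale ~= 7.
```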

Lambda Perceptual

Hi,

I am wondering why the weight for the perceptual loss is only 0.006. What happens when it is higher?

Thanks!

Why are the normalised pixel coordinates subtracted from 2?

pixel_emb = torch.stack([x1_m, x2_m, y1_m, y2_m]).to(x.device).unsqueeze(0) # 1x4xHxW

From the paper, I understand this is for getting information in 4 directions, but which 4 directions are meant? Is it something like the front and back of the camera? Since we render foreground objects from the viewer's viewpoint, wouldn't we only need information in front of the camera?
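My reading (an assumption based on Slot-Attention-style positional grids, not the authors' answer) is that the four channels are distances to the four image borders, not camera-space front/back directions; with coordinates normalized to [0, 2], the complement of a coordinate is 2 - coordinate, which would explain the subtraction. A sketch of that construction:

```python
import torch

def border_distance_grid(h: int, w: int, device: str = "cpu") -> torch.Tensor:
    """Sketch of a 4-channel positional grid (not the repository's exact code).

    Channels encode the distance to the left/right/top/bottom image borders,
    with coordinates normalized to [0, 2] so each complement is (2 - coord).
    Returns a 1x4xHxW tensor, matching the shape comment in the question.
    """
    ys = torch.linspace(0, 2, h, device=device)
    xs = torch.linspace(0, 2, w, device=device)
    y, x = torch.meshgrid(ys, xs, indexing="ij")
    grid = torch.stack([x, 2 - x, y, 2 - y], dim=0)  # 4xHxW
    return grid.unsqueeze(0)  # 1x4xHxW
```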

Question about the groundtruth segmentation mask

Hi Koven, I wonder whether ground-truth segmentation masks are provided in the dataset. Also, how can I generate segmentation labels when generating my own dataset?

Setting of the Scene Design and Editing Experiments

Hi, thanks for your great work.
I'm trying to reproduce your scene editing baseline and wondering how to obtain the object-moving results shown in the paper. Is it done by editing slot features before sending them to the decoder, or by directly manipulating the NeRF volume before composition?
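For context, one common way to move an object in a composed scene is to translate the query points fed to that object's decoder before density-weighted composition, leaving the slot feature untouched; the sketch below illustrates that idea and is not the repository's actual rendering code:

```python
import torch

def compose_with_moved_object(decoders, slots, points, moved_idx, offset):
    """Illustrative composition where one object is translated by `offset`.

    Assumes decoders[k](slots[k], pts) returns (density_k, rgb_k) for query
    points `pts` of shape Nx3; the real uORF code differs in its details.
    """
    densities, colors = [], []
    for k, decode in enumerate(decoders):
        pts = points - offset if k == moved_idx else points
        sigma, rgb = decode(slots[k], pts)
        densities.append(sigma)                       # N
        colors.append(rgb)                            # N x 3
    sigma = torch.stack(densities, dim=0)             # K x N
    rgb = torch.stack(colors, dim=0)                  # K x N x 3
    w = sigma / (sigma.sum(dim=0, keepdim=True) + 1e-8)
    return sigma.sum(dim=0), (w.unsqueeze(-1) * rgb).sum(dim=0)
```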

Any tips for real scene data?

Thanks for your nice work. Is it possible to apply this method to real scenes captured with my phone?
E.g., I capture some photos, run COLMAP to get camera poses and other parameters, and then feed these into uORF.
Would I get acceptable object radiance fields? Have you tried this? Do you have any tips?
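In case it is useful, camera-to-world poses can be recovered from COLMAP's text export (images.txt stores a world-to-camera quaternion and translation per image); a rough sketch, independent of uORF's own pose conventions:

```python
import numpy as np

def quat_to_rotmat(qw, qx, qy, qz):
    """Rotation matrix from a unit quaternion in COLMAP's (w, x, y, z) order."""
    return np.array([
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw),     2 * (qx * qz + qy * qw)],
        [2 * (qx * qy + qz * qw),     1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw)],
        [2 * (qx * qz - qy * qw),     2 * (qy * qz + qx * qw),     1 - 2 * (qx * qx + qy * qy)],
    ])

def read_colmap_poses(images_txt: str) -> dict:
    """Return {image_name: 4x4 camera-to-world matrix} from COLMAP's images.txt."""
    poses = {}
    with open(images_txt) as f:
        lines = [l.strip() for l in f if l.strip() and not l.startswith("#")]
    for line in lines[::2]:  # each image uses two lines; the pose is on the first
        elems = line.split()
        qw, qx, qy, qz, tx, ty, tz = map(float, elems[1:8])
        name = elems[9]
        R = quat_to_rotmat(qw, qx, qy, qz)   # world-to-camera rotation
        t = np.array([tx, ty, tz])
        c2w = np.eye(4)
        c2w[:3, :3] = R.T                    # invert to camera-to-world
        c2w[:3, 3] = -R.T @ t
        poses[name] = c2w
    return poses
```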

Question about segmentation and manipulation

Hi, thanks for the effort; the work is interesting. I have two questions:

  1. Is it the right way to predict segmentations using argmax(integrate(weight * density), slot_dim)? It seems this could be wrong when occlusion occurs; I am showing a rendered example below (the segmentation on top and the image on the bottom), and I'm referring to the two cylinders on the right. See also the sketch at the end of this issue.

[images: 00000_sc0000_az00_render_mask3 (rendered segmentation), 00000_sc0000_az00_x_rec3 (rendered image)]

  2. How can I manipulate CLEVR scenes?

FileNotFoundError: [Errno 2] No such file or directory: './clevr_567_test/00000_sc0000_az00_moved.png'

I'm asking because a background slot view contains object shadows, as shown below. How are these shadows handled during manipulation?

[image: background slot view showing object shadows]
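Regarding the first question, here is a minimal sketch of the argmax-over-slots labeling, assuming per-slot ray-integrated weight maps of shape K x H x W are available (not the repository's actual evaluation code):

```python
import torch

def slot_segmentation(weights_per_slot: torch.Tensor) -> torch.Tensor:
    """Label each pixel with the slot that contributes the largest weight.

    `weights_per_slot` is assumed to be K x H x W, holding each slot's
    ray-integrated (alpha-composited) contribution to the pixel. As noted in
    the question, this heuristic can mislabel pixels where objects occlude
    each other, since only the dominant contribution survives the argmax.
    """
    return torch.argmax(weights_per_slot, dim=0)  # H x W integer label map
```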
