
graf's Introduction

Autonomous Vision Blog

This is the blog of the Autonomous Vision Group at MPI-IS Tübingen and University of Tübingen. You can visit our blog at https://autonomousvision.github.io. Also check out our website to learn more about our research.

Overview

Creating a blog post follows the usual git workflow:

  1. clone repository:

    git clone https://github.com/autonomousvision/autonomousvision.github.io.git
    
  2. create new branch for your post:

    git branch my-post
    git checkout my-post
    
  3. work on your branch; push the my-post branch if you want to collaborate


  4. rebase master on your branch and squash commits (note that all your commits to master will be visible in the git history):

    git checkout master
    git rebase -i my-post
    
  5. push master

    git push origin master
    
  6. delete your branch

    locally:

    git branch -d my-post
    

    and remotely if you pushed your branch in step 3:

    git push origin --delete my-post
    

Instructions for Authors

To write a new blog entry, first register yourself as an author in authors.yml. Here, you can also add your email address and links to your social media accounts etc.
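
An entry in authors.yml might look roughly like this (a minimal sketch following the Minimal Mistakes author format; the key, name, email, and link values below are placeholders):

    jane_doe:
      name   : "Jane Doe"
      bio    : "PhD student, Autonomous Vision Group"
      avatar : "/assets/images/jane_doe.png"
      email  : "jane.doe@example.com"
      links:
        - label: "Twitter"
          icon: "fab fa-fw fa-twitter"
          url: "https://twitter.com/janedoe"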

You can then create a new blog post by adding a markdown or html file in the _posts folder. Please use the format YYYY-MM-DD-YOUR_TITLE.{md,html} for naming the file. At the top of the file, add a YAML header where you specify the author, the category of the post, tags, etc. For more information, take a look at existing posts and the Minimal Mistakes documentation.
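
For example, the YAML header of a post might look roughly like this (a hedged sketch; the title, author key, category, and tags are placeholders, and the full set of supported fields is described in the Minimal Mistakes documentation):

    ---
    title:      "My New Blog Post"
    author:     jane_doe
    categories: research
    tags:
      - nerf
      - generative-models
    ---

Following the naming convention above, this file would be saved as, e.g., _posts/2020-01-01-my-new-blog-post.md.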

If you want to include images or other assets, create a subfolder in the assets/posts folder with the same name as the filename of your blog post (without extension). You can then reference your assets in your post using {{ site.url }}/assets/posts/YYYY-MM-DD-YOUR_TITLE/ followed by the filename of the corresponding asset. Make sure that you don't forget the {{ site.url }}! While the post will be rendered correctly without the {{ site.url }}, the images in the newsfeed will break if you don't include it.
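
For example, an image stored as assets/posts/YYYY-MM-DD-YOUR_TITLE/teaser.png (teaser.png being a placeholder filename) could be embedded in a markdown post as:

    ![Teaser figure]({{ site.url }}/assets/posts/YYYY-MM-DD-YOUR_TITLE/teaser.png)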

Please keep in mind that all your commits to master will appear in the git history. To keep this history clean, it might make sense to edit your post in a separate (private) branch and then merge this branch into master.
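
One way to do this while keeping the history to a single commit (a minimal sketch, reusing the placeholder branch name my-post from above) is a squash merge:

    git checkout master
    git merge --squash my-post
    git commit -m "Add new blog post"
    git push origin master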

Offline editing

When you do offline editing, you probably want to build the website offline for a preview. To this end, you first have to install Ruby and Jekyll. Then, you have to install the dependencies (called Gems) for the website:

bundle

Now, you are ready to build and serve the website using

 bundle exec jekyll serve

Sometimes Jekyll hiccups over character encoding. In this case, try

 LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 bundle exec jekyll serve

If you encounter GemNotFoundException, try to remove

BUNDLED WITH
    2.0.1

from Gemfile.lock.

This command will build the website and serve it at http://localhost:4000. When you save changes, the website will be automatically rebuilt in the background. Note, however, that changes to _config.yml will not be tracked, which means that you have to restart the Jekyll server after configuration changes.



graf's Issues

About unposed images, latent codes, and modifying shape and appearance

Hello, thank you for sharing. I'm a complete novice, but I read your paper carefully, and there are some questions I could not understand or find answers to in the paper:

1. The paper mentions "training our model from unposed 2D images alone", but the generator does need a camera pose as input. So I'm confused: what are "unposed images"? What does "unposed" mean?

2. The generator needs two latent codes (za, zs). How are they obtained? Are they part of the network parameters that are optimized during training?

3. The paper mentions that the "approach allows to modify shape and appearance of the generated objects", but I still cannot understand how to modify them after reading. How do we control the latent codes za and zs?

That's all, looking forward to your or someone else's reply. Thanks!

ArgumentError

parser.add_argument("--N_samples", type=int, default=32*32*4, help='batch size (number of random rays per gradient step)')

The argument is duplicated and causes an error. It should be:

parser.add_argument("--N_rand", type=int, default=32*32*4, help='batch size (number of random rays per gradient step)')

Question about the paper

Hi, thanks for your great work. Can you explain in more detail what is meant by "It is further important to note that we do not downsample the real image I based on s, but instead query I at sparse locations to retain high-frequency details, see Fig. 3."?
As far as I can tell, the bilinear sampling operation is just downsampling, but I don't know what exactly you mean by "query" in this sentence.

Different Raysampler between train and val/test

I notice that in graf/transforms.py, the class FlexGridRaySampler(RaySampler) is used during training, but the class FullRaySampler(RaySampler) is used during val/test.

I wonder why different ray samplers are used. What is the principle behind that choice?

CARLA camera pose generation script?

Hello, your work is awesome. I would like to know about the scripts used to generate the CARLA dataset. I want to create my own CARLA dataset for NeRF training, but I am currently struggling with the details. Can you provide a script for generating the CARLA camera poses?

There are many different categories and labels in a picture.

This is a great project. I want to use GIRAFFE to generate data with different shapes or under specific conditions (such as darker scenes) from my own dataset. However, my custom dataset is very similar to the COCO dataset: there are many different categories and labels in a single image. I would also like to generate higher-resolution images, such as 640x640x3. Please suggest what I need to pay attention to.

A question about shape/appearance codes

When I tried to analyze how the shape/appearance codes are generated, I found no obvious part of the code that does this. In the forward pass of the NeRF network, there seems to be no appearance code, and the shape code seems to be just the input coordinates? This is strange. How should it work?

setting u and v for the camera poses

Hello! Thank you for sharing your great work; it is indeed wonderful!

1) May I ask a question regarding the code of GRAF? I know that we control the camera poses by setting the min and max values of u and v. But I would like to know how exactly the camera poses (rotation, elevation) are calculated from u and v in the following code.

u = azimuth / 360
v = 0.5* (1-cos(polar * pi/180))
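
In other words, written out as plain Python (this is just a restatement of the two lines above; the function name and the degree-based arguments are my own convention):

import math

def angles_to_uv(azimuth_deg, polar_deg):
    # azimuth in [0, 360) degrees maps linearly to u in [0, 1)
    u = azimuth_deg / 360.0
    # polar angle 0 -> v = 0, 90 -> v = 0.5, 180 -> v = 1
    v = 0.5 * (1.0 - math.cos(math.radians(polar_deg)))
    return u, v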

I would greatly appreciate if you could explain this to me!

2) Also, if I have a dataset that contains images of a human face from every angle (0-360 degrees), should I reset u and v accordingly for training? In that case, how should I set u and v?

Thank you so much!

How to generate images of certain categories?

Since GRAF is not conditioned on any label, do the latent shape and appearance vectors vary from image to image during training, rather than from category to category? If so, how can one generate a certain type of image (e.g. a sofa instead of a chair)?

Thanks!

Question about training time and gpu

Thank you for your interesting work.
I'm not familiar with this field (neural rendering), so I have no idea of the typical training time and required memory.
Could you let me know the training time and which GPU you used for training?

Real Patch and Generated Patch

Hi,
Are the real patch and the generated patch taken from the same viewing angle at the discriminator? If yes, which factor controls that? If not, how does the discriminator compare the two?
Thank you!

about chamfer distance

Hi,
Thank you for sharing your nice work.
I have a question about one of the evaluation metrics.
Is it possible to compute chamfer distance using this code?
Thanks in advance.

About input of network

I have a question. During evaluation, your network does not receive an input image. Does your method perhaps need to train a model for every single object? How do you rotate the image without knowledge of the original image?

Thank you.

How much is the required GPU memory at least?

Hi, thanks for releasing the codes. It is really interesting work. I am trying to train a model from scratch on CUB dataset. After finishing the preparation, I run the following command
CUDA_VISIBLE_DEVICES=0 python train.py configs/cub.yaml
using a single GPU. However, after this log
[cub_64 epoch 7, it 7990, t 1.099] g_loss = 1.0365, d_loss = 1.1345, reg=0.0230,
I got
RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 11.91 GiB total capacity; 10.89 GiB already allocated; 1.06 MiB free; 11.29 GiB reserved in total by PyTorch)

I am using a TITAN X (Pascal) with 12196MiB of available memory. I wonder what the minimum GPU memory required to run the training is. Or do you have any instructions for adjusting hyperparameters to decrease memory consumption?

Thanks!

Question about fine vs coarse raysampling

Hey,
My impression from reading the paper was that the coarse and then fine ray-sampling steps are performed on the same implicit function, in order to sample in areas of higher alpha density for that function. However, my understanding from reading the code is that the coarse generator and the fine generator have two separate implicit functions with different parameters. Could you clarify which of these is correct? As far as I understand, only the output of the fine generator is used to compute the loss.

Error with CUDA 11.1

I ran train.py with celebA, using the conda env with CUDA 11.1, and got this error:
Traceback (most recent call last):
  File "train.py", line 139, in <module>
    x_real = get_nsamples(train_loader, ntest)
  File "/home/ed/Documents/repos/graf/graf/utils.py", line 11, in get_nsamples
    x_next = next(iter(data_loader))
  File "/home/ed/.conda/envs/graf/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/home/ed/.conda/envs/graf/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
    return self._process_data(data)
  File "/home/ed/.conda/envs/graf/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
    data.reraise()
  File "/home/ed/.conda/envs/graf/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/ed/.conda/envs/graf/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/ed/.conda/envs/graf/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ed/.conda/envs/graf/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ed/Documents/repos/graf/graf/datasets.py", line 41, in __getitem__
    img = self.transform(img)
  File "/home/ed/.conda/envs/graf/lib/python3.8/site-packages/torchvision/transforms/transforms.py", line 67, in __call__
    img = t(img)
  File "/home/ed/.conda/envs/graf/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ed/.conda/envs/graf/lib/python3.8/site-packages/torchvision/transforms/transforms.py", line 615, in forward
    if torch.rand(1) < self.p:
  File "/home/ed/.conda/envs/graf/lib/python3.8/site-packages/torch/cuda/__init__.py", line 163, in _lazy_init
    raise RuntimeError(
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

How did you get the pose

Hi there. I read in your paper that you can "learn a 3D-aware generative model from unposed 2D images." But in Section 3.2.1 you also mention that "We sample the camera pose ξ = [R|t] from a pose distribution pξ." So I am wondering what the pose actually is. How do you get the pose from unposed images?

Rendering angles

Hello, I was training the model on my data with

  umax: 1.0 
  umin: 0 
  vmax: 0.45642212862617093 
  vmin: 0.32898992833716556

but then I am trying to render the images/videos from a different angle and it is not really working:

  umax: 0.04166666666666667
  umin: 0.
  vmax: 1.
  vmin: 0.  

I am essentially trying to get results with almost no rotation in the azimuth angle, but with a half rotation for the polar angle. Do I need to retrain the whole model for this because the difference in the angles is too big? I tried something similar with NeRF before and the rotation actually worked, except that I was getting just noise in the area outside of the object.
