intrinsic's Issues

Multi-view consistent decomposition

Hi, thanks for the great work.
In my task, I aim to decompose the images of a multi-view image set. As a first step, I fed the images to your model one by one. The results for two views are shown below:
[Image: WechatIMG18]
[Image: WechatIMG19]
The albedo of the roof is clearly inconsistent between the two views.
Does this work provide any multi-view consistency functionality? If not, could you please give me some advice on how to improve multi-view consistency when decomposing images into albedo and shading?
I sincerely look forward to your reply.

Weights loading issue

Thank you for this great work. I am trying to use both the paper weights and the rendered weights, but I get the following error:

  File "/home/user/Intrinsic/intrinsic/model_util.py", line 40, in load_models
    iid_model.load_state_dict(iid_state_dict)
  File "/home/user/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 2215, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for MidasNet_small:
    size mismatch for scratch.layer2_rn.weight: copying a param with shape torch.Size([128, 48, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 48, 3, 3]).
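
To narrow this down, a small sketch like the one below can list every parameter whose shape differs between the checkpoint and the model. It works on plain name-to-shape mappings. With PyTorch, such a mapping could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`, assuming both objects are ordinary state dicts. The function name `find_shape_mismatches` is hypothetical and not part of the intrinsic package.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for parameters that
    appear in both mappings but with different shapes."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Shapes copied from the error message above.
checkpoint_shapes = {"scratch.layer2_rn.weight": (128, 48, 3, 3)}
model_shapes = {"scratch.layer2_rn.weight": (64, 48, 3, 3)}

for name, (ckpt, model) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```

Here only the first dimension differs (128 vs 64), which for a convolution weight is the number of output channels. That usually points to the model being constructed with a different configuration than the one the checkpoint was saved from, rather than to a corrupted file.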

Temporal Consistency

I understand that the paper targets single images, so applying the model frame by frame to a video produces flashes and artefacts over the course of the video. I'll attach an example tomorrow. Have you tried any methods to make the model temporally consistent, or do you have any quick fixes or ideas on how one might approach this problem?
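
To make the question concrete, the kind of quick fix I have in mind is temporal smoothing of some per-frame quantity, for example the mean brightness of each predicted albedo frame. The sketch below is a generic exponential moving average, not something taken from the paper or the intrinsic package. The name `ema_smooth`, the default `alpha`, and the sample values are all hypothetical.

```python
def ema_smooth(values, alpha=0.8):
    """Exponential moving average over a per-frame sequence.

    alpha is the weight given to the previous smoothed value (hypothetical
    default); a higher alpha smooths more strongly but reacts more slowly.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    smoothed = []
    for value in values:
        if not smoothed:
            # The first frame has no history, so it is used as-is.
            smoothed.append(float(value))
        else:
            smoothed.append(alpha * smoothed[-1] + (1.0 - alpha) * float(value))
    return smoothed


# Hypothetical per-frame mean albedo brightness with a one-frame flash.
per_frame_mean = [0.50, 0.51, 0.90, 0.50, 0.49]
print(ema_smooth(per_frame_mean, alpha=0.8))
```

Each frame could then be rescaled so that its mean matches the smoothed value. This would only damp global flashes. It would not address spatially varying artefacts, and it adds lag when the scene really does change.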

RuntimeError in run_pipeline due to Dimension Mismatch in `permute`

I was trying to run the provided Colab notebook and ran into the following error.

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-6-9b7c6ff62d88> in <cell line: 2>()
      1 # run the image through the pipeline (use R0 resizing dicussed in the paper)
----> 2 result = run_pipeline(
      3     intrinsic_model,
      4     img,
      5     resize_conf=0.0,

/usr/local/lib/python3.10/dist-packages/intrinsic/pipeline.py in run_pipeline(models, img_arr, output_ordinal, resize_conf, base_size, maintain_size, linear, device, lstsq_p, inputs)
    121     # put all the outputs into a dictionary to return
    122     inv_shd = inv_shd.squeeze(0).detach().cpu().numpy()
--> 123     alb = alb.permute(1, 2, 0).detach().cpu().numpy()
    124 
    125     if maintain_size:

RuntimeError: permute(sparse_coo): number of dimensions in the tensor input does not match the length of the desired ordering of dimensions i.e. input.dim() = 4 is not equal to len(dims) = 3

It appears that the tensor alb has an extra dimension that is not accounted for: it is 4-dimensional, while permute(1, 2, 0) expects 3 dimensions. Applying .squeeze(0) before the permute operation might resolve this issue by removing the singleton dimension at position 0.
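
The reasoning can be checked at the shape level without loading the model. The sketch below mimics the dimension-count check behind the error and shows the effect of dropping a leading singleton dimension first. The helper names and the example shape `(1, 3, 384, 512)` are hypothetical. With a real tensor, the corresponding change would be `alb.squeeze(0).permute(1, 2, 0)`, assuming the extra dimension is a batch dimension of size 1.

```python
def permuted_shape(shape, order):
    """Shape after permute(*order); fails when the number of dimensions
    does not match the length of the ordering, as in the traceback."""
    if len(shape) != len(order):
        raise ValueError(
            f"input has {len(shape)} dims but the ordering has {len(order)}"
        )
    return tuple(shape[i] for i in order)


def squeezed_shape(shape, dim=0):
    """Shape after squeeze(dim) for a non-negative dim: the dimension is
    dropped only if its size is 1, otherwise the shape is unchanged."""
    if shape[dim] == 1:
        return tuple(shape[:dim]) + tuple(shape[dim + 1:])
    return tuple(shape)


alb_shape = (1, 3, 384, 512)  # hypothetical (batch, channels, height, width)

try:
    permuted_shape(alb_shape, (1, 2, 0))
except ValueError as err:
    print("without squeeze:", err)

# After squeezing, the ordering (1, 2, 0) turns (C, H, W) into (H, W, C).
print("with squeeze:", permuted_shape(squeezed_shape(alb_shape), (1, 2, 0)))
```

Note that squeezing only helps if the leading dimension really has size 1. For a batch of several images the shape is left unchanged and the permute would still fail.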
