
muggled_dpt's People

Contributors

heyoeyo

muggled_dpt's Issues

Any consideration to support the Metric Depth models of DepthAnything?

Hi,
I'd like to first thank you for the work you did. I was struggling to get live-camera inference running at a sufficient FPS, and your work made it much faster.
I successfully tried out the VITSmall model, but when I attempt to use the Indoor_Metric_Depth model (from here) I get this error: "NotImplementedError: Bad model type: unknown, no support for this yet!"

Is there any work being done to add support for the metric depth models?
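
(For anyone hitting the same error: one way to investigate is to inspect the checkpoint's state-dict keys, since the "unknown" type means the loader did not recognize the weights. This is only a debugging sketch; the file name and the "model" wrapper key below are assumptions for illustration, not muggled_dpt's actual detection logic.)

import torch

# Hypothetical sketch: peek at the checkpoint keys to see why the model-type
# check falls through to "unknown". File name and wrapper key are assumptions.
ckpt = torch.load("indoor_metric_depth.pth", map_location="cpu")
state_dict = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt

# Metric-depth checkpoints may wrap the DPT weights in an extra prefix or use
# a different output head, which a key-based type check would not recognize.
for key in list(state_dict.keys())[:10]:
    print(key)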

Size of depth .npy file

Hi,

Thanks for your amazing work.

I have one question. I am running the fusion scaling demo on a 640x480 image, but the saved depth .npy file has a size of 504x364.

Is this because the DPT image_preprocessor resizes the input to fit the 14px patch size?

If so, how do we get a depth map that matches the original image size?

Thanks for your help in advance.
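
(A common workaround, assuming the .npy file holds the raw 2D prediction, is to resize the depth map back to the original resolution yourself; the file name and array shape below are placeholders:)

import cv2
import numpy as np

# Sketch: resize a saved depth prediction back to the original camera
# resolution. The model output is smaller because the preprocessor fits the
# input to multiples of the 14px patch size.
depth_small = np.load("depth_prediction.npy")   # e.g. shape (364, 504)
orig_w, orig_h = 640, 480                       # original image size

# Bilinear interpolation is usually fine for dense depth maps
depth_full = cv2.resize(depth_small, (orig_w, orig_h), interpolation=cv2.INTER_LINEAR)
print(depth_full.shape)                         # (480, 640)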

Confusion about network architecture for depth anything v2?

Thank you for your awesome project!
But I am confused by the comment in the depth_anything_v2 architecture code that says features are extracted from the last 4 blocks:

Note that unlike the original DPT models, this model does not output intermediate tokens
from the transformers, instead it outputs the last 4 (consecutive) blocks!

However, in the code implementation, the features are taken from evenly spaced stages through the middle of the transformer, not just the last blocks (Depth Anything v1 is like this). Is it the comment that needs to be fixed?

num_stages = 4
layers_per_stage = int(round(num_blocks / num_stages))
stages_list = []
for _ in range(num_stages):
    one_stage = TransformerStage(layers_per_stage, features_per_token, num_heads)
    stages_list.append(one_stage)
self.stages = nn.ModuleList(stages_list)

Also, is this feature extraction method better or worse than outputting intermediate tokens from the transformer, as the original DPT models do?
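
(To make the two schemes concrete, here is a simplified sketch, not the project's actual code, contrasting "intermediate token" extraction with the staged extraction shown in the snippet above; the class names, parameters, and block indices are made up for illustration, and `blocks` is assumed to be a list of transformer block modules:)

import torch.nn as nn

# Original-DPT style: run every block, keep the hidden states at chosen indices.
class InterleavedExtractor(nn.Module):
    def __init__(self, blocks, take_indices=(2, 5, 8, 11)):
        super().__init__()
        self.blocks = nn.ModuleList(blocks)
        self.take_indices = set(take_indices)

    def forward(self, tokens):
        features = []
        for idx, block in enumerate(self.blocks):
            tokens = block(tokens)
            if idx in self.take_indices:
                features.append(tokens)
        return features  # tokens from the selected intermediate blocks

# Staged style (as in the snippet above): split the blocks into consecutive
# stages and return the output at the end of each stage.
class StagedExtractor(nn.Module):
    def __init__(self, blocks, num_stages=4):
        super().__init__()
        per_stage = len(blocks) // num_stages
        self.stages = nn.ModuleList(
            [nn.Sequential(*blocks[i * per_stage:(i + 1) * per_stage])
             for i in range(num_stages)]
        )

    def forward(self, tokens):
        features = []
        for stage in self.stages:
            tokens = stage(tokens)
            features.append(tokens)
        return features  # tokens at the end of each stage

Note that for a 12-block model split into 4 stages of 3 blocks, the stage boundaries fall after blocks 2, 5, 8, and 11 (0-indexed), so the two schemes can end up tapping the same layers.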
