
Comments (3)

ChaoningZhang commented on July 25, 2024

Thanks for the excellent work.

Whole-image segmentation is much slower than FastSAM. Is this because of different postprocessing? Thanks.

Thanks for your interest in our work. Note that MobileSAM makes the image encoder lightweight without changing the decoder (roughly 8ms on the encoder and 4ms on the decoder). We mainly target anything mode (one image encoder pass and one decoder pass) rather than everything mode (one image encoder pass and 32x32 decoder passes); see the paper for the difference in definitions. Anything mode is the foundation task, while everything mode is just a downstream task, as indicated in the original SAM paper.

"Segmentation for whole image" seems to suggest that you are using everything mode. For everything mode, even though our encoder is much faster than that of the original SAM (roughly 8ms vs 450ms), it cannot save much time for the whole pipeline, since most of the time is spent on the 32x32 decoder passes. One way to mitigate this is to use a smaller grid than 32x32 (such as 8x8) so that the decoder consumes less time, since many redundant masks are generated with a 32x32 grid. I hope this addresses your issue; otherwise, please kindly let us know. We are also currently trying to make the decoder more lightweight by distilling it into a smaller one, as we did for the image encoder. Stay tuned for our progress.

FastSAM deviates from the promptable segmentation task that the original SAM solves by removing the prompt-guided mask decoder and directly generating all masks regardless of the prompts. Since the original SAM and our MobileSAM are not optimally designed for saving time in everything mode, they can take longer. As I said before, you can reduce the time significantly by setting the grid to 8x8 instead of 32x32, which will still give you reasonable results.
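To make the arithmetic in the reply concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the figures quoted above (about 8ms for the MobileSAM encoder, 450ms for the original SAM encoder, 4ms per decoder pass) and assumes the decoder cost scales linearly with the number of grid points. The function names are hypothetical and are not part of the MobileSAM code; real timings depend on hardware and on how prompts are batched.

```python
# Rough cost model using the approximate timings quoted in the reply.
# These are illustrative assumptions, not measured benchmarks.
ENCODER_MS = {"MobileSAM": 8.0, "original SAM": 450.0}
DECODER_MS_PER_PROMPT = 4.0  # decoder is the same for both models


def anything_mode_ms(encoder_ms: float,
                     decoder_ms: float = DECODER_MS_PER_PROMPT) -> float:
    """One encoder pass plus one decoder pass (a single prompt)."""
    return encoder_ms + decoder_ms


def everything_mode_ms(encoder_ms: float, grid_side: int,
                       decoder_ms: float = DECODER_MS_PER_PROMPT) -> float:
    """One encoder pass plus one decoder pass per point of a square grid."""
    return encoder_ms + grid_side * grid_side * decoder_ms


if __name__ == "__main__":
    for name, enc in ENCODER_MS.items():
        print(f"{name}: anything mode ~ {anything_mode_ms(enc):.0f} ms")
        for side in (32, 8):
            total = everything_mode_ms(enc, side)
            print(f"{name}: everything mode, {side}x{side} grid ~ {total:.0f} ms")
```

Under this model the decoder term dominates at 32x32 (1024 decoder passes), so the faster encoder barely changes the total, while an 8x8 grid (64 passes) cuts the decoder term by a factor of 16. In the original SAM codebase the grid size is the `points_per_side` argument of `SamAutomaticMaskGenerator`; I am assuming the MobileSAM package exposes the same argument.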


ChaoningZhang commented on July 25, 2024

If you have no further issues, I will close this for now.


fanlinfuture commented on July 25, 2024

Thanks for the reply. That sounds reasonable.

