
Comments (4)

CloseChoice commented on May 18, 2024

Thanks for the report and your effort to investigate this. Your description is accurate; the cause is the default in the tabular masker.

This problem was already discussed in an earlier issue, which also includes a workaround: #3174.

We should probably emit at least a warning if max_samples < len(X). What do you think, @connortann? This issue keeps coming up and confuses users.


connortann commented on May 18, 2024

I agree with your analysis; this seems to be a consequence of sampling. I'll remove the bug label, as I think this is intended behaviour.

> We probably should throw at least a warning if max_samples < len(X)

I'm not sure if I agree. To me, warnings are generally used to indicate undesirable situations in which the user should probably update their code to fix the warning. In this case I think for the majority of users the subsampling is expected and desirable behaviour. Many parts of shap are sampling-based and only offer approximate results.

Would log.info() be more appropriate?
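A minimal sketch of what such an informational message could look like, using only the standard library. The function `subsample_background`, the logger name `shap_sketch`, and the default of 100 are hypothetical choices for illustration, not shap's actual implementation.

```python
import logging
import random

# Hypothetical logger name, chosen for this sketch only.
log = logging.getLogger("shap_sketch")


def subsample_background(rows, max_samples=100, seed=0):
    """Return at most max_samples rows, logging at INFO level when rows are dropped."""
    rows = list(rows)
    if len(rows) <= max_samples:
        # Nothing is dropped, so there is nothing to report.
        return rows
    log.info(
        "Background data has %d rows; subsampling to max_samples=%d. "
        "Resulting values are approximate.",
        len(rows),
        max_samples,
    )
    # Seeded sampling without replacement keeps the sketch reproducible.
    return random.Random(seed).sample(rows, max_samples)
```

With the default logging configuration the message stays hidden, so users who expect subsampling are not disturbed, while anyone investigating a discrepancy can enable INFO output to see it.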


CloseChoice commented on May 18, 2024

logging.info is fine for me as well. I would also be fine with a print; the point is to make sure users do not have to spend a couple of hours investigating the reason for the inconsistency between the values and the theory.


connortann commented on May 18, 2024

I would much prefer logging over print statements, as prints are much harder to configure and disable. I think adding a print would risk annoying a large majority of shap users.
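To illustrate the configurability argument, here is a small standard-library sketch showing that the same INFO message can be shown or suppressed purely by changing the logger level, which a print statement does not allow. The logger name `shap_sketch` and the helper `emit_message` are assumptions made for this example, not names used by shap.

```python
import io
import logging

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("shap_sketch")
logger.propagate = False  # keep the demo isolated from the root logger


def emit_message(level):
    """Emit one INFO message with the logger set to `level`; return what was written."""
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    logger.addHandler(handler)
    logger.setLevel(level)
    try:
        logger.info("subsampling background data")
    finally:
        logger.removeHandler(handler)
    return buffer.getvalue()


shown = emit_message(logging.INFO)      # message is written
hidden = emit_message(logging.WARNING)  # message is filtered out
```

Here `shown` contains the message and `hidden` is empty, so a user can opt in or out with a single `setLevel` call.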

I've renamed the issue title to reflect the plan.

