Comments (2)
Thanks for your effort on this! I think this is a good idea in principle.
My initial thought was that we could use nbsphinx, which supports executing notebooks out-of-the-box: notebooks are executed as part of the build if the outputs are empty. However, this has some problems:
1. Unintentional changes to docs might creep in without human review.
2. Compute time: training ML models, computing shap values.
3. Flaky builds. The build would fail as soon as any 3rd-party lib used in the docs has a breaking change, even if the main shap test suite passes.
So your suggestion of checking the notebooks with a CI job seems like a good shout. That mitigates (1), as we wouldn't be overwriting the notebooks automatically.
To mitigate (2), an allowlist seems sensible. As a benchmark, we could aim to keep this new CI job running faster than the pytest suite on CI, so it's not a bottleneck.
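The allowlist idea could be sketched roughly like this. A minimal Python runner (all file names, paths, and the docs layout here are assumptions for illustration, not existing shap conventions) that executes only the allowlisted notebooks via nbconvert, streams the executed output to stdout rather than writing it back (so the checked-in notebooks are never overwritten), and fails the job if any notebook errors:

```python
"""Sketch of an allowlist-driven notebook CI check (hypothetical layout)."""
import subprocess
import sys
from pathlib import Path


def select_notebooks(all_notebooks, allowlist):
    """Return only the notebooks whose file name appears in the allowlist."""
    allowed = set(allowlist)
    return [nb for nb in all_notebooks if nb.name in allowed]


def run_notebook(path):
    """Execute one notebook; discard the executed copy instead of saving it.

    --stdout keeps nbconvert from writing outputs back into the repo, so the
    job only checks that execution succeeds. Returns the exit code.
    """
    return subprocess.run(
        [sys.executable, "-m", "jupyter", "nbconvert",
         "--to", "notebook", "--execute", "--stdout", str(path)],
        stdout=subprocess.DEVNULL,
        check=False,
    ).returncode


# Guarded so the sketch is harmless to import or run outside a real checkout;
# docs/notebook_allowlist.txt (one notebook file name per line) is assumed.
if __name__ == "__main__" and Path("docs/notebook_allowlist.txt").exists():
    allowlist = Path("docs/notebook_allowlist.txt").read_text().split()
    notebooks = sorted(Path("docs").glob("**/*.ipynb"))
    failures = [nb for nb in select_notebooks(notebooks, allowlist)
                if run_notebook(nb) != 0]
    sys.exit(1 if failures else 0)
```

The allowlist file doubles as the knob for keeping the job under the pytest-suite runtime budget: adding a notebook to CI is a one-line, reviewable change.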
So that leaves flaky builds (3). It's probably a good thing that the main pytest jobs use the very latest versions of dependencies, but the downside is flakiness. It's not trivial to pin dependencies with setuptools in a manner that works across multiple Python versions and OSs. We could perhaps try to pin the dependencies in the notebooks job, whilst leaving the main pytest dependencies open? Open to ideas on this one.
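One lightweight way to pin only the notebooks job, without touching setuptools metadata at all, would be a pip constraints file that only that job applies. This is a sketch; the file name and the example pins are hypothetical, and nothing like this exists in the repo yet:

```shell
# docs/notebook_requirements.txt (hypothetical) would hold exact pins, e.g.:
#   numpy==1.26.4
#   scikit-learn==1.4.0
#
# The notebooks CI job installs shap as usual, but passes the pins as a
# constraints file. pip's -c flag only constrains versions during resolution
# and installs nothing by itself, so the package's own dependency metadata
# stays open and the main pytest jobs are unaffected:
pip install -e . -c docs/notebook_requirements.txt
```

Refreshing the pins then becomes an ordinary PR against one file, rather than a change to the package metadata that every job shares.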
Thanks for your thoughtful feedback.
Regarding (3): if things break, that is also useful information for us. I would suggest giving it a shot without pinning versions, and if we run into problems with failing pipelines we have the option to:
- merge anyway, as long as only the notebooks job failed. This is a temporary option for when a fix needs to be deployed urgently.
- deactivate the pipeline for some time and pin the versions.
I see where you are coming from, and it is certainly a valid concern, but I would try to keep the work burden on these things minimal until we know how bad the problem actually is. Once we have more data on this, I'll be the first to get my hands dirty and implement a solution.
from shap.