
ExplainableAI.jl's Issues

Reduce allocations in LRP methods

  • Instead of having rules modify layer parameters, avoid allocations by implementing modified forward calls that can be diff'ed.
  • Pre-allocate buffers
    • for activations on the forward pass
    • for relevances on the backward pass
  • For analysis of multiple output neurons, run the forward pass only once

This should speed things up a lot! A sketch of the first point is given below.
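A minimal sketch of the idea, not the package implementation: the rule's parameter modification happens inside the call that Zygote differentiates, so no modified copy of the layer is allocated, and the relevances are written into a pre-allocated buffer. The hooks modify_weight and modify_bias are assumptions for illustration.

using Flux, Zygote

# Sketch only: parameters are modified inside the differentiated call instead
# of building a modified layer, and the result is written in-place into Rₖ.
function lrp_dense!(Rₖ, rule, layer::Dense, aₖ, Rₖ₊₁)
    modified_forward(a) = modify_weight(rule, layer.weight) * a .+ modify_bias(rule, layer.bias)
    z, back = Zygote.pullback(modified_forward, aₖ)
    s = Rₖ₊₁ ./ (z .+ eps(eltype(z)))   # stabilized division
    Rₖ .= aₖ .* only(back(s))           # relevance written into the pre-allocated buffer
    return Rₖ
end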

Add LRP support for nested Chains

This will be the first step towards #10 by allowing nested model structures.

This requires the following changes:

  • new internal representation of rules, e.g. via LRPRulesChain / LRPRulesParallel [Edit: now called ChainTuple and ParallelTuple] to support graphs
  • treat chains the same way as layers by adding lrp!(Rₖ, r::AbstractLRPRule, c::Chain, aₖ, Rₖ₊₁), which can be called recursively (see the sketch after this list). This will require a different approach to pre-allocating activations and relevances than the one currently used in the call to the analyzer.
  • Composite might require a refactoring
  • Update lrp/show.jl
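A rough sketch of the recursive dispatch. This is an illustration only: it dispatches on a ChainTuple of rules rather than a single rule, uses Flux.activations for the sub-chain's forward pass, and allocates relevance buffers per recursion for simplicity instead of pre-allocating them.

using Flux

function lrp!(Rₖ, rules::ChainTuple, chain::Chain, aₖ, Rₖ₊₁)
    as = Flux.activations(chain, aₖ)      # outputs of each sub-layer
    Rs = [similar(a) for a in as]         # relevance buffer per sub-layer output
    Rs[end] .= Rₖ₊₁
    for i in length(chain.layers):-1:2    # walk the sub-chain backwards
        lrp!(Rs[i-1], rules[i], chain.layers[i], as[i-1], Rs[i])
    end
    lrp!(Rₖ, rules[1], chain.layers[1], aₖ, Rs[1])   # relevance at the chain's input
    return Rₖ
end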

Add model canonization

Add a function canonize(model) which merges BatchNorm layers into preceding Dense and Conv layers that have linear activation functions.
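A minimal sketch of the fusion for a Dense layer followed by a BatchNorm, illustrating what canonize is meant to do (not the implementation): the BatchNorm statistics and affine parameters are folded into the weight matrix and bias, and the BatchNorm's activation moves to the fused layer.

using Flux

function fuse(d::Dense, bn::BatchNorm)
    scale = bn.γ ./ sqrt.(bn.σ² .+ bn.ϵ)   # per-output-channel scaling
    W = scale .* d.weight                  # scales each output row of the weight matrix
    b = scale .* (d.bias .- bn.μ) .+ bn.β
    return Dense(W, b, bn.λ)               # BatchNorm's activation becomes the fused layer's
end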

Test GPU support

GPU support is currently untested. In theory, GPU tests could be run in CI using JuliaGPU's Buildkite setup.

Locally, a first test of GPU support can be run by modifying the readme example to cast input to a GPU array.
It should be possible to run the following code in a fresh temp-environment:

using CUDA
using ExplainableAI
using Flux
using MLDatasets
using Downloads: download
using BSON: @load

model_url = "https://github.com/adrhill/ExplainableAI.jl/raw/master/docs/src/model.bson"
path = joinpath(@__DIR__, "model.bson")
!isfile(path) && download(model_url, path)
@load "model.bson" model

model = strip_softmax(model)
x, _ = MNIST.testdata(Float32, 10)
input = reshape(x, 28, 28, 1, :)

model = gpu(model)                  # the model must also be moved to the GPU
input_gpu = gpu(input)              # cast input to GPU array

analyzer = LRP(model)
expl = analyze(input_gpu, analyzer)

Remove ImageNet preprocessing code

This should be handled via external packages such as DataAugmentations.jl.

  • document preprocessing with external packages

Since this is a breaking change, it should be implemented before a 1.0 release.

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours; please be patient!

LRP rule coverage for Flux layers

This issue keeps track of which Flux layers from the Flux model reference have LRP implementations.


Basic layers

  • Dense
  • flatten

Convolution

  • Conv
  • DepthwiseConv
  • ConvTranspose
  • CrossCor

Pooling layers

  • AdaptiveMaxPool
  • MaxPool
  • GlobalMaxPool
  • AdaptiveMeanPool
  • MeanPool
  • GlobalMeanPool

General purpose

  • Maxout
  • SkipConnection
  • Chain #119
  • Parallel #10
  • Bilinear
  • Diagonal
  • Embedding

Normalisation & regularisation

  • normalise
  • BatchNorm #129
  • dropout
  • Dropout
  • AlphaDropout
  • LayerNorm
  • InstanceNorm
  • GroupNorm

Upsampling layers

  • Upsample
  • PixelShuffle

Recurrent layers

  • RNN
  • LSTM
  • GRU
  • Recur

Add model checks for LRP

  • Check the model for non-ReLU-like activations and unknown layers.
  • If checks fail, display a summary of why, along with references to the docs on how to fix these issues.
  • Make checks skippable through the LRP keyword argument skip_checks=true.

The goal should be to make ExplainabilityMethods transparent yet extensible. A rough sketch of such a check is given below.
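A rough sketch of how the check could look; the supported layer and activation sets below are illustrative assumptions, not the actual whitelist.

using Flux

const SUPPORTED_LAYERS      = (Dense, Conv, MaxPool, MeanPool)   # illustrative only
const RELU_LIKE_ACTIVATIONS = (identity, relu)                    # illustrative only

function check_model(model::Chain; skip_checks=false)
    issues = String[]
    for (i, layer) in enumerate(model.layers)
        if !any(T -> layer isa T, SUPPORTED_LAYERS)
            push!(issues, "Layer $i ($(typeof(layer))) has no registered LRP rule.")
        elseif hasproperty(layer, :σ) && layer.σ ∉ RELU_LIKE_ACTIVATIONS
            push!(issues, "Layer $i uses non-ReLU-like activation $(layer.σ).")
        end
    end
    if !isempty(issues) && !skip_checks
        error(join(issues, "\n"), "\nSee the documentation on how to resolve these issues.")
    end
    return issues
end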

Move LRP into separate package

This would remove the dependency on Flux and make the package lighter for users that don't require LRP.

Since this requires a breaking release, this is a milestone for a future 1.0 release.

Refactor results struct

Update field names.
The field layerwise_relevances of the Explanation struct is too specific to LRP.
An extras field of type Union{Nothing, Dict} would be more flexible.
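A sketch of what the refactored struct could look like; field names other than extras are assumptions.

struct Explanation{V,O,E<:Union{Nothing,Dict}}
    val::V              # attribution / relevance values
    output::O           # model output for the analyzed input
    analyzer::Symbol    # e.g. :LRP, :Gradient
    extras::E           # analyzer-specific data, e.g. layerwise relevances for LRP
end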

Add `ZPlusRule`

Could be implemented as the one-liner ZPlusRule() = AlphaBetaRule(1.0f0, 0.0f0),
but a large part of the computation could be skipped since β=0.
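A sketch of what a dedicated implementation could look like, using the hypothetical rule hooks from the modified-forward sketch further up: for ReLU networks with non-negative activations, β = 0 means only the positive parameter parts contribute, so AlphaBetaRule's second (β-weighted) backward pass can be skipped entirely.

struct ZPlusRule <: AbstractLRPRule end
modify_weight(::ZPlusRule, W) = max.(W, 0)   # keep only positive weights
modify_bias(::ZPlusRule, b)   = max.(b, 0)   # keep only positive biases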

Fix randomness in gradients

Zygote's gradient appears to be non-deterministic on Metalhead's VGG19:

julia> a = gradient((in) -> model(in)[1], imgp)[1];

julia> b = gradient((in) -> model(in)[1], imgp)[1];

julia> isapprox(a, b; atol=1e-3)
false

julia> isapprox(a, b; atol=1e-2)
true

Check whether this is due to Dropout layers or Zygote.
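One way to narrow this down is to put the model into test mode, which makes Dropout layers deterministic, and re-run the comparison (a quick check using imgp from above):

using Flux
Flux.testmode!(model)   # disables Dropout (and freezes BatchNorm statistics)

a = gradient(in -> model(in)[1], imgp)[1];
b = gradient(in -> model(in)[1], imgp)[1];
isapprox(a, b)          # if this now holds, the randomness came from Dropout, not Zygote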

Add default composites

  • Add presets similar to those in Zennit (see the sketch after this list), e.g.:
    • EpsilonGammaBox
    • EpsilonPlus
    • EpsilonAlpha2Beta1
    • EpsilonPlusFlat
    • EpsilonAlpha2Beta1Flat
  • Update README with examples from presets
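A rough sketch of what the EpsilonPlus preset could expand to, assuming a Composite constructor with a type-based rule mapping; the constructor names here are assumptions, not the settled API.

EpsilonPlus() = Composite(
    GlobalTypeMap(
        Conv  => ZPlusRule(),    # convolutional layers: z⁺ rule
        Dense => EpsilonRule(),  # fully connected layers: ε rule
    ),
)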

Regression tests of all methods on VGG19

Metalhead disabled pretrained weights in 0.6.0 due to model inaccuracies.
These can technically still be loaded while they are being fixed:

model = VGG19()
Flux.loadparams!(model.layers, weights("vgg19"))

However, the VGG19 weights are a 548 MB download every time CI is run. It might therefore be more reasonable to use a smaller model. Currently, MetalheadWeights contains (in ascending order of size):

  • SqueezeNet (5 MB) -> requires Parallel for "fire" modules
  • GoogLeNet (27 MB) -> requires Parallel
  • Densenet121 (31 MB) -> requires SkipConnection
  • ResNet-50 (98 MB) -> requires Parallel, skip_identity
  • VGG-19 (548 MB)

An easy workaround would be to run the methods on randomly initialized parameters with a fixed seed. The explanations w.r.t. this model should still stay constant.
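A sketch of that workaround, using the Metalhead constructor shown above:

using Random, Metalhead

Random.seed!(123)    # fixed seed -> deterministic random initialization
model = VGG19()      # untrained, no 548 MB weight download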

Add docs

Add doc pages for:

  • Basic example on VGG model
  • Combining LRP rules
  • Custom LRP rules

Also:

  • Add images to readme

Update documentation for `v0.6.0` release

Currently, the following things can be improved or are missing documentation:

  • input augmentations: NoiseAugmentation, InterpolationAugmentation
  • usage of LayerMap and show_layer_indices introduced in #131
  • LRP keyword flatten and performance benefits
  • LRP model canonization
  • Update section "How it works internally" for #119
  • Update "Model checks for humans" for #119
  • Use DocumenterCitations.jl
  • #64

Support input batches

This should return a Vector{Explanation}, which should also be supported by heatmap.
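A sketch of the intended usage, building on the readme example above; the return type is as proposed in this issue.

batch = reshape(x, 28, 28, 1, :)    # WHCN array with multiple samples
expls = analyze(batch, analyzer)    # proposed: returns a Vector{Explanation}
heatmaps = heatmap.(expls)          # heatmap should then handle each Explanation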
