
fitk's People

Contributors

jcgoran

fitk's Issues

Add numpy type hints

As of version 1.21, numpy has support for type hints (see numpy.typing docs, notably numpy.typing.NDArray), so it would perhaps be useful to add type hints in the code and bump the minimum version of numpy to that.
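A minimal sketch of what such annotations might look like, assuming numpy >= 1.21. The function `marginalized_errors` is hypothetical and only serves to show `NDArray` in a signature; it is not part of fitk.

```python
import numpy as np
from numpy.typing import NDArray


def marginalized_errors(values: NDArray[np.float64]) -> NDArray[np.float64]:
    """Return the square roots of the diagonal of the inverse of `values`."""
    return np.sqrt(np.diag(np.linalg.inv(values)))


# a diagonal 2 x 2 matrix, so the inverse is trivial to check by hand
errors = marginalized_errors(np.diag([1.0, 4.0]))
```
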

Fix shading of contours in `FisherFigure`s

When calling the plot method, the contours are plotted in the order of increasing uncertainty. Unfortunately, this means that if one plots multiple contour levels with opacities alpha1 and alpha2, the first contour will not actually have opacity alpha1, but rather a combination of alpha1 and alpha2.

The solution

  • plot the contours in reverse order
  • after plotting each contour, plot the previous (i.e. smaller) contour, with a color matching the color of the background (assuming it's not transparent) with 100% opacity

The above works because the second step effectively makes a contour with a "hole" in it, so the previous (i.e. smaller) contour retains the correct opacity passed to it.
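The opacity problem can be checked with a bit of arithmetic. The sketch below assumes standard alpha blending of overlapping layers; `stacked_alpha` is a hypothetical helper, not a fitk or matplotlib function.

```python
def stacked_alpha(*alphas: float) -> float:
    """Effective opacity of overlapping layers under standard alpha blending."""
    transparency = 1.0
    for alpha in alphas:
        # each layer lets through a fraction (1 - alpha) of what is behind it
        transparency *= 1.0 - alpha
    return 1.0 - transparency


alpha1, alpha2 = 0.6, 0.3
# where both contours overlap, the region is more opaque than alpha1 alone
inner = stacked_alpha(alpha1, alpha2)
```

With these values the overlapping region ends up with opacity 0.72 rather than the requested 0.6, which is why the smaller contour needs a "hole" cut out underneath it.
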

Add separate contouring for 1D/2D contour levels

Add kwargs like contour_levels_1d and contour_levels_2d to the class which generates 2D contour plots (currently FisherFigure2D), so they can be specified separately.
If specified, they should override the current contour_levels, but if not, those should be used (for instance, setting both contour_levels_1d and contour_levels, it should plot the 1D contours using values from contour_levels_1d, and the 2D contours using values from contour_levels).
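The fallback rule described above could be sketched as follows. The helper `resolve_contour_levels` is hypothetical; only the three keyword names come from the proposal.

```python
def resolve_contour_levels(
    contour_levels,
    contour_levels_1d=None,
    contour_levels_2d=None,
):
    """Return the (1D, 2D) contour levels, falling back to `contour_levels`."""
    levels_1d = contour_levels if contour_levels_1d is None else contour_levels_1d
    levels_2d = contour_levels if contour_levels_2d is None else contour_levels_2d
    return tuple(levels_1d), tuple(levels_2d)


# only the 1D levels are overridden; the 2D ones use the common setting
levels_1d, levels_2d = resolve_contour_levels((1, 2), contour_levels_1d=(1,))
```
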

Add GUI

Use some cross-platform graphical toolkit to make a GUI version of FITK (optional, should be installable via pip install fitk[gui] or something).

Order of parameters matters when using a `FisherBarFigure`

The order of the parameters should not matter when plotting the matrices using FisherBarFigure, and the parameters in the other matrices should all be automatically sorted according to the order of parameters in the first one.

Example

from fitk import FisherBarFigure, FisherMatrix

fm1 = FisherMatrix([[1, 0], [0, 3]], names=["a", "b"])
fm2 = FisherMatrix([[3.1, 0], [0, 0.9]], names=["b", "a"])
ff = FisherBarFigure()
ff.plot_relative_constraints([fm1, fm2], kind="bar")

Actual output

[image: wrong]

Expected output

[image: right]
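One way to align the second matrix with the first is to permute its rows and columns by name before plotting. This is a plain numpy sketch; the function `reorder` is hypothetical and works on raw arrays rather than on FisherMatrix objects.

```python
import numpy as np


def reorder(values, names, reference_names):
    """Permute rows and columns of `values` so that they follow `reference_names`."""
    index = [names.index(name) for name in reference_names]
    return values[np.ix_(index, index)]


# the values of fm2 from the example, given in the order ["b", "a"]
fm2_values = np.array([[3.1, 0.0], [0.0, 0.9]])
aligned = reorder(fm2_values, ["b", "a"], ["a", "b"])
```
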

Only upload coverage from one workflow

The coverage report should only be uploaded if the underlying OS is ubuntu-latest, and the Python version is 3.9 (arbitrary, but still supported until 2025).

Add ability to mark fiducial on the 1D/2D contour plots

Would be nice to have a kwarg like mark_fiducials to mark the fiducials on the 1D/2D contour plots.
It should have a default of None, and if one specifies True, it should use some default plotting style, while if one specifies a dictionary with kwargs compatible with vlines or hlines, it should use that style instead.

A possible ambiguity: if one specifies mark_fiducials={}, do we plot it or not? I think we should, since the user intentionally specified it. In other words, the check should be if mark_fiducials is None, so that nothing is plotted only when mark_fiducials=None (as opposed to if not mark_fiducials, which would also skip the empty dictionary, since that is falsy as well).
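A small sketch of the proposed check. The helper `fiducial_style` and the contents of `DEFAULT_STYLE` are hypothetical; treating False the same as None is an extra assumption not discussed above.

```python
# hypothetical default style; the keys are valid kwargs of vlines/hlines
DEFAULT_STYLE = {"linestyles": "--", "colors": "black"}


def fiducial_style(mark_fiducials=None):
    """Return kwargs for vlines/hlines, or None if nothing should be drawn."""
    # assumption: an explicit False behaves like the default None
    if mark_fiducials is None or mark_fiducials is False:
        return None
    if mark_fiducials is True:
        return dict(DEFAULT_STYLE)
    # any dictionary, including the empty one, means "plot with this style"
    return dict(mark_fiducials)
```

An empty dictionary returns `{}` rather than None, so the fiducials are still drawn, just with the plotting defaults of matplotlib.
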

Use a proper JSON class

Instead of calling jsonify to convert numpy arrays to JSON strings, make a proper subclass as described here.
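A minimal sketch of such a subclass, assuming the standard library json.JSONEncoder is the intended base class. The name `NumpyEncoder` is hypothetical.

```python
import json

import numpy as np


class NumpyEncoder(json.JSONEncoder):
    """JSON encoder that converts numpy arrays and scalars to built-in types."""

    def default(self, o):
        if isinstance(o, np.ndarray):
            return o.tolist()
        if isinstance(o, np.generic):
            # numpy scalars such as np.float64 or np.int64
            return o.item()
        # let the base class raise TypeError for anything else
        return super().default(o)


encoded = json.dumps({"values": np.eye(2)}, cls=NumpyEncoder)
```
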

Change z-order of marked fiducials

Unless specified otherwise, the z-order of the marked fiducials should be at the very top of the graph, to make it easily noticeable.

Reverse order of elements in `FisherBarFigure`

Problem

When calling plot_absolute_constraints or plot_relative_constraints with kind='barh', the elements of the Fisher matrices are plotted from bottom to top. Furthermore, when calling legend on it, the labels in the legend are displayed top-to-bottom, while the plots themselves are displayed bottom-to-top.

Example

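import matplotlib.pyplot as plt
import numpy as np
from fitk import FisherBarFigure, FisherMatrix

# get_default_rcparams is a fitk helper; its import is not shown in the original report

ff = FisherBarFigure()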
fm1 = FisherMatrix(np.diag([1, 2, 3]), names=['a', 'b', 'c'])
fm2 = FisherMatrix(np.diag([4, 5, 6]), names=['a', 'b', 'c'])
ff.plot_relative_constraints(
    [
        fm1,
        fm2,
    ],
    kind='barh',
    labels=['Matrix 1', 'Matrix 2'],
    scale='log',
    percent=True,
)
ff.axes.set_xlim(1, ff.axes.get_xlim()[-1])
with plt.rc_context(get_default_rcparams()):
    ff.figure.legend(bbox_to_anchor=(0.5, 0.87), loc='lower center')


Fix `KeyError` issue when plotting

For some reason, when I run test_plot_2d_figure_error, the test sometimes fails with the following strange error:

>       fp.plot(
            euclid_opt.marginalize_over("Omegam", "Omegab", invert=True),
        )

tests/test_graphics.py:783:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
fitk/graphics.py:2125: in plot
    ax[i, i].remove()
.venv/lib/python3.11/site-packages/matplotlib/artist.py:212: in remove
    self._remove_method(self)
.venv/lib/python3.11/site-packages/matplotlib/figure.py:922: in delaxes
    self._axstack.remove(ax)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <matplotlib.figure._AxesStack object at 0x7fbdf87b4390>, a = <AxesSubplot: >

    def remove(self, a):
        """Remove the axes from the stack."""
>       self._axes.pop(a)
E       KeyError: <AxesSubplot: >

.venv/lib/python3.11/site-packages/matplotlib/figure.py:75: KeyError

Note that I am not sure what the exact source of the error is, as it appears randomly when I run the tests.
I suppose one easy fix is to do something like:

try:
    ax[i, i].remove()
except KeyError:
    pass

though I'm not entirely sure why this happens, as I explicitly check for this beforehand, so this module should never raise any exceptions.

Derivatives: task list

The below are all leftover tasks from #5 that were not mandatory for that PR to get merged.

  • implement third order mixed derivatives and test them
  • (nice to have) figure out how to return coordinate metadata (maybe signal should return the tuple (coords, values)? Or a separate (data) class with properties coords and values). Note that the coordinates are not needed anywhere in the code, they would just be there as a convenience for the user, in case they want to construct the derivatives themselves, or if they want to easily plot the derivatives without re-computing the coordinates.
  • (nice to have) implement adaptive step size so the user doesn't have to guess whether the derivative computed has actually converged (and hasn't experienced underflow)
  • implement interfaces to various third-party software:
  • (nice to have) make a metaclass that can be used to streamline the creation of interfaces to third-party software
  • (nice to have) given an interface to some third-party software, make it possible to specify which parameters can be varied, so the user does not have to guess ("do I use omega_m? Omega_m? Omega_matter?")
  • (nice to have) implement computation of derivatives using the complex step approximation
  • #42
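For reference, the complex step approximation mentioned in the list estimates the first derivative as Im f(x + ih) / h. The sketch below is a generic illustration, not the planned fitk implementation; it only works for functions that are analytic and accept complex input.

```python
import numpy as np


def complex_step_derivative(function, x, step=1e-20):
    """Estimate the first derivative of `function` at `x` via the complex step."""
    # no subtraction of nearly equal numbers occurs, so the step can be tiny
    return np.imag(function(x + 1j * step)) / step


value = complex_step_derivative(lambda x: np.exp(x) * np.sin(x), 1.0)
exact = np.exp(1.0) * (np.sin(1.0) + np.cos(1.0))
```

Because there is no cancellation error, the user does not have to tune the step size the way they would for finite differences.
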

Comments/metadata lost when reading files

Currently, there's no way to extract comments/metadata from the saved JSON file; unless the user explicitly loads the file via the json package, the metadata cannot be accessed by the FisherMatrix class.
On the other hand, if we put a metadata parameter in the constructor, it's bound to get lost in any arithmetic operations, not to mention that it breaks the single responsibility principle.

Add higher order derivatives

The DALI method (as described here) provides a straightforward generalization to Fisher matrices, and only requires minor tweaks to make it work:

  • instead of having values in the constructor, have *values, i.e. let the user pass n x n, n x n x n, etc. tensors (or just supply the keywords matrix, flexion, quarxion to make it unambiguous)
  • all math operators (with the exception of matrix multiplication) just act on all elements
  • the Jacobian transformation requires a bit of care, but is probably straightforward

Now, what isn't super straightforward is the computation of the (un)marginalized constraints; since we are dealing with a non-Gaussian distribution, in particular (assuming zero mean):

$$ \mathcal{L}(\boldsymbol{\theta}) = \frac{1}{N} \exp\left(-\frac{1}{2} \boldsymbol{\theta}^T \mathsf{F} \boldsymbol{\theta} - \text{H.O.}\right) $$

we have two problems:

  • we need to figure out the normalizing constant $N$; this can probably be accomplished by using an n-dimensional integration (for instance, scipy.integrate.nquad seems to be adequate for the job of up to, say, 7 parameters, otherwise some MC-based thing is preferred; this paper describes a nice, well-maintained alternative)
  • we need to get the 2D contours somehow, since we can't just invert a Fisher matrix and call it a day. One idea is to integrate over n-2 dimensions, then define iso-probability contours (once we have the normalization, of course), and then find points in the x-y plane which correspond to those contours. Scipy seems to have some root-finding routines for multidimensional problems, but it's not immediately clear how to code everything up.
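As a sanity check of the first point, the sketch below computes the normalization with scipy.integrate.nquad for the purely Gaussian case (no higher-order terms), where the analytic answer is known. The 2 x 2 matrix and the integration box are arbitrary choices for illustration.

```python
import numpy as np
from scipy.integrate import nquad

# an arbitrary positive-definite 2 x 2 Fisher matrix
fisher = np.array([[2.0, 0.5], [0.5, 1.0]])


def unnormalized_likelihood(x, y):
    """Gaussian part of the likelihood, without the 1/N prefactor."""
    theta = np.array([x, y])
    return np.exp(-0.5 * theta @ fisher @ theta)


# integrate over a box wide enough that the tails are negligible
normalization, error_estimate = nquad(
    unnormalized_likelihood, [[-10, 10], [-10, 10]]
)

# for a Gaussian, N = (2 pi)^(n / 2) / sqrt(det F), with n = 2 here
analytic = 2 * np.pi / np.sqrt(np.linalg.det(fisher))
```

With higher-order terms included in the exponent, the same call would apply, only without an analytic value to compare against.
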

Refactor the plotting interface

The plotting interface is still somewhat incomplete, and is missing a bunch of features (notably, the legend). Furthermore, the plot_triangle function does a bit too much for my taste; ideally, we would have an interface similar to matplotlib:

from fitk import FisherMatrix, FisherPlotter

# compute the Fisher matrices
fm1 = FisherMatrix(...)
fm2 = FisherMatrix(...)

# plot them
plotter = FisherPlotter(...) # the constructor should ideally take some arguments, but for what?
plotter.plot_triangle(fm1, ls='--', c='r') # the keyword arguments should ideally be identical to those that matplotlib uses
plotter.plot_triangle(fm2) # the color should use a cycler, so this would plot the first one in the cycler

There are open questions however (incomplete list):

  • what should we do when one of the Fisher matrices does not have the same parameters as the initially plotted one?
  • should plot_triangle return a figure, or just modify the existing one (or both)?
  • how is saving handled?

UPDATE: the above seems to be adequate, although there are some missing features:

  • implement custom shading contours (there is already plot_shading_1d, but it's somewhat half-finished)
  • implement the ability to mark fiducials somehow (on 1D plots, probably best to use vlines, on 2D plots, maybe markers would be better)

Add methods for retrieving the fiducial and LaTeX name from the `FisherMatrix`

Add two methods, fiducial(name: str) -> float, and latex_name(name: str) -> str, which retrieve the fiducial value and the LaTeX name, respectively, of the parameter called name.
While the __getitem__ could in principle be used, it is already used to retrieve the actual values of the Fisher matrix, so it's best to have separate, explicit methods for retrieving the fiducial and the LaTeX name.
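A rough sketch of the two lookups. `NamedParameters` is a hypothetical stand-in that holds only the metadata, not the actual FisherMatrix class; raising ValueError for an unknown name is also an assumption.

```python
class NamedParameters:
    """Hypothetical container for per-parameter metadata."""

    def __init__(self, names, latex_names, fiducials):
        self._index = {name: i for i, name in enumerate(names)}
        self._latex_names = list(latex_names)
        self._fiducials = list(fiducials)

    def _lookup(self, name: str) -> int:
        try:
            return self._index[name]
        except KeyError:
            raise ValueError(f"Parameter '{name}' not found") from None

    def fiducial(self, name: str) -> float:
        """Return the fiducial value of the parameter `name`."""
        return self._fiducials[self._lookup(name)]

    def latex_name(self, name: str) -> str:
        """Return the LaTeX name of the parameter `name`."""
        return self._latex_names[self._lookup(name)]


params = NamedParameters(["a", "b"], ["$a$", "$b$"], [1.5, -2.0])
```
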

Refactor `FisherBarFigure` methods

The methods plot_relative_constraints and plot_absolute_constraints have a lot of code shared among them, and, consequently, it would be nice to refactor them so only one method needs to be updated instead of two.

Figure out scope of the `D` class

Currently, the D class, which is basically a fancy container for parameters required to compute derivatives using finite differences, does not hold information about the weights required to compute them; this is instead delegated to the find_diff_weights function.
The question then arises: should D be responsible for the weights as well (and maybe even the denominator of each parameter)?
A possible implementation is to just add a weights property to D which holds the weights, and maybe even a denominator or prefactor (inverse prefactor?) which holds the abs_step ** order part.

Rationale

The output of find_diff_weights may contain zeros, which just means that those parts of the stencil do not contribute at all to the final result, and currently these zeros are "manually" removed when calling derivative, but this seems somewhat error-prone.
Note that if D automatically removes the zeros, it also has to remove the corresponding stencil parameter.
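The pruning step could look roughly like this. The function `drop_zero_weights` is hypothetical and operates on plain arrays; the stencil and weights are those of the standard three-point central difference for a first derivative.

```python
import numpy as np


def drop_zero_weights(stencil, weights):
    """Remove stencil points whose finite-difference weight is exactly zero."""
    stencil = np.asarray(stencil)
    weights = np.asarray(weights, dtype=float)
    keep = weights != 0
    # the same mask is applied to both arrays so they stay aligned
    return stencil[keep], weights[keep]


# central difference: f'(x) ~ (f(x + h) - f(x - h)) / (2 h)
stencil, weights = drop_zero_weights([-1, 0, 1], [-0.5, 0.0, 0.5])
```
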

Add ability to autoformat tick labels to prevent overlap

Due to how matplotlib does its positioning of tick labels, it's possible to have tick labels that overlap.
It would be nice to find a way to generate tick labels that do not overlap by either:

  1. using a different format for the labels (assuming they are purely textual)
  2. finding a different, but still "nice" position of ticks (using the ticker API), and labeling those instead

Note that the algorithm should take into account the movement of all tick labels (otherwise the plot would not look nice), and provide a scoring function in the end (so it's basically an optimizer).

Provided such an algorithm exists and is general enough, it could even be placed in an external Python module, though I would settle for just an application to this module's plotting requirements.

Handle singular matrices

In some cases, such as parameter degeneracy, the Fisher matrix is singular and cannot be inverted; this has an impact on certain methods:

  • FisherMatrix.inverse
  • FisherMatrix.marginalize_over
  • FisherMatrix.correlation_matrix
  • FisherMatrix.constraints
  • FisherBaseFigure.plot
  • plot_curve_1d
  • plot_curve_2d

The code should catch these errors (numpy.linalg.LinAlgError), and handle this somehow. Some suggestions (a combination of these would also work):

  • show a UserWarning
  • when calling any plotting methods, plot parallel lines; this is a bit challenging as autoscale_view would make anything else drawn on the axis look tiny, but this can be fixed somehow (see this and this SO questions).
  • something else?
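The first suggestion might be sketched as below. The helper `inverse_or_none` is hypothetical, and returning None is just one possible fallback. Note that a matrix which is only nearly singular may not raise at all, so this only covers the exactly singular case.

```python
import warnings

import numpy as np


def inverse_or_none(values):
    """Invert `values`, or warn and return None if the matrix is singular."""
    try:
        return np.linalg.inv(values)
    except np.linalg.LinAlgError:
        warnings.warn(
            "The Fisher matrix is singular and cannot be inverted",
            UserWarning,
        )
        return None


# a matrix with a completely unconstrained second parameter
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = inverse_or_none(np.array([[1.0, 0.0], [0.0, 0.0]]))
```
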

Add debugging abilities to `FisherDerivative`

For any existing interface, it may be useful to have debugging capabilities, such as printing the current config, the current args, the current result, etc.
These would of course be off by default, and would be passed as kwargs to the signal and covariance methods.

Rename modules accordingly

The proposed scheme, which is probably cleaner for importing:

  • fisher_matrix → tensors
  • fisher_derivative → derivatives
  • fisher_operations → operations
  • fisher_utils → utils (or utilities)
  • fisher_plotter → graphics (plotters sounds somewhat silly)

Add automatic upload to PyPI

There should be a GH action to automatically push a release to PyPI (or test PyPI) when there is a push to master, and it has a tag of the form x.y.z.

Add plots to examples

A picture is worth a thousand words, and a large part of the package involves the plotting of Fisher objects, so it would be nice to have example plots alongside the documentation.
Due to mitmproxy/pdoc#282, it may be a bit challenging to include images as relative URLs are not supported, so some care is needed to make sure it works properly.

Only build docs on commits with tags

The documentation should only be built in the CI when all of the following is satisfied:

  • master is updated
  • the commit on master is tagged (optionally, the format of the tag should also be specified)

Currently, it gets built every time master is updated, regardless of any tags present in the commits.

Add `validate` method to `FisherDerivative`

It's possible that a user makes a typo on some parameter when calling, say, fisher_matrix, and therefore the whole computation may fail, wasting valuable time.

Proposal

Add a method called validate (or is_valid), which by default returns True, and which a user can override if they want to perform validation of the parameters before doing any computation.
As fitk only has complete control over the derivative and fisher_matrix methods, it should be implemented at that level, rather than in signal or covariance.
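A sketch of how the hook could be wired in. `DerivativeBase` is a hypothetical stand-in for FisherDerivative, and its `derivative` method here simply forwards to `signal` instead of doing any real differentiation.

```python
class DerivativeBase:
    """Hypothetical stand-in for FisherDerivative."""

    def validate(self, *args, **kwargs) -> bool:
        """Accept everything by default; subclasses may override this."""
        return True

    def signal(self, *args, **kwargs):
        raise NotImplementedError

    def derivative(self, *args, **kwargs):
        # validation happens once, before any expensive computation starts
        if not self.validate(*args, **kwargs):
            raise ValueError("The supplied parameters failed validation")
        return self.signal(*args, **kwargs)  # placeholder for the real work


class MyInterface(DerivativeBase):
    allowed = {"omega_m", "h"}

    def validate(self, *args, **kwargs) -> bool:
        # reject any parameter name that the interface does not know about
        return set(kwargs) <= self.allowed

    def signal(self, *args, **kwargs):
        return sorted(kwargs)
```

A typo such as `MyInterface().derivative(Omega_m=0.3)` then fails immediately instead of partway through the computation.
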

`sort` method raises cryptic error if `key=[LIST OF NAMES]` does not contain a name

When calling sort(key=[LIST OF NAMES]), the user may misspell one of the names or leave one out. In that case, the code silently falls back to treating key as a callable, which is probably not what the user intended. If key is an iterable of strings, the code should instead raise a more helpful error, such as:

SomeError: The object [LIST OF NAMES] appears to be an iterable, but does not contain all of the names in the Fisher object ([LIST OF NAMES IN OBJECT]).
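The check could be sketched like this. `check_sort_key` is a hypothetical helper, and ValueError stands in for the unspecified SomeError.

```python
def check_sort_key(key, names):
    """Raise a descriptive error if a list-like `key` misses any parameter name."""
    if callable(key):
        return  # callables are handled as ordinary sort keys
    key = list(key)
    missing = [name for name in names if name not in key]
    if missing:
        raise ValueError(
            f"The object {key} appears to be an iterable, but does not contain "
            f"all of the names in the Fisher object ({list(names)}); "
            f"missing: {missing}"
        )
```
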

Make `reparametrize` method less cumbersome to use

The problem

Suppose we want to make a reparametrization of the Fisher matrix.
Currently, this is achievable, but is somewhat cumbersome; let's say we have the (cosmological) set of parameters {omega_matter, omega_baryon, h, S8, n_s, w0, wa}, and we want to make a change to {Omega_matter, Omega_baryon, h, sigma8, n_s, w0, wa}.
The old parameters are related to the new ones via:

omega_matter = Omega_matter * h**2
omega_baryon = Omega_baryon * h**2
S8 = sigma8 * sqrt(Omega_matter / 0.3)

The following code, using Sympy, can be used to transform one into the other:

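import numpy as np
from scipy.linalg import block_diag
from sympy import Matrix, sqrt, symbols

# math_mode is a fitk helper; its import is not shown in the original report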

# old Fisher matrix
fm = ...
# we need to sort the names to be usable!
fm = fm.sort(
    key=[
        "Omega_matter",
        "Omega_baryon",
        "h",
        "sigma8",
        "n_s",
        "w0",
        "wa",
    ]
)

# old ones
omega_m = symbols("omega_m")
omega_b = symbols("omega_b")
h = symbols("h")
s8 = symbols("s8")

# new ones
Omega_m = symbols("Omega_m")
Omega_b = symbols("Omega_b")
sigma8 = symbols("sigma8")

# the normalization factor
norm = symbols("norm")

# the symbolic Jacobian
result = Matrix(
    [Omega_m * h**2, Omega_b * h**2, h, sigma8 * sqrt(Omega_m / norm)]
).jacobian([Omega_m, Omega_b, h, sigma8])

# the numerical Jacobian
jacobian = np.array(
    result.subs(
        [
            # actual values may differ
            (Omega_m, 0.3),
            (Omega_b, 0.05),
            (h, 0.67),
            (sigma8, 0.83),
            (norm, 0.3),
        ]
    ).evalf()
).astype(np.float64)

# the new Fisher matrix
fm_reparametrized = fm.reparametrize(
    block_diag(jacobian, np.eye(len(fm) - len(jacobian))),
    names=[
        "omega_m",
        "omega_baryon",
        "h",
        "s8",
        "n_s",
        "w0",
        "wa",
    ],
    latex_names=math_mode(
        [
            "\\omega_m",
            "\\omega_b",
            "h",
            "S_8",
            "n_s",
            "w_0",
            "w_a",
        ]
    ),
    fiducials=fm.fiducials
    # actual values may differ
    * np.array(
        [
            0.67**2,
            0.67**2,
            1,
            np.sqrt(0.3 / float(norm.evalf(subs={norm: 0.3}))),
            1,
            1,
            1,  # seven parameters in total, so seven factors are needed
        ]
    )
)

Note that there is a lot of useless/redundant code here, notably the setting of names which do not change, as well as the fiducial values, a lot of which are equal to the previous ones. We also need to make sure the Jacobian has the right dimensions, and the parameters must be sorted in a particular order.

In particular:

  1. we need to evaluate the Jacobian at the fiducial
  2. we need to sort the matrix beforehand
  3. we need to set the new LaTeX names and fiducials, half of which are the same as the old ones

Add `__repr__` and/or `__str__` for `FisherDerivative`

This would be useful for displaying the following info:

  • which software a particular interface implements
  • who is the maintainer of the interface
  • at which URLs one can find info about the software
  • the version number

Some open questions:

  • it's possible to have an interface which uses multiple pieces of software (one for the signal, one for the covariance); is the above then well defined?
  • should the version number refer to the version of the software itself, or the interface?
