
equadratures's Introduction

equadratures

equadratures is an open-source library for uncertainty quantification, machine learning, optimisation, numerical integration and dimension reduction -- all using orthogonal polynomials. It is particularly useful for models/problems whose output quantities of interest are smooth and continuous; to this end it has found widespread application in computational engineering models (finite elements, computational fluid dynamics, etc.). It is built on the latest research within these areas and offers both deterministic and randomised algorithms.

Keywords associated with this code: polynomial surrogates, polynomial chaos, polynomial variable projection, Gaussian quadrature, Clenshaw-Curtis quadrature, polynomial least squares, compressed sensing, gradient-enhanced surrogates, supervised learning.

Code

The latest version of the code is v10 Baby Blue, released March 2022.

If you use pip you can install the code with:

pip install equadratures

or pip can be replaced with python -m pip, where python is the interpreter for which you wish to install equadratures. Use of a virtual environment such as virtualenv or pyenv/pipenv is also encouraged. Alternatively, you can click either the Fork or Clone button and install from your local copy of the code.

For issues with the code, please raise an issue on our Github page; make sure to include the relevant bits of code and the specific package version numbers. We welcome contributions and suggestions from both users and those interested in developing the code further.

Our code is designed to require minimal dependencies; the current requirements are numpy, scipy and matplotlib.

The networkx and torch modules are additionally required if, and only if, the GraphPolys class is used.

Documentation, tutorials, Discourse

Code documentation and details on the syntax can be found here.

We've recently started a Discourse forum! Check it out here.

Code objectives

Specific goals of this code include:

  • probability distributions and orthogonal polynomials
  • supervised machine learning: regression and compressive sensing
  • numerical quadrature and high-dimensional sampling
  • transforms for correlated parameters
  • computing moments from models and data-sets
  • sensitivity analysis and Sobol' indices
  • data-driven dimension reduction
  • ridge approximations
  • surrogate-based design optimisation
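
By way of illustration, a minimal workflow touching several of these goals might look like the following sketch; it is based on the Poly syntax that appears in the issues further down this page, and exact keyword names may vary between versions.

import numpy as np
from equadratures import Parameter, Basis, Poly

# Define a uniform parameter and a univariate basis.
p = Parameter(distribution='uniform', lower=-1., upper=1., order=3)
basis = Basis('univariate')

# Fit a polynomial surrogate to samples of f(x) = x**2 by least squares.
X = np.random.uniform(-1., 1., (20, 1))
y = X**2
poly = Poly(p, basis, method='least-squares',
            sampling_args={'sample-points': X, 'sample-outputs': y})
poly.set_model()

# Evaluate the surrogate on a fine grid.
X_test = np.linspace(-1., 1., 100).reshape(-1, 1)
y_fit = poly.get_polyfit(X_test)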

Get in touch

Feel free to follow us via Twitter or email us at [email protected].

Community guidelines

If you have contributions, questions, or feedback, use either the Github repository or get in touch. We welcome contributions to our code; in this respect, we follow the NumFOCUS code of conduct.

Acknowledgments

This work was supported by wave 1 of The UKRI Strategic Priorities Fund under the EPSRC grant EP/T001569/1, particularly the Digital Twins in Aeronautics theme within that grant, and The Alan Turing Institute.

equadratures's People

Contributors

arfon, arifrankel, ascillitoe, barneyhill, bnubald, computers-n-wings, dkavolis, hz324, irevirdis, jobindjohn, mdonnelly1, mhash1m, ncywong, prayasj, ps583, psesh, simardeep27, tarankalra1, yuchi973


equadratures's Issues

Plotting a parameter issues matplotlib warning

Running the jupyter notebook http://nbviewer.jupyter.org/github/Effective-Quadratures/Effective-Quadratures/blob/master/ipython-notebooks/PolynomialChaosPDFs.ipynb gives a warning when plotting Parameters:

usr/local/lib/python2.7/dist-packages/equadratures-5.1-py2.7.egg/equadratures/plotting.py:168: MatplotlibDeprecationWarning: The set_axis_bgcolor function was deprecated in version 2.0. Use set_facecolor instead.
  ax.set_axis_bgcolor('whitesmoke')

SyntaxWarning with python 3.8+

In preparation for python 3.8+ compatibility, a number of is and is not checks in poly.py and basis.py need replacing. See below...

The compiler now produces a SyntaxWarning when identity checks (is and is not) are used with certain types of literals (e.g. strings, numbers). These can often work by accident in CPython, but are not guaranteed by the language spec. The warning advises users to use equality tests (== and !=) instead. (Contributed by Serhiy Storchaka in bpo-34850.)
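
As a hedged illustration of the required change (not the actual poly.py code), an identity check against a string literal should become an equality check:

# Before: triggers SyntaxWarning on Python 3.8+ and relies on CPython string interning.
#     if method is 'numerical-integration':
# After: equality test, as the warning advises.
def uses_quadrature(method):
    return method == 'numerical-integration'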

Convert all instances of np.matrix to np.ndarray

It seems that np.matrix will be deprecated in the future. Thus, all instances of np.matrix should be changed to np.ndarray (e.g., the output of evaluatePolyFit).

In addition, we should stipulate that all 1-d outputs be represented by 1-d arrays instead of "column vectors" with size = (length, 1). It makes working with them more intuitive.
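
A small illustrative sketch of the change (the names below are hypothetical, not the actual evaluatePolyFit code):

import numpy as np

# Legacy style: an np.matrix "column vector" of shape (3, 1).
coefficients_matrix = np.matrix([[1.0], [2.0], [3.0]])

# Preferred style: a plain 1-d ndarray of shape (3,).
coefficients = np.asarray(coefficients_matrix).ravel()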

New inputs for poly

    Definition of a polynomial object.

    Parameters
    ----------
    parameters : list
        A list of parameters, where each element of the list is an instance of the Parameter class.
    basis : Basis
        An instance of the Basis class corresponding to the multi-index set used.
    method : string
        The method used for computing the coefficients. Should be one of:
            - 'compressive-sensing'
            - 'numerical-integration'
            - 'least-squares'
            - 'minimum-norm'
    samples : {string, dict}
        The first argument to this input specifies the sampling strategy. Available options are:
            - 'monte-carlo'
            - 'latin-hypercube'
            - 'induced-sampling'
            - 'christoffel-sampling'
            - 'sparse-grid'
            - 'tensor-grid'
            - 'user-defined'
        The second argument to this input is a dictionary whose contents depend on the chosen string.
        Note that 'monte-carlo', 'latin-hypercube', 'induced-sampling' and 'christoffel-sampling' are random
        sampling techniques and thus their output will vary with each instance; initialising a random seed
        is recommended to facilitate reproducibility. These four techniques and 'tensor-grid' share a similar
        dict structure, comprising the fields:
            sampling-ratio : double
                The ratio of the number of samples to the number of coefficients (cardinality of the basis). Should
                be greater than 1.0 for 'least-squares'.
            subsampling-optimisation : str
                The type of subsampling required. In the aforementioned four sampling strategies, we generate a
                logarithmic factor more samples than required and prune them down using an optimisation technique.
                Available options include:
                    - 'qr'
                    - 'lu'
                    - 'svd'
                    - 'newton'
        There is a separate dictionary structure for a 'sparse-grid'; the dictionary has the form:
            growth-rule : string
                Two growth rules are available:
                    - 'linear'
                    - 'exponential'
                The growth rule specifies the relative increase in points when going from one level to another.
            level : int
                The level parameter dictates the maximum degree of exactness associated with the quadrature rule.
        Finally, for the 'user-defined' scenario, there are two inputs in the dictionary:
            input : numpy ndarray
                An array of shape (observations, dimensions).
            output : numpy ndarray
                An array of shape (observations, 1).
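
A hedged sketch of how these proposed inputs might fit together; the samples pair and keyword names follow the docstring above and may differ from the released API:

import numpy as np
from equadratures import Parameter, Basis, Poly

params = [Parameter(distribution='uniform', lower=-1., upper=1., order=3)]
basis = Basis('univariate')

# 'user-defined' sampling: supply the observations and outputs directly.
X = np.random.uniform(-1., 1., (50, 1))   # shape (observations, dimensions)
y = X**2                                  # shape (observations, 1)
poly = Poly(parameters=params, basis=basis, method='least-squares',
            samples=('user-defined', {'input': X, 'output': y}))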

Clean up stats.py

Current data structures used for Sobol' and related indices are awkward and unexpected.
Need to think about the workflow of using stats.

Outdated Documentation

It seems that the documentation mentions plotting utilities, but I can't find them in the code base or use them with the package. They are also present in the Parameter documentation. I'm not sure what the actual source of this documentation on the equadratures website is, so that I can PR and fix this.

Quadrature rule generation error

The quadrature rule generation seems to break when the order is increased above 4. This occurs for both tensor order and total order grids. See the attached python 2.7 code to replicate the bug. (included as a .txt file)
bug_report.txt

Spam

Spam doesn't output all the coefficients. Need to sort this out.

Dependency Issues

Currently the setup specifies a dependency of 'scipy >= 0.15.0'. However, the optimisation module imports scipy.linalg.null_space, which was added later (it appears to have been introduced in Scipy 1.1.0: https://docs.scipy.org/doc/scipy/reference/release.1.1.0.html).

Not that it's an issue for you but our internal python environment is currently limited to scipy 0.18.1 so the fix for us isn't as straightforward as just updating the scipy package we use. On our local copy of the code I've needed to comment out the import and manually implement the null_space function. My preference for a (somewhat ugly) fix would probably be to leave the dependency as is but to wrap the import in a try/except with a local implementation of null_space defined if it doesn't work.
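
A minimal sketch of that try/except fallback, assuming the replacement should mirror scipy.linalg.null_space's SVD-based behaviour:

import numpy as np

try:
    from scipy.linalg import null_space
except ImportError:
    def null_space(A, rcond=None):
        # SVD-based fallback for scipy versions older than 1.1.0.
        u, s, vh = np.linalg.svd(A, full_matrices=True)
        M, N = A.shape
        if rcond is None:
            rcond = np.finfo(s.dtype).eps * max(M, N)
        tol = np.amax(s) * rcond if s.size > 0 else 0.0
        num = np.sum(s > tol, dtype=int)
        return vh[num:, :].T.conj()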

Also, I think datasets.py imports requests, which isn't currently listed as a dependency. It works with requests 2.11.1.

Multi-Index Set Tutorial

In the Multi-Index tutorial within the official documentation, the class IndexSet should be replaced with Basis, together with use of the 'sort' method to compute the 'elements' attribute.
Also, another small comment about the Polynomial Regression for time-varying data tutorial: it does not show the lines in which the array yy is created (normalisation of the volume values).
Thanks a lot!

Removal of unused functions

To improve coverage, perhaps we should remove some unused functions:

  • utils.py: Everything except columnNormalize, cell2matrix, evalfunction, find_repeated_elements. evalfunction could be replaced with numpy's apply_along_axis.
  • qr.py: Everything except solveCLSQ?
  • polyreg.py: F_stat (I will do this)
  • polylsq.py: Merge into polyreg?

Will include more later...

from equadratures import * fails using python 3.5

Having installed with sudo python3 setup.py install, importing equadratures fails as follows:

jshaw@swpc215-vm ~ $ ipython3
Python 3.5.3 (default, Jan 19 2017, 14:11:04) 

In [1]: from equadratures import *
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-1-40b72a151ef6> in <module>()
----> 1 from equadratures import *

/usr/local/lib/python3.5/dist-packages/equadratures-5.1-py3.5.egg/equadratures/__init__.py in <module>()
      1 'Initialization file'
----> 2 from polyint import Polyint
      3 from parameter import Parameter
      4 from polylsq import Polylsq
      5 from polyreg import Polyreg

ImportError: No module named 'polyint'

This is using version 215df25. The same code installs with python 2.7.13 without issue.
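
For reference, a sketch of a Python 3 compatible form of the package's __init__.py, using explicit relative imports (module names taken from the traceback above):

# equadratures/__init__.py
from .polyint import Polyint
from .parameter import Parameter
from .polylsq import Polylsq
from .polyreg import Polyreg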

Version 8.0 Ideas

Tentative idea: Load the Poly class:

def __init__(self, parameters, basis, task):
    # Tasks:
    # 1. uncertainty quantification --> doe
    # 2. dimension reduction --> doe
    # 3. optimization --> doe
    # 4. regression --> data
    # 5. read in old file
    pass

As part of this version, it would be useful to create a file format that can read-in a prior polynomial.

Saving a polynomial

In the implementation given below, the polynomial class needs to know the attributes of other classes--i.e., basis and parameter. It may be better if each of the other classes is endowed with lxml functionality.


    from lxml import etree as tree   # assumed import, given the lxml functionality mentioned above
    import time

    def savePolynomial(self, filename):
        """
        Saves the polynomial as an xml file.

        :param Poly self:
            An instance of the Poly class.
        :param string filename:
            Path location and filename.
        """
        root = tree.Element("polynomial")
        polydoc = tree.SubElement(root, "equadratures")
        tree.SubElement(polydoc, "version_number").text = str(VERSION_NUMBER)
        tree.SubElement(polydoc, "date").text = str(time.strftime("%d/%m/%Y"))

        polydoc_basis = tree.SubElement(root, "multi-index")
        tree.SubElement(polydoc_basis, "basis_type").text = str(self.basis.basis_type)
        tree.SubElement(polydoc_basis, "maximum_orders").text = str(self.basis.orders)

        polydoc_coefficients = tree.SubElement(root, "coefficients")
        tree.SubElement(polydoc_coefficients, "coefficients").text = str(self.coefficients)
        tree.SubElement(polydoc_coefficients, "index_set").text = str(self.basis.elements)

        polynomialobject = tree.ElementTree(root)
        polynomialobject.write(filename, pretty_print=True)
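
A hypothetical call, assuming the method above is attached to a fitted Poly instance and lxml is installed:

my_poly.savePolynomial('surrogate.xml')   # writes version, basis and coefficient data to XML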

Syntax improvements to Poly

Poly should facilitate rapid extrapolation. Rather than use

import numpy as np
from equadratures import Parameter, Basis, Poly

X2 = np.linspace(-1., 1., 100).reshape(100, 1)
X = np.random.rand(20, 1)
y = X**2
p = Parameter(distribution='uniform', lower=-1., upper=1., order=3)
basis = Basis('univariate')
poly = Poly(p, basis, method='least-squares', sampling_args={'sample-points': X, 'sample-outputs': y})
poly.set_model()
f = poly.get_polyfit(X2)

It would simplify workflows if method='least-squares' could be assumed by default, and if the values in sampling_args could be provided as direct inputs, permitting a shorter call to work (see the sketch below). Checks on the shapes of X and y must ensure that the number of rows in X equals the number of rows in y, and that y has only one column.

Also note that in the above, the code should be OK with sampling_args={'sample_points': X, 'sample_outputs': y} (i.e. with underscores in place of hyphens).
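
A hedged sketch of the proposed shorthand (this is the suggested syntax, not necessarily the released one):

# method='least-squares' assumed by default; sample points/outputs passed directly.
poly = Poly(p, basis, sample_points=X, sample_outputs=y)
poly.set_model()
f = poly.get_polyfit(X2)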

To do

  1. effectivequads.py
  2. integration_utils & pseudospectral_utils.py
  3. Comment qr.py
  4. Check least squares with constraints implementation

PolyTree logging issue

Using PolyTree with the following parameters returns an error when logging=True:

order = 1
max_depth = 3

polytree = PolyTree(splitting_criterion='model_agnostic', max_depth=max_depth, min_samples_leaf=None,
                    k=0.5, order=order, basis='hyperbolic-basis', search='grid', samples=50,
                    logging=True, poly_method='least-squares', poly_solver_args=None)
polytree.fit(X_train_eq, y_train.to_numpy())

The relevant error output is:

in _splitter(node)
    192                                                         # Update best parameters if loss is lower
    193                                                         if loss_split < loss_best:
--> 194                                                                 if self.logging: self.log.append({'event': 'best_split', 'data': {'j_feature':j_feature, 'threshold':threshold, 'loss': loss_split, 'poly_left': poly_left, 'poly_right': poly_right}})
    195                                                                 did_split = True
    196                                                                 loss_best = loss_split

UnboundLocalError: local variable 'poly_left' referenced before assignment

Double checking vegetation parameter space

Can you double check the vegetation input parameter space?

27 [ 1.44300000e+02, 2.35000000e-01, 6.00000000e-03, 6.00000000e-04],
28 [ 6.21152934e+01, 1.74581460e-01, 2.90161332e-03, 6.00000000e-04],
29 [ 2.26484707e+02, 2.95418540e-01, 6.00000000e-03, 2.90161332e-04],
30 [ 1.44300000e+02, 1.74581460e-01, 9.09838668e-03, 9.09838668e-04],
31 [ 1.44300000e+02, 2.95418540e-01, 2.90161332e-03, 9.09838668e-04],
32 [ 6.21152934e+01, 2.95418540e-01, 6.00000000e-03, 2.90161332e-04],
33 [ 2.26484707e+02, 1.74581460e-01, 6.00000000e-03, 2.90161332e-04],
34 [ 1.44300000e+02, 2.95418540e-01, 9.09838668e-03, 9.09838668e-04],
35 [ 1.44300000e+02, 2.35000000e-01, 9.09838668e-03, 2.90161332e-04],
36 [ 2.26484707e+02, 2.35000000e-01, 6.00000000e-03, 9.09838668e-04],
37 [ 1.44300000e+02, 2.35000000e-01, 2.90161332e-03, 2.90161332e-04],
38 [ 6.21152934e+01, 2.35000000e-01, 6.00000000e-03, 9.09838668e-04],
39 [ 6.21152934e+01, 1.74581460e-01, 6.00000000e-03, 2.90161332e-04],
40 [ 6.21152934e+01, 2.35000000e-01, 9.09838668e-03, 6.00000000e-04],
41 [ 1.44300000e+02, 1.74581460e-01, 2.90161332e-03, 9.09838668e-04]])

Fix deprecated/removed numpy attribute `asscalar`.

On newer versions of numpy, the attribute asscalar has been deprecated (in version 1.16) and removed as of version 1.23.

On numpy 1.23 and newer, this triggers the following error:

AttributeError: module 'numpy' has no attribute 'asscalar'

Solution should be to switch to numpy.ndarray.item() instead.


Lines to fix:

objective = lambda x: k*np.asscalar(f(x))

constraint = lambda x: np.asscalar(g(x))

self.f_old = np.asscalar(self.f[ind_min])

return np.asscalar(f)

res = optimize.minimize(lambda x: np.asscalar(my_poly.get_polyfit(x)), self.s_old, method='TNC', \

res = optimize.minimize(lambda x: np.asscalar(my_poly.get_polyfit(np.dot(x,self.U))), self.s_old, \

del_m = np.asscalar(my_poly.get_polyfit(self.s_old)) - m_new

del_m = np.asscalar(my_poly.get_polyfit(np.dot(self.s_old,self.U))) - m_new

fe = 0.5*(np.asscalar(np.dot(r.T,r)) - epsilon**2)

cqe = np.asscalar(np.dot(r.T,r)) - epsilon**2

dV[:,l,:,j] = np.asscalar(vectord[l])*(X.T*current[:,j])
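
An illustrative before/after for one of these calls (the values here are made up):

import numpy as np

x = np.array([3.14])
# value = np.asscalar(x)   # AttributeError on numpy >= 1.23
value = x.item()           # drop-in replacement that works on all supported numpy versions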

Improvements to Parameter

Parameter should have a default uniform distribution and a default order of 1.

p = Parameter(lower=-1., upper=1.)
# TypeError: __init__() missing 2 required positional arguments: 'order' and 'distribution'

Parameter should take in a name in its constructor. If a name is not provided, a default value of 'parameter' can be assigned.

horsepower = Parameter(name='horsepower', lower=180, upper=350)

Travis CI tests failing with python 3.5

Failing due to a cvxpy/numpy dependency mismatch. Full error below:

$ pip install cvxpy>=1.1
Could not find a version that satisfies the requirement numpy==1.19 (from versions: 1.9.0, 1.9.1, 1.9.2, 1.9.3, 1.10.0, 1.10.1, 1.10.2, 1.10.3, 1.10.4, 1.11.0, 1.11.1, 1.11.2, 1.11.3, 1.12.0, 1.12.1, 1.13.0rc1, 1.13.0rc2, 1.13.0, 1.13.1, 1.13.3, 1.14.0rc1, 1.14.0, 1.14.1, 1.14.2, 1.14.3, 1.14.4, 1.14.5, 1.14.6, 1.15.0rc1, 1.15.0rc2, 1.15.0, 1.15.1, 1.15.2, 1.15.3, 1.15.4, 1.16.0rc1, 1.16.0rc2, 1.16.0, 1.16.1, 1.16.2, 1.16.3, 1.16.4, 1.16.5, 1.16.6, 1.17.0rc1, 1.17.0rc2, 1.17.0, 1.17.1, 1.17.2, 1.17.3, 1.17.4, 1.17.5, 1.18.0rc1, 1.18.0, 1.18.1, 1.18.2, 1.18.3, 1.18.4, 1.18.5)
No matching distribution found for numpy==1.19
The command "pip install cvxpy>=1.1" failed and exited with 1 during .

Adding another class of distributions.

This is a feature request for adding another class of distributions to equadratures.distributions.

Supporting Distributions like:

  • Binomial Distribution
  • Poisson Distribution
  • Multinomial Distribution

Would work on this if approved.

Local import Issue with the parameter class

This could be an issue throughout the repository. However, I have only tried to import parameter.py, hence I encountered this issue here.

Python Version:

Python 3.6.5

Encountered Error:

> File "/path/to/parameter.py", line 2, in \ > from .distributions.gaussian import Gaussian > ImportError: attempted relative import with no known parent package

Expected Behaviour:

No error returned for import

Reproduction steps:

Change the current working directory to Effective-Quadratures/equadratures/ and execute $ python3

In python console, type

from parameter import Parameter

The error above should be encountered.
Note that doing

from .parameter import Parameter

should simply give:

File "", line 1
import .parameter as Parameter
^
SyntaxError: invalid syntax

Hyperbolic cross

Double check and make sure "q" parameter can be given as an input.
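
For context, a hypothetical call showing what passing the q parameter might look like (the orders and q keyword names are assumed, not confirmed against the current Basis signature):

basis = Basis('hyperbolic-basis', orders=[4, 4], q=0.5)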
