rom-tools-and-workflows's Introduction

rom-tools-and-workflows

The ROM tools and workflows Python library comprises a set of algorithms for constructing and exploiting ROMs that rely on abstract base classes that encapsulate all the information needed to run a given algorithm. The philosophy is that, for any given application, the user simply needs to "fill out" a class that meets the required API of the abstract base class. Once this class is complete, the user gains access to all of our existing algorithms.

Documentation

https://pressio.github.io/rom-tools-and-workflows/romtools.html

Installation

cd my-path/rom-tools-and-workflows
pip install .

Verify installation by running the tests

Note: you need pytest installed

cd my-path/rom-tools-and-workflows
pytest

Note: some tests generate auxiliary/temporary files, which are handled via the tmp_path fixture as suggested in https://docs.pytest.org/en/7.1.x/how-to/tmp_path.html.

Building the documentation

cd <repo-dir>
pdoc ./romtools -t ./custom-template --math --docformat google

This opens a browser with the module documentation. More information about pdoc can be found in the pdoc documentation.

rom-tools-and-workflows's People

Contributors

cwschilly, eparish1, fnrizzi, jtencer, pjb236, rloekvh

rom-tools-and-workflows's Issues

Update shifter API and implementations

Update shifter API as:

class <SomeShifter>(Shifter):

    def apply_shift(self, my_array: np.ndarray) -> None:
        # apply shift to input array in place
        ...

    def remove_shift(self, my_array: np.ndarray) -> None:
        # undo shift in place
        ...

Additionally, all operations should be made to be in place in the shifter class. Tests should check for this.
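
To make the in-place requirement concrete, here is a minimal sketch of what a shifter and its test could look like. The class name `VectorShifterSketch` and the helper `check_in_place_round_trip` are hypothetical, and the sketch assumes a 2D array whose first dimension matches the shift vector; it is not the library's implementation.

```python
import numpy as np


class VectorShifterSketch:
    """Hypothetical shifter: subtracts a fixed vector from every column, in place."""

    def __init__(self, shift_vector: np.ndarray):
        self._shift = np.asarray(shift_vector, dtype=float)

    def apply_shift(self, my_array: np.ndarray) -> None:
        # "-=" writes into the caller's buffer; "my_array = my_array - ..."
        # would only rebind the local name and leave the caller's data untouched.
        my_array -= self._shift[:, None]

    def remove_shift(self, my_array: np.ndarray) -> None:
        # Undo the shift, again writing into the caller's buffer.
        my_array += self._shift[:, None]


def check_in_place_round_trip() -> bool:
    """Return True if the shift is applied in place and remove_shift undoes it."""
    data = np.arange(6, dtype=float).reshape(3, 2)
    original = data.copy()
    shifter = VectorShifterSketch(np.array([1.0, 2.0, 3.0]))

    result = shifter.apply_shift(data)
    shifted_in_place = result is None and not np.allclose(data, original)

    shifter.remove_shift(data)
    return shifted_in_place and np.allclose(data, original)
```

A test along these lines checks both properties at once: nothing is returned, and the caller's array is the one that changes.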

Update shifter API and documentation

Our current API is:

class AbstractShifter(abc.ABC):

    '''

    Abstract implmentation

    '''


    @abc.abstractmethod

    def __call__(self, my_array: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:

        '''

        Overload to apply shift

        '''

        pass

We propose to change this to:

class AbstractShifter(abc.ABC):

    '''

    Abstract implmentation

    '''


    @abc.abstractmethod

    def shiftTensor(self, my_array: np.ndarray):
        '''
        In place operation that shifts the tensor
        '''
        pass

    def getShift(self) -> np.ndarray:
        '''
        Returns the shift value
        '''
        pass

Additionally, clean up documentation:

  • Misspelling on page "Abstract implmentation".
  • Does the VectorShifter require the vector to be of the same size as the first dimension of the array? If so, there should be a check.
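
If the answer to the second question is yes, the check could be as small as the sketch below. The function name `check_shift_vector_shape` is hypothetical and the error messages are only suggestions.

```python
import numpy as np


def check_shift_vector_shape(shift_vector: np.ndarray, my_array: np.ndarray) -> None:
    """Raise ValueError unless shift_vector is 1D and matches my_array's first dimension."""
    if shift_vector.ndim != 1:
        raise ValueError(f"shift vector must be 1D, got {shift_vector.ndim} dimensions")
    if shift_vector.shape[0] != my_array.shape[0]:
        raise ValueError(
            f"shift vector has length {shift_vector.shape[0]}, "
            f"but the array's first dimension is {my_array.shape[0]}"
        )
```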

Fix snapshot demo

The snapshot demo needs to be updated for the new case where the snapshot class no longer exists. (The demo may now be irrelevant.)

deim: unused function and wrong call

deim.py has vectorDeimGetIndices, which is unused/untested and has a wrong call inside.
Leaving the function commented out for now.
This needs to be fixed.

Linear algebra libraries: mpi4py light library and pressio-tools full library.

To add concrete implementations that can be run parallel (row distributed memory), we need a variety of basic algebra and linear algebra functions.
We want to be able to create a new library that has two usages:

  1. a "light mode":
  • installable via pip install ... and has pure python dependencies (if any)
  • supports some of the basic functions so that a user can utilize most of the rom-tools library without headaches
  2. a "heavy, more performant mode":

NOTE: All functions should target row distributed parallelization

For the python-only library mode 1 we will support

  • min max of a distributed vector
  • A^T B, where A and B are both row distributed
  • Hacks to do the SVD and orthogonalization via method of snapshots
  • Stretch/unlikely: svd, qr, basic linear solve.

For the more heavy library mode 2 we will support

  • min max of a distributed vector
  • A^T B, where A and B are both row distributed
  • Hacks to do the SVD and orthogonalization via method of snapshots
  • svd
  • qr
  • linear solve
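
As a rough illustration of the row-distributed operations listed above, the sketch below emulates ranks with a plain Python list of row blocks, so it runs without MPI. All function names are hypothetical. In an actual MPI version, each rank would hold one block and the sums and min/max would become allreduce operations; that mapping is an assumption of this sketch, not existing library code.

```python
import numpy as np


def distributed_at_b(a_blocks, b_blocks):
    """A^T B for row-distributed A and B: the sum of the local products A_i^T B_i."""
    # Each term is what one rank would compute locally before a global sum.
    return sum(a_i.T @ b_i for a_i, b_i in zip(a_blocks, b_blocks))


def distributed_min_max(x_blocks):
    """Global min and max of a row-distributed vector from per-block extrema."""
    local_mins = [x_i.min() for x_i in x_blocks]
    local_maxs = [x_i.max() for x_i in x_blocks]
    return min(local_mins), max(local_maxs)


def svd_method_of_snapshots(a_blocks, tol=1e-12):
    """Thin SVD of a tall, row-distributed A via the eigendecomposition of A^T A."""
    gram = distributed_at_b(a_blocks, a_blocks)   # small: n_samples x n_samples
    eigvals, eigvecs = np.linalg.eigh(gram)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]             # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    keep = eigvals > tol * eigvals[0]             # drop numerically zero modes
    sigma = np.sqrt(eigvals[keep])
    v = eigvecs[:, keep]

    # U = A V Sigma^{-1}; each block of U needs only the local rows of A.
    u_blocks = [a_i @ v / sigma for a_i in a_blocks]
    return u_blocks, sigma, v
```

Only the small Gram matrix is ever assembled globally, which is why this approach fits the pure-Python mode. The trade-off is that forming A^T A squares the condition number, so small singular values are less accurate than with a direct SVD.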

summary of current tests: w or w/o MPI, ranks, etc

Make a summary of all tests currently present in the code so that we have an idea of which ones are serial, which ones use MPI, how many ranks they need, etc.
Something like:

  • test_name1: serial, skipped with mpi
  • test_name2: mpi, min # ranks 3 else no-op
  • test_name3: mpi, min # ranks 3
    etc

Create list for required MPI changes

We are working towards having rom-tools-and-workflows work fully in parallel. For planning purposes, we need to:

  • Go through each abstract class, and check if there is any reason an abstract class would not be compatible with MPI (it seems like this shouldn't be an issue)
  • Go through each concrete implementation of a base class, and check if there is a reason a concrete class would not be compatible with MPI (e.g., uses min, max, svd without an svd member, etc.).
  • Summarize in list

Splitter implementation

The current splitter operates by creating a new matrix that is an expanded version of the existing matrix. We then perform SVD on the expanded matrix. This can be made more efficient by having splitter return information about the splitting, and then we efficiently leverage this to do SVD. This requires some thought, and this issue is to begin this process.

Support for heterogeneous parameter spaces

Current parameter space classes only support homogeneous spaces. It would be good to support heterogeneous cases such as a mix of normal and uniform distributions, nonvariable parameters (constants), and non-numeric parameters. The ParameterSpace base class may need to be reworked to accommodate this.
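
One possible shape for this, sketched under the assumption that each parameter supplies its own sampling callable. The function `sample_heterogeneous_space` and the parameter names are hypothetical and do not reflect the existing ParameterSpace API. Returning one column per parameter, rather than a single float array, is what lets non-numeric parameters coexist with numeric ones.

```python
import numpy as np


def sample_heterogeneous_space(samplers, number_of_samples, seed=None):
    """Draw samples for each named parameter using its own sampler.

    samplers maps a parameter name to a callable (rng, n) -> array of length n.
    """
    rng = np.random.default_rng(seed)
    return {name: draw(rng, number_of_samples) for name, draw in samplers.items()}


# Hypothetical parameters covering the cases mentioned in this issue.
samplers = {
    "viscosity": lambda rng, n: rng.uniform(0.1, 1.0, size=n),           # uniform
    "inflow_speed": lambda rng, n: rng.normal(10.0, 0.5, size=n),        # normal
    "gravity": lambda rng, n: np.full(n, 9.81),                          # constant
    "solver": lambda rng, n: rng.choice(["implicit", "explicit"], size=n),  # non-numeric
}

samples = sample_heterogeneous_space(samplers, number_of_samples=5, seed=0)
```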

Rename TrialSpace as VectorSpace and create a derived TrialSpace concept.

Right now, we represent a linear subspace with a "TrialSpace". As we move to support more hyper-reduction techniques, we need to construct linear subspaces and it would be good to re-use what's in TrialSpace. We will handle this by re-naming TrialSpace as VectorSpace. Downstream we can then create the notion of a TrialSpace which has state-specific functions, i.e., mapFromReducedState, mapToReducedState, etc.
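
A minimal sketch of the proposed split, assuming an affine space defined by an orthonormal basis and a shift vector. The class names and the accessors `getBasis` and `getDimension` are hypothetical, and the method bodies are one plausible reading of mapToReducedState and mapFromReducedState rather than the planned implementation.

```python
import numpy as np


class VectorSpaceSketch:
    """Hypothetical generic subspace: holds a basis and a shift, nothing state-specific."""

    def __init__(self, basis: np.ndarray, shift: np.ndarray):
        self._basis = basis   # shape (N, K); columns assumed orthonormal
        self._shift = shift   # shape (N,)

    def getBasis(self) -> np.ndarray:
        return self._basis

    def getDimension(self) -> int:
        return self._basis.shape[1]


class TrialSpaceSketch(VectorSpaceSketch):
    """Hypothetical derived concept that adds the state-specific maps."""

    def mapToReducedState(self, full_state: np.ndarray) -> np.ndarray:
        # Projection onto the basis; valid as written only for orthonormal columns.
        return self._basis.T @ (full_state - self._shift)

    def mapFromReducedState(self, reduced_state: np.ndarray) -> np.ndarray:
        return self._basis @ reduced_state + self._shift
```

With this layout, hyper-reduction code could depend on the base class only, while ROM code that needs the state maps would use the derived class.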

Remove snapshot data from rom-tools

We are planning on removing the snapshot data class and just have, e.g., trial space work directly with the snapshot tensor. Any new comments/thoughts on this should be placed here.

fix `ERROR: not found` CI failures

The error, as seen here:

 ============================ no tests ran in 0.25s =============================
ERROR: not found: /home/runner/work/rom-tools-and-workflows/rom-tools-and-workflows/tests/romtools/__pycache__
(no match in any of [<Dir romtools>])

First appeared on this commit.

Small improvements to truncater

  • There's an extra description under EnergyTruncater that does not make sense. Also recommend an example or recommended value for the threshold.
  • There should be a warning in BasisSizeTruncater if you try to truncate larger than the actual size of the basis (or accidentally put a negative number or zero).
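
For the first two points, the sketch below shows one common definition of an energy criterion and one way to validate a requested basis size. Both function names are hypothetical, and the energy definition (cumulative squared singular values divided by their total) is an assumption, not necessarily what EnergyTruncater does today.

```python
import warnings

import numpy as np


def energy_truncation_size(singular_values: np.ndarray, threshold: float) -> int:
    """Smallest K whose leading K modes hold at least `threshold` of the energy."""
    squared = singular_values ** 2
    energy = np.cumsum(squared) / np.sum(squared)
    # searchsorted finds the first index with energy >= threshold.
    size = int(np.searchsorted(energy, threshold)) + 1
    # Guard against round-off pushing the index past the last mode.
    return min(size, len(singular_values))


def checked_basis_size(requested: int, available: int) -> int:
    """Validate a requested basis size against the number of available modes."""
    if requested <= 0:
        raise ValueError(f"basis size must be positive, got {requested}")
    if requested > available:
        warnings.warn(
            f"requested basis size {requested} exceeds available size {available}; "
            f"using {available}"
        )
        return available
    return requested
```

As a worked example, singular values 3, 2, 1 have cumulative energies 9/14, 13/14 and 1, so a threshold of 0.9 keeps two modes and a threshold of 0.999 keeps all three.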
  • General review of documentation

Add additional Parameter and ParameterSpace implementations

Need to add the following additional concrete implementations of the romtools.workflows.parameter_spaces.Parameter class

  • GaussianParameter
  • TriangularParameter

which support sampling from Gaussian and Triangular distributions respectively.

Also need to add the class

  • GaussianParameterSpace

which will subclass romtools.workflows.parameter_spaces.HomogeneousParameterSpace.
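
The sampling itself maps directly onto NumPy's random generator. The sketch below is standalone and does not subclass the real Parameter class, whose interface is not reproduced here; the class names, constructor arguments and the `sample` method are hypothetical.

```python
import numpy as np


class GaussianParameterSketch:
    """Hypothetical parameter sampled from a normal distribution."""

    def __init__(self, name: str, mean: float, std: float, seed=None):
        self.name = name
        self.mean = mean
        self.std = std
        self._rng = np.random.default_rng(seed)

    def sample(self, number_of_samples: int) -> np.ndarray:
        return self._rng.normal(self.mean, self.std, size=number_of_samples)


class TriangularParameterSketch:
    """Hypothetical parameter sampled from a triangular distribution."""

    def __init__(self, name: str, left: float, mode: float, right: float, seed=None):
        if not left <= mode <= right:
            raise ValueError("triangular distribution requires left <= mode <= right")
        self.name = name
        self.left, self.mode, self.right = left, mode, right
        self._rng = np.random.default_rng(seed)

    def sample(self, number_of_samples: int) -> np.ndarray:
        return self._rng.triangular(self.left, self.mode, self.right, size=number_of_samples)
```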

Small ECSW improvements

  • Update documentation for inputs/outputs to conform to the rest of the code
  • Potentially use more expressive variable names (e.g., xi->weights, inds-> sample_mesh_indices, tau->tolerance). Thoughts?
  • Is there any reason that xi is of size N instead of N_s?
  • Could we make inds a numpy array instead of a list? I personally like this more, but this doesn't need to be the case.
  • In the ecsw_fixed_test_basis, the default solver is the abstract solver. Should we change this to be the concrete implementation?

General formatting guidelines

Here is a general template for a source file (most items are checked by the pylint check)

#
# ************************************************************************
#          LICENSE (can use add-sandia-license.sh in helper_scripts)
# ************************************************************************
#

'''
Put docstring documentation in single quotes. Try to limit line lengths to 130 characters.
'''

import abc       # standard libraries first
import numpy as np
from romtools.trial_space import AbstractTrialSpace    # rt imports last


class AbstractClass(abc.ABC): # PascalCase for classes (and two newlines between imports and code)
    '''
    Describe what the class does
    '''
    def __init__(self, arg_1): # all methods should take self as first argument
        '''
        Constructor documentation

        Args:
            arg_1 (<type>): description of argument
        '''
        # code here

    def someMethod(self, arg_1, arg_2): # camelCase for methods/functions, snake_case for variables/arguments
        '''
        Brief description

        Args:
             arg_1 ():
             arg_2 ():

        Returns:
        '''
        # code here

As noted in #24, ChatGPT can be helpful for generating consistent documentation for methods in a class.

revise classes documentation so that they all conform to a certain standard

Out of curiosity, I asked ChatGPT to write the documentation for AbstractSnapshotData

image

and I got this response:

class AbstractSnapshotData(abc.ABC):
    """
    An abstract base class for representing snapshot data.

    This class defines the common interface for classes that store and provide access to snapshot data
    as part of a simulation or data processing system. Implementations of this class are expected to
    define the initialization method and various methods for accessing and manipulating the data.

    Attributes:
        var_names (list): A list of variable names associated with the snapshot data.

    Methods:
        __init__(self, **kwargs):
            Initializes an instance of the AbstractSnapshotData class. Subclasses should implement
            this method to set up the necessary data structures or connections to data sources.

        getSnapshotsAsListOfArrays(self) -> Iterable[np.ndarray]:
            Retrieve the snapshots as an iterable of NumPy arrays. Each array represents a single snapshot.

        getMeshGids(self):
            Retrieve the mesh global identifiers associated with the snapshots.

        getSnapshotsAsArray(self) -> np.ndarray:
            Retrieve the snapshots as a single NumPy array by converting the list of snapshots into an array.

        getVariableNames(self) -> list:
            Retrieve the list of variable names associated with the snapshot data.

        getNumVars(self) -> int:
            Get the number of variables (dimensions) in the snapshot data.

    Note:
    - Subclasses should implement the abstract methods `__init__()`, `getSnapshotsAsListOfArrays()`,
      and `getMeshGids()`.
    - The attribute `var_names` should be set in the subclass constructor to define the variable names.
    - The `getSnapshotsAsArray()` and `getNumVars()` methods provide convenient access to the data.
    """

    @abc.abstractmethod
    def __init__(self, **kwargs):
        pass

    @abc.abstractmethod
    def getSnapshotsAsListOfArrays(self) -> Iterable[np.ndarray]:
        pass

    @abc.abstractmethod
    def getMeshGids(self):
        pass

    def getSnapshotsAsArray(self) -> np.ndarray:
        snapshot_array = listOfSnapshotsToArray(self.getSnapshotsAsListOfArrays())
        return snapshot_array

    def getVariableNames(self) -> list:
        return self.var_names

    def getNumVars(self) -> int:
        return len(self.getVariableNames())

Then I asked it to do

image

and I got this:

class AbstractSnapshotData(abc.ABC):
    """
    An abstract base class for representing snapshot data.

    This class defines the common interface for classes that store and provide access to snapshot data
    as part of a simulation or data processing system. Implementations of this class are expected to
    define the initialization method and various methods for accessing and manipulating the data.

    Attributes:
        var_names (list): A list of variable names associated with the snapshot data.

    Methods:
    """

    @abc.abstractmethod
    def __init__(self, **kwargs):
        """
        Initializes an instance of the AbstractSnapshotData class. Subclasses should implement
        this method to set up the necessary data structures or connections to data sources.

        Args:
            **kwargs: Additional keyword arguments that subclasses may accept for configuration.

        Note:
        Subclasses must call this base class constructor and set the `var_names` attribute to
        define the variable names associated with the snapshot data.
        """
        pass

    @abc.abstractmethod
    def getSnapshotsAsListOfArrays(self) -> Iterable[np.ndarray]:
        """
        Retrieve the snapshots as an iterable of NumPy arrays. Each array represents a single snapshot.

        Returns:
            Iterable[np.ndarray]: An iterable of NumPy arrays representing the snapshots.

        Note:
        Subclasses must implement this method to provide access to the actual snapshot data.
        """
        pass

    @abc.abstractmethod
    def getMeshGids(self):
        """
        Retrieve the mesh global identifiers associated with the snapshots.

        Returns:
            None or specific data type: The mesh global identifiers, or None if not applicable.

        Note:
        Subclasses must implement this method to provide access to mesh global identifiers if relevant.
        """
        pass

    def getSnapshotsAsArray(self) -> np.ndarray:
        """
        Retrieve the snapshots as a single NumPy array by converting the list of snapshots into an array.

        Returns:
            np.ndarray: A NumPy array containing all the snapshots.

        Note:
        This method provides a convenient way to access the snapshot data as a single array.
        Subclasses can use the `getSnapshotsAsListOfArrays()` method to implement this.
        """
        snapshot_array = listOfSnapshotsToArray(self.getSnapshotsAsListOfArrays())
        return snapshot_array

    def getVariableNames(self) -> list:
        """
        Retrieve the list of variable names associated with the snapshot data.

        Returns:
            list: A list of variable names.

        Note:
        Subclasses should ensure that this list is properly defined and set in the constructor.
        """
        return self.var_names

    def getNumVars(self) -> int:
        """
        Get the number of variables (dimensions) in the snapshot data.

        Returns:
            int: The number of variables.

        Note:
        Subclasses should make sure that this method returns the correct number of variables
        associated with the snapshot data.
        """
        return len(self.getVariableNames())

testing with mpi: decide on a min/max # of ranks

We need to decide whether tests run with MPI must have a min/max # of ranks.
For example we could:

  • not set any rule on this: any test does whatever it wants, but this creates an issue for the CI
  • decide only on the min # of ranks to use, so that all MPI tests conform to it, but then how do we set the # of ranks to use in the CI?
  • decide that every MPI test MUST be written to work with a min and a max # of ranks

fix order of submodules when rendered

By default, pdoc orders the submodules alphabetically when rendered.
In #10 we tried to fix the order, since a "logical" (top-down) order is easier to follow.

However, the hack used in that PR did not work because it broke the links to the submodule pages.

I also opened an issue, mitmproxy/pdoc#630, directly with pdoc.
While this is being worked out, it would be good to find a temporary solution.

Reformat snapshot data to be in tensor form

As we discussed, we are going to refactor snapshot data to be in the form of a tensor:

snapshots \in \RR{ N_vars x N_gridpoints x N_samples }

Is everyone ok with this shape?

Internally, when we leverage the snapshots to do, e.g., POD, we will reshape them into a matrix of size N_vars N_gridpoints x N_samples. We have the option of adopting the order "F" convention or the order "C" convention to do this. Order "F" will order the states as u_1, v_1, w_1, u_2, v_2, ..., while "C" will block the variable sets.

I would vote for order "F", even though that's not the preferred way for Python. The alternative is to set up the snapshot shape as snapshots \in \RR{ N_gridpoints x N_vars x N_samples }, in which case we could reshape with order "C" to have u_1, v_1, w_1, u_2, v_2, ...

Any preferences?
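
To make the two conventions easy to compare, here is a small NumPy check. The sizes and the integer labelling scheme are made up for illustration: each entry encodes its own (sample, variable, gridpoint) position, so the row ordering after a reshape can be read off directly.

```python
import numpy as np

n_vars, n_gridpoints, n_samples = 3, 2, 2

# Label entry (variable i, gridpoint j, sample k) as 100*(k+1) + 10*(i+1) + (j+1),
# so 121 reads "sample 1, variable 2 (v), gridpoint 1", i.e. v_1 of the first sample.
i, j, k = np.meshgrid(
    np.arange(n_vars), np.arange(n_gridpoints), np.arange(n_samples), indexing="ij"
)
snapshots = 100 * (k + 1) + 10 * (i + 1) + (j + 1)   # shape (N_vars, N_gridpoints, N_samples)

matrix_f = snapshots.reshape(n_vars * n_gridpoints, n_samples, order="F")
matrix_c = snapshots.reshape(n_vars * n_gridpoints, n_samples, order="C")

first_column_f = matrix_f[:, 0].tolist()   # [111, 121, 131, 112, 122, 132] -> u_1, v_1, w_1, u_2, v_2, w_2
first_column_c = matrix_c[:, 0].tolist()   # [111, 112, 121, 122, 131, 132] -> u_1, u_2, v_1, v_2, w_1, w_2

# Alternative layout (N_gridpoints, N_vars, N_samples) reshaped with order "C".
snapshots_alt = np.transpose(snapshots, (1, 0, 2))
matrix_alt = snapshots_alt.reshape(n_gridpoints * n_vars, n_samples, order="C")
same_as_f = np.array_equal(matrix_alt, matrix_f)     # True
```

In both orderings each column of the matrix is still one sample; only the row ordering within a column changes.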
