Uncertainty Quantification in Julia

License: MIT License

Julia 100.00%

julia metamodeling monte-carlo-simulation reliability-analysis sensitivity-analysis uncertainty-quantification

uncertaintyquantification.jl's People

Contributors

Stargazers

Watchers

Forkers

sitoryu gopal1259 mlsuh cr0gan andergray roberock yyimingucl

uncertaintyquantification.jl's Issues

Gaussian mixture models

Using https://github.com/davidavdav/GaussianMixtures.jl maybe?

Mappings between standard normal and physical space will not exist and the methods should be implemented to throw errors when called.

CD back to the old pwd when the external solver fails

Currently when an external solver encounters an error, Julia will get stuck in the working directory of the current sample. This creates issues when the source paths etc. are defined using pwd() and joinpath().

Solvers and Models

Regarding the overall structure I think what we should do is something like this:

A model connects several inputs to one or multiple solvers?.
The solvers are executed in order. The most basic solver being a julia function that you pass in. Somewhat like the Mio we have in OpenCossan.
Everything just passes around DataFrames. A Solver would just run whatever it needs to and adds the results to the DataFrame of the inputs. The end result would then collect all inputs and output values for whatever one might want to do with them afterwards.

(Pseudo-)Example:

a = 5
b = Normal()

function add (df::DataFrame) 
    df.c = df.a .+ df.b
    return df
end

solver = Solver(add)

model = Model([a b], ["a" "b"], [solver])

result = run(model, samples=1000)

Something like this, only more complex 🤔.

This could then be extended to include ObjectiveFunctions for reliability analysis or advanced sampling algorithms, etc.

Randomized quasi Monte Carlo

Since v0.3 QuasiMonteCarlo.jl includes various randomization methods for all its sampling methods. We should enable these as well, with sane defaults since some of the sequences should generally not be used unrandomized.

DigitalShift
MatousekScramble
OwenScramble
Shift

Implement LatticeRuleSampling

The last QMC method I'd like to have implemented is the one based on rank-1 lattice rules. Implementation should be straight forward in line with the other QMC methods.

s = QuasiMonteCarlo.sample(n,lb,ub,LatticeRuleSample())

Tests should be added to test the first few values like with the other methods.

Sobol indices and QMC

Using QMC to compute sobol indices yields very different results based on the method.

using UncertaintyQuantification

a = Parameter(7, :a)
b = Parameter(0.1, :b)

x = RandomVariable.(Uniform(-π, π), [:x1, :x2, :x3])

inputs = [a, b, x...]

ishigami = Model(
    df -> sin.(df.x1) + df.a .* sin.(df.x2) .^ 2 + df.b .* df.x3 .^ 4 .* sin.(df.x1),
    :f,
)

n = 10000

s_mc = sobolindices([ishigami], inputs, :f, MonteCarlo(n))
s_ss = sobolindices([ishigami], inputs, :f, SobolSampling(n))
s_hs = sobolindices([ishigami], inputs, :f, HaltonSampling(n))
s_lhs = sobolindices([ishigami], inputs, :f, LatinHypercubeSampling(n))

Just comparing the first order indices:

[0.3038509495659772, 0.4531211968732472, -0.004581507395695991] (Monte Carlo)
[0.0259111519070014, 0.4290425029042932, 0.21306096653679016] (Sobol)
[0.011524353591864793, 0.3774794907941725, 3.188982809095301e-5] (Halton)
[0.3114654127984809, 0.43890237167581103, 0.0005408956632518976] (LatinHypercube)

Monte Carlo and LatinHypercube are able to appoximate the analytical solution while Sobol and Halton are nowhere close.

Since LatinHyperCube, Sobol and Halton all go through the same sampling code, it can't be an issue there.

@AnderGray Any idea? I'm at a loss here.

Implement Quasi Monte Carlo Methods

Currently we have Sobol and Halton from two different packages. It might be a good idea to switch to QuasiMonteCarlo.jl and implement everything it offers.

QMC.jl includes:

GridSample(dx) where the grid is given by lb:dx[i]:ub in the ith direction.
UniformSample for uniformly distributed random numbers.
SobolSample for the Sobol sequence.
LatinHypercubeSample for a Latin Hypercube.
LatticeRuleSample for a randomly-shifted rank-1 lattice rule.
LowDiscrepancySample(base) where base[i] is the base in the ith direction.

I vaguely remember an issue with creating high dimensional LH samples in this but I am not sure.

SubSet returns NaN cov for 1 rv

This is probably related to the correlation estimation of the markov chains. For 1 rv we can most likely fall back to the standard MC way of computing the pf.

estimate_cov for subset and subset infinity sometimes returns complex number

Performing convergence study for MC, subset, and subset infinity.

Sometimes (randomly) estimate_cov is trying to take the sqrt of a complex number. Script for the study

using UncertaintyQuantification, DelimitedFiles, Formatting
using Distributions, Plots

###
#   Define input random variables
###
X1 = RandomVariable(Normal(0, 1), :X1)
X2 = RandomVariable(Normal(0, 1), :X2)

inputs = [X1, X2]

###
#   Define function, and model
###
function y(df)
    return df.X1 .+ df.X2
end

m = Model(y, :y)

###
#   Define limitstate. Failure is limitstate <= 0
###
function limitstate(df)
    return 4 .- reduce(vcat, df.y)
end

###
#   True failure for this problem
###
True_failure_prob = 1 - cdf(Normal(), 4/sqrt(2))

N_samples = 10 .^range(2, 5, 50)

pf_1 = zeros(length(N_samples))
pf_2 = zeros(length(N_samples))
pf_3 = zeros(length(N_samples))

cov_1 = zeros(length(N_samples))
cov_2 = zeros(length(N_samples))
cov_3 = zeros(length(N_samples))


for (i, N) in enumerate(N_samples)
    println(i)
    ###
    #   Try different simulation methods
    ###
    simulation_method_1 = MonteCarlo(Integer(round(N * 3)))
    simulation_method_2 = SubSetSimulation(Integer(round(N)), 0.1, 10, Normal())
    simulation_method_3 = SubSetInfinity(Integer(round(N)), 0.1, 10, 0.5)

    ###
    #   Simulate
    ###
    pf_1[i], cov_1[i], _ = probability_of_failure(m, limitstate, inputs, simulation_method_1)
    pf_2[i], cov_2[i], _ = probability_of_failure(m, limitstate, inputs, simulation_method_2)
    pf_3[i], cov_3[i], _ = probability_of_failure(m, limitstate, inputs, simulation_method_3)

end



plot(N_samples, pf_1)
plot!(N_samples, pf_2)
plot!(N_samples, pf_3)
plot!(N_samples, ones(length(N_samples)) * True_failure_prob,  xaxis=:log10)


plot(N_samples, cov_1)
plot!(N_samples, cov_2)
plot!(N_samples, cov_3, xaxis=:log10)

ERROR: LoadError: DomainError with -0.0004979094436062081:
sqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).
Stacktrace:
 [1] throw_complex_domainerror(f::Symbol, x::Float64)
   @ Base.Math ./math.jl:33
 [2] sqrt
   @ ./math.jl:677 [inlined]
 [3] estimate_cov(Iᵢ::BitMatrix, pf::Float64, n::Int64)
   @ UncertaintyQuantification ~/Documents/julia/UQ_Fork/UncertaintyQuantification.jl/src/simulations/subset.jl:293
 [4] probability_of_failure(models::Model, performancefunction::typeof(limitstate), inputs::Vector{RandomVariable}, sim::SubSetInfinity)
   @ UncertaintyQuantification ~/Documents/julia/UQ_Fork/UncertaintyQuantification.jl/src/simulations/subset.jl:130
 [5] top-level scope
   @ ~/Documents/julia/UQ_Fork/UncertaintyQuantification.jl/demo/Damask/example_convergance/example_convergance.jl:58
 [6] include(fname::String)
   @ Base.MainInclude ./client.jl:478
 [7] top-level scope
   @ REPL[19]:1
in expression starting at

Return cov for probability of failure methods

All the pf methods should return the cov as a second output. For now this includes:

Monte Carlo
LineSampling
SubsetSimulation

Improve PCE code

Some of the PCE code can be significantly improved. By using better approaches to building the basis functions, extra calls to Symbolics.float etc. can be removed.

Having the basis functions as proper Julia functions called with (x) also looks great.

HaltonSampling broken with latest QuasiMonteCarlo version

In 0.2.10 the interface for LowDiscrepancySampling (Halton) was changed to include randomization.

The corresponding qmc_sample must be updated and the compat entry increased to 0.2.10.

Allow passing of finite difference method for gradients and FORM

Currently, forward differences are used everywhere with a fixed degree. To be more flexible we should allow to specify the exact method used. This will also allow for use of all other parameters such as accounting for noise or different tolerances.

Can probably be a keyword for the gradients and property of the FORM struct.

gradient(model, inputs, :y; fdm = central_fdm(5, 1, max_range=9e-1))

form = FORM(10, 1e-4; fdm = central_fdm(5, 1, max_range=9e-1))

Global sensitivity upper bound / local sensitivity monte carlo

In the OpenCossan code there are several references to equations without giving the paper they are refering to...

Hybrid metamodel reliability model updating stuff?

Reliability based optimization

Finish the ABO implementation?

Implement Adaptive Line Sampling

This is quite weird in OpenCossan and I am not 100% how it works. We should try this directly from the paper and see how it works.

Error when using FORM with external solver

ERROR: LoadError: MethodError: no method matching iterate(::ExternalModel)
Closest candidates are:
  iterate(::Union{LinRange, StepRangeLen}) at range.jl:872
  iterate(::Union{LinRange, StepRangeLen}, ::Integer) at range.jl:872
  iterate(::T) where T<:Union{Base.KeySet{<:Any, <:Dict}, Base.ValueIterator{<:Dict}} at dict.jl:712
  ...
Stacktrace:
 [1] probability_of_failure(models::ExternalModel, performance::Function, inputs::Vector{RandomVariable}, sim::FORM)
   @ UncertaintyQuantification ~/.julia/packages/UncertaintyQuantification/RAsRu/src/reliability/form.jl:20
 [2] top-level scope
   @ ~/Documents/neutronics/variance_reduction_ND/example/UQ_jl/FORM.jl:51
in expression starting at /Users/angray/Documents/neutronics/variance_reduction_ND/example/UQ_jl/FORM.jl:51

This was fixed from the user side by adding the external solver into an array:

pf, β, dp = probability_of_failure([ext], limitstate, inputs, FORM())

PCE multiple UQModel

The function polynomialchaos cannot accept multiple models and/or single input. Would be nice to let the function accept a vector of UQmodel and have convenienve methods for that and single UQInput

First-order reliability method (FORM)

There is already a basic implementation of the Hasofer-Lind Reliability Index as FORM in the form branch. I don't remember how well this actually performs. It needs to be updated and properly tested.

We need unit tests and a demo script with a suitable example.

System reliability

Make the repo citable

At some point, probably after we added a bunch of neat features, we should get a DOI for the repository to allow ourselves and hopefully others to cite it when it is used for a publication.

Github has a guide here: https://guides.github.com/activities/citable-code/

Subset in physical space

Bayesian updating

The pull request #63 needs a unified interface and should probably use some of the methods from Ander's TransitionalMCMC.jl

Imprecise probabilities

Big discussion for the future

Delete ExternalModel output files

evaluate! function applied to ExternalModel (e.g. finite element models) could create a lot of unnecessary output files.
This files can easily saturate ROM when long simulations are performed.
A cleanup flag should be implemented for deleting the whole folder containing the samples of one simulation, after the extraction of all usefull values.

Subset-∞

The algorithm generates the conditional samples for the next level by representing each random variable by an arbitrary number of hidden variables instead of using Markov Chain Monte Carlo.

Reference: Patelli & AU (2015) Efficient Monte Carlo algorithm for rare failure event simulation

Stochastic Processes

Maybe using TimeSeries.jl?

Is it possible to put this into dataframes?

ExternalModel - Absolute Path is required

Using relative path for workdir and solvername will cause issues during model evaluation (line 23 solver.jl). Would be convenient to have a check on the paths when model is built.

Multiple ExternalModel

AS IS:
When running evaluate! with more than 1 ExternalModel they store thier results in 2 different folder based on their timestamp. If the second ExternalModel needs the output of the first model is not possible to use the cleanup flag.

TO BE:
All the ExternalModels store result in the same directory. If the cleanup flag is step to true, all the outputs in the result folder are deleted only after the last Model run.

Drop HaltonSequence dependency

QuasiMonteCarlo.jl includes LowDiscrepancySample which returns a Halton sequence for d > 1 or the 1-dimensional van der Corbut sequence.

The code can be updated to use the QMC package for HaltonSampling so that we can drop the dependency on HaltonSequences.

The package seems to be more active as well and it being under the SciML umbrella gives me confidence, that it is the better dependency for this.

It doesn't seem to skip numbers by default as it does for others, e.g. Sobol. Something we could potentially add in the future to reduce dependencies in higher dimensions.

We should make sure the results are exactly the same just in case there is something different I can't see at first glance.

Effect of transformations on QMC nets

To make sure everything were doing in regards to QMC is correct we need to do a bit of research and literature review. Mostly I'm interested in:

What happens to the nets sampled using QMC methods under transformation from [0,1]^d to the marginal distributions of our random variables
- As an example, in 2d, do samples of the Sobol' sequence retain their properties when transformed to standard normal random variables?
What happens once dependencies such as copulas or muiltivariate normals are involved?
- Does QMC even make sense in these cases?
Are all the transformations we are making valid, e.g. going from [0,1] to the marginals through the quantile function?

Convenience methods for distributions from mean and standard deviation

It is more common to have the mean and standard deviation of a random quantity (e.g., obtained from experiments or data sheests) than the parameters of a distribution. Thus, it would be nice to have a function that gets the mean, standard deviation and distribution type and returns the distribution parameters in a tuple.

As a proof of concept, see the functions used to create a Gumbel and a Lognormal distribution.

function distribution_parameters(mean::Real, std::Real, d::Type{Distributions.Gumbel})
	β = std * sqrt(6) / pi
	μ = mean - MathConstants.γ * β
	return μ, β
end

function distribution_parameters(mean::Real, std::Real, d::Type{Distributions.LogNormal})
	μ = log(mean^2/sqrt(std^2+mean^2))
	σ = sqrt(log(std^2/mean^2+1))
	return μ, σ
end

Gumbel(distribution_parameters(50, 5, Gumbel)...)

LogNormal(distribution_parameters(28.8e4-19.9e4,2.64e4, LogNormal)...)

Design of Experiments

Now that the response surface was implemented in #87 we should also add the most used design of experiments such as:

Box-Behnken
Central-Composite
Full factorial
Two-level factorial

ExperimentalDesign.jl could be a solution but it has a lot of dependencies and currently set the compat for Julia to 1.8.

Create necessary inputs

Apart from RandomVariableSet what other inputs do we need?

RandomVariable: No need to wrap the objects from Distributions.jl.
Parameter: This could potentially just be any subtype of Number and just be filled into the DataFrame
Function: If we do this, it should probably just take a simple lambda function that evaluates one variable and adds it to the DataFrame. Alternatively this could be omitted and people would need to just include it in their model/solver. Might be nice though for intermediate values or something you might want in your end results.

Should we have a common type for everything that can be an input? This type could be sub-typed together with what we need from Distributions.jl.

Implement Important Sampling with Design Point

Sliced distributions

Separate package but wrapped in here.

Register package in General

I will register the package in the General repository of Julia to reserve the name.

Tasks to be done before registering the package as 0.1.0:

Merge line sampling
Add license (#5)
Update README.md with a list of things currently available and some other info.

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Improve Subset efficiency

Currently the performance of samples that are rejected is evaluated again instead of reused from the previous step. By returning accepted from candidatesamples we can split the dataframe and avoid repeated evaluations.

Subset in high dimension to reduce alpha rejection

See paper by Zuev

Metamodels

IPM
Kriging
Response surface
Neural networks?

friesischscott / uncertaintyquantification.jl Goto Github PK