friesischscott / uncertaintyquantification.jl Goto Github PK

View Code? Open in Web Editor NEW

27.0 6.0 7.0 7.42 MB

Uncertainty Quantification in Julia

License: MIT License

Julia 100.00%

julia uncertainty-quantification reliability-analysis monte-carlo-simulation metamodeling sensitivity-analysis

uncertaintyquantification.jl's Issues

HaltonSampling broken with latest QuasiMonteCarlo version

In 0.2.10 the interface for LowDiscrepancySampling (Halton) was changed to include randomization.

The corresponding qmc_sample must be updated and the compat entry increased to 0.2.10.

ExternalModel - Absolute Path is required

Using relative path for workdir and solvername will cause issues during model evaluation (line 23 solver.jl). Would be convenient to have a check on the paths when model is built.

Create necessary inputs

Apart from RandomVariableSet what other inputs do we need?

RandomVariable: No need to wrap the objects from Distributions.jl.
Parameter: This could potentially just be any subtype of Number and just be filled into the DataFrame
Function: If we do this, it should probably just take a simple lambda function that evaluates one variable and adds it to the DataFrame. Alternatively this could be omitted and people would need to just include it in their model/solver. Might be nice though for intermediate values or something you might want in your end results.

Should we have a common type for everything that can be an input? This type could be sub-typed together with what we need from Distributions.jl.

Implement Radial Based Importance Sampling

Importance sampling where a hyper sphere with radius $\beta$ is excluded. Here $\beta$ is the reliabiliy index, i.e., the distance to the design point in standard normal space.

CD back to the old pwd when the external solver fails

Currently when an external solver encounters an error, Julia will get stuck in the working directory of the current sample. This creates issues when the source paths etc. are defined using pwd() and joinpath().

Bayesian updating

The pull request #63 needs a unified interface and should probably use some of the methods from Ander's TransitionalMCMC.jl

Global sensitivity upper bound / local sensitivity monte carlo

In the OpenCossan code there are several references to equations without giving the paper they are refering to...

Add missing simulations

We should implement more simulations for sampling and other basic tasks. This includes other Quasi Monte Carlo Simulations apart from Sobol and Halton Sampling.

Tasks:

PCE multiple UQModel

The function polynomialchaos cannot accept multiple models and/or single input. Would be nice to let the function accept a vector of UQmodel and have convenienve methods for that and single UQInput

Make the repo citable

At some point, probably after we added a bunch of neat features, we should get a DOI for the repository to allow ourselves and hopefully others to cite it when it is used for a publication.

Github has a guide here: https://guides.github.com/activities/citable-code/

Error when using FORM with external solver

ERROR: LoadError: MethodError: no method matching iterate(::ExternalModel)
Closest candidates are:
  iterate(::Union{LinRange, StepRangeLen}) at range.jl:872
  iterate(::Union{LinRange, StepRangeLen}, ::Integer) at range.jl:872
  iterate(::T) where T<:Union{Base.KeySet{<:Any, <:Dict}, Base.ValueIterator{<:Dict}} at dict.jl:712
  ...
Stacktrace:
 [1] probability_of_failure(models::ExternalModel, performance::Function, inputs::Vector{RandomVariable}, sim::FORM)
   @ UncertaintyQuantification ~/.julia/packages/UncertaintyQuantification/RAsRu/src/reliability/form.jl:20
 [2] top-level scope
   @ ~/Documents/neutronics/variance_reduction_ND/example/UQ_jl/FORM.jl:51
in expression starting at /Users/angray/Documents/neutronics/variance_reduction_ND/example/UQ_jl/FORM.jl:51

This was fixed from the user side by adding the external solver into an array:

pf, β, dp = probability_of_failure([ext], limitstate, inputs, FORM())

Sobols Indices with multiple ExternalModel BUG

The evaluation of sobols indices for multiple external models creates different folders (based on the different timestamps of each model evaluation).

Solvers and Models

Regarding the overall structure I think what we should do is something like this:

A model connects several inputs to one or multiple solvers?.
The solvers are executed in order. The most basic solver being a julia function that you pass in. Somewhat like the Mio we have in OpenCossan.
Everything just passes around DataFrames. A Solver would just run whatever it needs to and adds the results to the DataFrame of the inputs. The end result would then collect all inputs and output values for whatever one might want to do with them afterwards.

(Pseudo-)Example:

a = 5
b = Normal()

function add (df::DataFrame) 
    df.c = df.a .+ df.b
    return df
end

solver = Solver(add)

model = Model([a b], ["a" "b"], [solver])

result = run(model, samples=1000)

Something like this, only more complex 🤔.

This could then be extended to include ObjectiveFunctions for reliability analysis or advanced sampling algorithms, etc.

SubSet returns NaN cov for 1 rv

This is probably related to the correlation estimation of the markov chains. For 1 rv we can most likely fall back to the standard MC way of computing the pf.

Allow passing of finite difference method for gradients and FORM

Currently, forward differences are used everywhere with a fixed degree. To be more flexible we should allow to specify the exact method used. This will also allow for use of all other parameters such as accounting for noise or different tolerances.

Can probably be a keyword for the gradients and property of the FORM struct.

gradient(model, inputs, :y; fdm = central_fdm(5, 1, max_range=9e-1))

form = FORM(10, 1e-4; fdm = central_fdm(5, 1, max_range=9e-1))

Multiple ExternalModel

AS IS:
When running evaluate! with more than 1 ExternalModel they store thier results in 2 different folder based on their timestamp. If the second ExternalModel needs the output of the first model is not possible to use the cleanup flag.

TO BE:
All the ExternalModels store result in the same directory. If the cleanup flag is step to true, all the outputs in the result folder are deleted only after the last Model run.

Delete ExternalModel output files

evaluate! function applied to ExternalModel (e.g. finite element models) could create a lot of unnecessary output files.
This files can easily saturate ROM when long simulations are performed.
A cleanup flag should be implemented for deleting the whole folder containing the samples of one simulation, after the extraction of all usefull values.

Implement Quasi Monte Carlo Methods

Currently we have Sobol and Halton from two different packages. It might be a good idea to switch to QuasiMonteCarlo.jl and implement everything it offers.

QMC.jl includes:

GridSample(dx) where the grid is given by lb:dx[i]:ub in the ith direction.
UniformSample for uniformly distributed random numbers.
SobolSample for the Sobol sequence.
LatinHypercubeSample for a Latin Hypercube.
LatticeRuleSample for a randomly-shifted rank-1 lattice rule.
LowDiscrepancySample(base) where base[i] is the base in the ith direction.

I vaguely remember an issue with creating high dimensional LH samples in this but I am not sure.

Reliability based optimization

Finish the ABO implementation?

Replace Sampleable{Univariate} with UnivariateDistribution

UnivariateDistribution better represents what the RandomVariable is supposed to be and is significantly easier to understand.

Subset in high dimension to reduce alpha rejection

See paper by Zuev

Sliced distributions

Separate package but wrapped in here.

Drop HaltonSequence dependency

QuasiMonteCarlo.jl includes LowDiscrepancySample which returns a Halton sequence for d > 1 or the 1-dimensional van der Corbut sequence.

The code can be updated to use the QMC package for HaltonSampling so that we can drop the dependency on HaltonSequences.

The package seems to be more active as well and it being under the SciML umbrella gives me confidence, that it is the better dependency for this.

It doesn't seem to skip numbers by default as it does for others, e.g. Sobol. Something we could potentially add in the future to reduce dependencies in higher dimensions.

We should make sure the results are exactly the same just in case there is something different I can't see at first glance.

Effect of transformations on QMC nets

To make sure everything were doing in regards to QMC is correct we need to do a bit of research and literature review. Mostly I'm interested in:

What happens to the nets sampled using QMC methods under transformation from [0,1]^d to the marginal distributions of our random variables
- As an example, in 2d, do samples of the Sobol' sequence retain their properties when transformed to standard normal random variables?
What happens once dependencies such as copulas or muiltivariate normals are involved?
- Does QMC even make sense in these cases?
Are all the transformations we are making valid, e.g. going from [0,1] to the marginals through the quantile function?

Improve Subset efficiency

Currently the performance of samples that are rejected is evaluated again instead of reused from the previous step. By returning accepted from candidatesamples we can split the dataframe and avoid repeated evaluations.

Simple ParallelModel

We should add a simple model which is run in parallel (pmap) when additional workers are present. This is just a convenience model for when the model itself can't be expressed in a vectorized way.

Convenience methods for distributions from mean and standard deviation

It is more common to have the mean and standard deviation of a random quantity (e.g., obtained from experiments or data sheests) than the parameters of a distribution. Thus, it would be nice to have a function that gets the mean, standard deviation and distribution type and returns the distribution parameters in a tuple.

As a proof of concept, see the functions used to create a Gumbel and a Lognormal distribution.

function distribution_parameters(mean::Real, std::Real, d::Type{Distributions.Gumbel})
	β = std * sqrt(6) / pi
	μ = mean - MathConstants.γ * β
	return μ, β
end

function distribution_parameters(mean::Real, std::Real, d::Type{Distributions.LogNormal})
	μ = log(mean^2/sqrt(std^2+mean^2))
	σ = sqrt(log(std^2/mean^2+1))
	return μ, σ
end

Gumbel(distribution_parameters(50, 5, Gumbel)...)

LogNormal(distribution_parameters(28.8e4-19.9e4,2.64e4, LogNormal)...)

Randomized QMC: tests

Currently, there are only tests for unrandomized qmc. Tests for randomized Sobol-, Halton- and Lattice Rule Sampling will need to be implemented.

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Subset-∞

The algorithm generates the conditional samples for the next level by representing each random variable by an arbitrary number of hidden variables instead of using Markov Chain Monte Carlo.

Reference: Patelli & AU (2015) Efficient Monte Carlo algorithm for rare failure event simulation

Sobol indices and QMC

Using QMC to compute sobol indices yields very different results based on the method.

using UncertaintyQuantification

a = Parameter(7, :a)
b = Parameter(0.1, :b)

x = RandomVariable.(Uniform(-π, π), [:x1, :x2, :x3])

inputs = [a, b, x...]

ishigami = Model(
    df -> sin.(df.x1) + df.a .* sin.(df.x2) .^ 2 + df.b .* df.x3 .^ 4 .* sin.(df.x1),
    :f,
)

n = 10000

s_mc = sobolindices([ishigami], inputs, :f, MonteCarlo(n))
s_ss = sobolindices([ishigami], inputs, :f, SobolSampling(n))
s_hs = sobolindices([ishigami], inputs, :f, HaltonSampling(n))
s_lhs = sobolindices([ishigami], inputs, :f, LatinHypercubeSampling(n))

Just comparing the first order indices:

[0.3038509495659772, 0.4531211968732472, -0.004581507395695991] (Monte Carlo)
[0.0259111519070014, 0.4290425029042932, 0.21306096653679016] (Sobol)
[0.011524353591864793, 0.3774794907941725, 3.188982809095301e-5] (Halton)
[0.3114654127984809, 0.43890237167581103, 0.0005408956632518976] (LatinHypercube)

Monte Carlo and LatinHypercube are able to appoximate the analytical solution while Sobol and Halton are nowhere close.

Since LatinHyperCube, Sobol and Halton all go through the same sampling code, it can't be an issue there.

@AnderGray Any idea? I'm at a loss here.

estimate_cov for subset and subset infinity sometimes returns complex number

Performing convergence study for MC, subset, and subset infinity.

Sometimes (randomly) estimate_cov is trying to take the sqrt of a complex number. Script for the study

using UncertaintyQuantification, DelimitedFiles, Formatting
using Distributions, Plots

###
#   Define input random variables
###
X1 = RandomVariable(Normal(0, 1), :X1)
X2 = RandomVariable(Normal(0, 1), :X2)

inputs = [X1, X2]

###
#   Define function, and model
###
function y(df)
    return df.X1 .+ df.X2
end

m = Model(y, :y)

###
#   Define limitstate. Failure is limitstate <= 0
###
function limitstate(df)
    return 4 .- reduce(vcat, df.y)
end

###
#   True failure for this problem
###
True_failure_prob = 1 - cdf(Normal(), 4/sqrt(2))

N_samples = 10 .^range(2, 5, 50)

pf_1 = zeros(length(N_samples))
pf_2 = zeros(length(N_samples))
pf_3 = zeros(length(N_samples))

cov_1 = zeros(length(N_samples))
cov_2 = zeros(length(N_samples))
cov_3 = zeros(length(N_samples))


for (i, N) in enumerate(N_samples)
    println(i)
    ###
    #   Try different simulation methods
    ###
    simulation_method_1 = MonteCarlo(Integer(round(N * 3)))
    simulation_method_2 = SubSetSimulation(Integer(round(N)), 0.1, 10, Normal())
    simulation_method_3 = SubSetInfinity(Integer(round(N)), 0.1, 10, 0.5)

    ###
    #   Simulate
    ###
    pf_1[i], cov_1[i], _ = probability_of_failure(m, limitstate, inputs, simulation_method_1)
    pf_2[i], cov_2[i], _ = probability_of_failure(m, limitstate, inputs, simulation_method_2)
    pf_3[i], cov_3[i], _ = probability_of_failure(m, limitstate, inputs, simulation_method_3)

end



plot(N_samples, pf_1)
plot!(N_samples, pf_2)
plot!(N_samples, pf_3)
plot!(N_samples, ones(length(N_samples)) * True_failure_prob,  xaxis=:log10)


plot(N_samples, cov_1)
plot!(N_samples, cov_2)
plot!(N_samples, cov_3, xaxis=:log10)

ERROR: LoadError: DomainError with -0.0004979094436062081:
sqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).
Stacktrace:
 [1] throw_complex_domainerror(f::Symbol, x::Float64)
   @ Base.Math ./math.jl:33
 [2] sqrt
   @ ./math.jl:677 [inlined]
 [3] estimate_cov(Iᵢ::BitMatrix, pf::Float64, n::Int64)
   @ UncertaintyQuantification ~/Documents/julia/UQ_Fork/UncertaintyQuantification.jl/src/simulations/subset.jl:293
 [4] probability_of_failure(models::Model, performancefunction::typeof(limitstate), inputs::Vector{RandomVariable}, sim::SubSetInfinity)
   @ UncertaintyQuantification ~/Documents/julia/UQ_Fork/UncertaintyQuantification.jl/src/simulations/subset.jl:130
 [5] top-level scope
   @ ~/Documents/julia/UQ_Fork/UncertaintyQuantification.jl/demo/Damask/example_convergance/example_convergance.jl:58
 [6] include(fname::String)
   @ Base.MainInclude ./client.jl:478
 [7] top-level scope
   @ REPL[19]:1
in expression starting at

Gaussian mixture models

Using https://github.com/davidavdav/GaussianMixtures.jl maybe?

Mappings between standard normal and physical space will not exist and the methods should be implemented to throw errors when called.

System reliability

Implement Important Sampling with Design Point

Return cov for probability of failure methods

All the pf methods should return the cov as a second output. For now this includes:

Monte Carlo
LineSampling
SubsetSimulation

Improve PCE code

Some of the PCE code can be significantly improved. By using better approaches to building the basis functions, extra calls to Symbolics.float etc. can be removed.

Having the basis functions as proper Julia functions called with (x) also looks great.

Design of Experiments

Now that the response surface was implemented in #87 we should also add the most used design of experiments such as:

Box-Behnken
Central-Composite
Full factorial
Two-level factorial

ExperimentalDesign.jl could be a solution but it has a lot of dependencies and currently set the compat for Julia to 1.8.

Randomized quasi Monte Carlo

Since v0.3 QuasiMonteCarlo.jl includes various randomization methods for all its sampling methods. We should enable these as well, with sane defaults since some of the sequences should generally not be used unrandomized.

DigitalShift
MatousekScramble
OwenScramble
Shift

Imprecise probabilities

Big discussion for the future

Hybrid metamodel reliability model updating stuff?

Stochastic Processes

Maybe using TimeSeries.jl?

Is it possible to put this into dataframes?

Implement LatticeRuleSampling

The last QMC method I'd like to have implemented is the one based on rank-1 lattice rules. Implementation should be straight forward in line with the other QMC methods.

s = QuasiMonteCarlo.sample(n,lb,ub,LatticeRuleSample())

Tests should be added to test the first few values like with the other methods.

Metamodels

IPM
Kriging
Response surface
Neural networks?

Merge line sampling
Add license (#5)
Update README.md with a list of things currently available and some other info.

friesischscott / uncertaintyquantification.jl Goto Github PK

uncertaintyquantification.jl's Issues

(Pseudo-)Example:

Recommend Projects

Recommend Topics

Recommend Org