friesischscott / uncertaintyquantification.jl Goto Github PK
View Code? Open in Web Editor NEWUncertainty Quantification in Julia
License: MIT License
Uncertainty Quantification in Julia
License: MIT License
In 0.2.10
the interface for LowDiscrepancySampling
(Halton) was changed to include randomization.
The corresponding qmc_sample
must be updated and the compat entry increased to 0.2.10
.
Using relative path for workdir
and solvername
will cause issues during model evaluation (line 23 solver.jl
). Would be convenient to have a check on the paths when model is built.
Apart from RandomVariableSet
what other inputs do we need?
RandomVariable
: No need to wrap the objects from Distributions.jl
.Parameter
: This could potentially just be any subtype of Number
and just be filled into the DataFrame
Function
: If we do this, it should probably just take a simple lambda
function that evaluates one variable and adds it to the DataFrame
. Alternatively this could be omitted and people would need to just include it in their model/solver. Might be nice though for intermediate values or something you might want in your end results.Should we have a common type
for everything that can be an input? This type could be sub-typed together with what we need from Distributions.jl
.
Importance sampling where a hyper sphere with radius
Currently when an external solver encounters an error, Julia will get stuck in the working directory of the current sample. This creates issues when the source paths etc. are defined using pwd()
and joinpath()
.
The pull request #63 needs a unified interface and should probably use some of the methods from Ander's TransitionalMCMC.jl
In the OpenCossan code there are several references to equations without giving the paper they are refering to...
The function polynomialchaos
cannot accept multiple models and/or single input. Would be nice to let the function accept a vector of UQmodel
and have convenienve methods for that and single UQInput
At some point, probably after we added a bunch of neat features, we should get a DOI for the repository to allow ourselves and hopefully others to cite it when it is used for a publication.
Github has a guide here: https://guides.github.com/activities/citable-code/
ERROR: LoadError: MethodError: no method matching iterate(::ExternalModel)
Closest candidates are:
iterate(::Union{LinRange, StepRangeLen}) at range.jl:872
iterate(::Union{LinRange, StepRangeLen}, ::Integer) at range.jl:872
iterate(::T) where T<:Union{Base.KeySet{<:Any, <:Dict}, Base.ValueIterator{<:Dict}} at dict.jl:712
...
Stacktrace:
[1] probability_of_failure(models::ExternalModel, performance::Function, inputs::Vector{RandomVariable}, sim::FORM)
@ UncertaintyQuantification ~/.julia/packages/UncertaintyQuantification/RAsRu/src/reliability/form.jl:20
[2] top-level scope
@ ~/Documents/neutronics/variance_reduction_ND/example/UQ_jl/FORM.jl:51
in expression starting at /Users/angray/Documents/neutronics/variance_reduction_ND/example/UQ_jl/FORM.jl:51
This was fixed from the user side by adding the external solver into an array:
pf, β, dp = probability_of_failure([ext], limitstate, inputs, FORM())
The evaluation of sobols indices for multiple external models creates different folders (based on the different timestamps of each model evaluation).
Regarding the overall structure I think what we should do is something like this:
model
connects several inputs
to one or multiple solvers?
.solvers
are executed in order. The most basic solver being a julia
function that you pass in. Somewhat like the Mio
we have in OpenCossan
.DataFrames
. A Solver would just run whatever it needs to and adds the results to the DataFrame
of the inputs. The end result would then collect all inputs and output values for whatever one might want to do with them afterwards.a = 5
b = Normal()
function add (df::DataFrame)
df.c = df.a .+ df.b
return df
end
solver = Solver(add)
model = Model([a b], ["a" "b"], [solver])
result = run(model, samples=1000)
Something like this, only more complex 🤔.
This could then be extended to include ObjectiveFunctions
for reliability analysis or advanced sampling algorithms, etc.
This is probably related to the correlation estimation of the markov chains. For 1 rv we can most likely fall back to the standard MC way of computing the pf.
Currently, forward differences are used everywhere with a fixed degree. To be more flexible we should allow to specify the exact method used. This will also allow for use of all other parameters such as accounting for noise or different tolerances.
Can probably be a keyword for the gradients and property of the FORM
struct.
gradient(model, inputs, :y; fdm = central_fdm(5, 1, max_range=9e-1))
form = FORM(10, 1e-4; fdm = central_fdm(5, 1, max_range=9e-1))
AS IS:
When running evaluate!
with more than 1 ExternalModel
they store thier results in 2 different folder based on their timestamp. If the second ExternalModel
needs the output of the first model is not possible to use the cleanup
flag.
TO BE:
All the ExternalModel
s store result in the same directory. If the cleanup
flag is step to true
, all the outputs in the result folder are deleted only after the last Model
run.
evaluate!
function applied to ExternalModel (e.g. finite element models) could create a lot of unnecessary output files.
This files can easily saturate ROM when long simulations are performed.
A cleanup flag should be implemented for deleting the whole folder containing the samples of one simulation, after the extraction of all usefull values.
Currently we have Sobol and Halton from two different packages. It might be a good idea to switch to QuasiMonteCarlo.jl and implement everything it offers.
QMC.jl includes:
I vaguely remember an issue with creating high dimensional LH samples in this but I am not sure.
Finish the ABO implementation?
UnivariateDistribution
better represents what the RandomVariable
is supposed to be and is significantly easier to understand.
See paper by Zuev
Separate package but wrapped in here.
QuasiMonteCarlo.jl includes LowDiscrepancySample
which returns a Halton sequence for d > 1
or the 1-dimensional van der Corbut sequence.
The code can be updated to use the QMC package for HaltonSampling so that we can drop the dependency on HaltonSequences.
The package seems to be more active as well and it being under the SciML umbrella gives me confidence, that it is the better dependency for this.
It doesn't seem to skip numbers by default as it does for others, e.g. Sobol. Something we could potentially add in the future to reduce dependencies in higher dimensions.
We should make sure the results are exactly the same just in case there is something different I can't see at first glance.
To make sure everything were doing in regards to QMC is correct we need to do a bit of research and literature review. Mostly I'm interested in:
[0,1]^d
to the marginal distributions of our random variables
[0,1]
to the marginals through the quantile function?Currently the performance of samples that are rejected is evaluated again instead of reused from the previous step. By returning accepted
from candidatesamples
we can split the dataframe and avoid repeated evaluations.
We should add a simple model which is run in parallel (pmap
) when additional workers are present. This is just a convenience model for when the model itself can't be expressed in a vectorized way.
It is more common to have the mean and standard deviation of a random quantity (e.g., obtained from experiments or data sheests) than the parameters of a distribution. Thus, it would be nice to have a function that gets the mean, standard deviation and distribution type and returns the distribution parameters in a tuple.
As a proof of concept, see the functions used to create a Gumbel and a Lognormal distribution.
function distribution_parameters(mean::Real, std::Real, d::Type{Distributions.Gumbel})
β = std * sqrt(6) / pi
μ = mean - MathConstants.γ * β
return μ, β
end
function distribution_parameters(mean::Real, std::Real, d::Type{Distributions.LogNormal})
μ = log(mean^2/sqrt(std^2+mean^2))
σ = sqrt(log(std^2/mean^2+1))
return μ, σ
end
Gumbel(distribution_parameters(50, 5, Gumbel)...)
LogNormal(distribution_parameters(28.8e4-19.9e4,2.64e4, LogNormal)...)
Currently, there are only tests for unrandomized qmc. Tests for randomized Sobol-, Halton- and Lattice Rule Sampling will need to be implemented.
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml
to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix
on this issue.
I'll open a PR within a few hours, please be patient!
The algorithm generates the conditional samples for the next level by representing each random variable by an arbitrary number of hidden variables instead of using Markov Chain Monte Carlo.
Reference: Patelli & AU (2015) Efficient Monte Carlo algorithm for rare failure event simulation
Using QMC to compute sobol indices yields very different results based on the method.
using UncertaintyQuantification
a = Parameter(7, :a)
b = Parameter(0.1, :b)
x = RandomVariable.(Uniform(-π, π), [:x1, :x2, :x3])
inputs = [a, b, x...]
ishigami = Model(
df -> sin.(df.x1) + df.a .* sin.(df.x2) .^ 2 + df.b .* df.x3 .^ 4 .* sin.(df.x1),
:f,
)
n = 10000
s_mc = sobolindices([ishigami], inputs, :f, MonteCarlo(n))
s_ss = sobolindices([ishigami], inputs, :f, SobolSampling(n))
s_hs = sobolindices([ishigami], inputs, :f, HaltonSampling(n))
s_lhs = sobolindices([ishigami], inputs, :f, LatinHypercubeSampling(n))
Just comparing the first order indices:
[0.3038509495659772, 0.4531211968732472, -0.004581507395695991]
(Monte Carlo)[0.0259111519070014, 0.4290425029042932, 0.21306096653679016]
(Sobol)[0.011524353591864793, 0.3774794907941725, 3.188982809095301e-5]
(Halton)[0.3114654127984809, 0.43890237167581103, 0.0005408956632518976]
(LatinHypercube)Monte Carlo and LatinHypercube are able to appoximate the analytical solution while Sobol and Halton are nowhere close.
Since LatinHyperCube, Sobol and Halton all go through the same sampling code, it can't be an issue there.
@AnderGray Any idea? I'm at a loss here.
Performing convergence study for MC, subset, and subset infinity.
Sometimes (randomly) estimate_cov
is trying to take the sqrt of a complex number. Script for the study
using UncertaintyQuantification, DelimitedFiles, Formatting
using Distributions, Plots
###
# Define input random variables
###
X1 = RandomVariable(Normal(0, 1), :X1)
X2 = RandomVariable(Normal(0, 1), :X2)
inputs = [X1, X2]
###
# Define function, and model
###
function y(df)
return df.X1 .+ df.X2
end
m = Model(y, :y)
###
# Define limitstate. Failure is limitstate <= 0
###
function limitstate(df)
return 4 .- reduce(vcat, df.y)
end
###
# True failure for this problem
###
True_failure_prob = 1 - cdf(Normal(), 4/sqrt(2))
N_samples = 10 .^range(2, 5, 50)
pf_1 = zeros(length(N_samples))
pf_2 = zeros(length(N_samples))
pf_3 = zeros(length(N_samples))
cov_1 = zeros(length(N_samples))
cov_2 = zeros(length(N_samples))
cov_3 = zeros(length(N_samples))
for (i, N) in enumerate(N_samples)
println(i)
###
# Try different simulation methods
###
simulation_method_1 = MonteCarlo(Integer(round(N * 3)))
simulation_method_2 = SubSetSimulation(Integer(round(N)), 0.1, 10, Normal())
simulation_method_3 = SubSetInfinity(Integer(round(N)), 0.1, 10, 0.5)
###
# Simulate
###
pf_1[i], cov_1[i], _ = probability_of_failure(m, limitstate, inputs, simulation_method_1)
pf_2[i], cov_2[i], _ = probability_of_failure(m, limitstate, inputs, simulation_method_2)
pf_3[i], cov_3[i], _ = probability_of_failure(m, limitstate, inputs, simulation_method_3)
end
plot(N_samples, pf_1)
plot!(N_samples, pf_2)
plot!(N_samples, pf_3)
plot!(N_samples, ones(length(N_samples)) * True_failure_prob, xaxis=:log10)
plot(N_samples, cov_1)
plot!(N_samples, cov_2)
plot!(N_samples, cov_3, xaxis=:log10)
ERROR: LoadError: DomainError with -0.0004979094436062081:
sqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).
Stacktrace:
[1] throw_complex_domainerror(f::Symbol, x::Float64)
@ Base.Math ./math.jl:33
[2] sqrt
@ ./math.jl:677 [inlined]
[3] estimate_cov(Iᵢ::BitMatrix, pf::Float64, n::Int64)
@ UncertaintyQuantification ~/Documents/julia/UQ_Fork/UncertaintyQuantification.jl/src/simulations/subset.jl:293
[4] probability_of_failure(models::Model, performancefunction::typeof(limitstate), inputs::Vector{RandomVariable}, sim::SubSetInfinity)
@ UncertaintyQuantification ~/Documents/julia/UQ_Fork/UncertaintyQuantification.jl/src/simulations/subset.jl:130
[5] top-level scope
@ ~/Documents/julia/UQ_Fork/UncertaintyQuantification.jl/demo/Damask/example_convergance/example_convergance.jl:58
[6] include(fname::String)
@ Base.MainInclude ./client.jl:478
[7] top-level scope
@ REPL[19]:1
in expression starting at
Using https://github.com/davidavdav/GaussianMixtures.jl maybe?
Mappings between standard normal and physical space will not exist and the methods should be implemented to throw errors when called.
All the pf
methods should return the cov
as a second output. For now this includes:
Some of the PCE
code can be significantly improved. By using better approaches to building the basis functions, extra calls to Symbolics.float
etc. can be removed.
Having the basis functions as proper Julia functions called with (x)
also looks great.
Now that the response surface was implemented in #87 we should also add the most used design of experiments such as:
ExperimentalDesign.jl could be a solution but it has a lot of dependencies and currently set the compat for Julia to 1.8.
Since v0.3
QuasiMonteCarlo.jl includes various randomization methods for all its sampling methods. We should enable these as well, with sane defaults since some of the sequences should generally not be used unrandomized.
Big discussion for the future
Maybe using TimeSeries.jl?
Is it possible to put this into dataframes?
The last QMC method I'd like to have implemented is the one based on rank-1 lattice rules. Implementation should be straight forward in line with the other QMC methods.
s = QuasiMonteCarlo.sample(n,lb,ub,LatticeRuleSample())
Tests should be added to test the first few values like with the other methods.
This is quite weird in OpenCossan and I am not 100% how it works. We should try this directly from the paper and see how it works.
Now that 1.7 is released and 1.6 has become the new lts we should update the julia compatability to 1.6 and 1.7. Tests should also run on both versions. From now on I think it makes sense to always test on the lts and the latest stable.
There is already a basic implementation of the Hasofer-Lind Reliability Index as FORM
in the form
branch. I don't remember how well this actually performs. It needs to be updated and properly tested.
We need unit tests and a demo script with a suitable example.
We need to choose a license for the project. GNU GPLv3 or MIT?
I will register the package in the General repository of Julia to reserve the name.
Tasks to be done before registering the package as 0.1.0
:
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.