The imcmcrun from chaoshunh

IMCMCrun is used to perform a seismic inversion using a fast eikonal solver and an algorithm based on interactive markov chains. (See the article Bottero, Alexis, et al. "Stochastic seismic tomography by interacting Markov chains." Geophysical Journal International 207.1 (2016): 374-392).
Author : Alexis Bottero (alexis Dot bottero At gmail Dot com). Feel free to contact me for any question. 
Source code (c/c++/fortran) can be found in directory src. This program include wavelet library : Wavelib. 
This program can be run in parallel on a CPU cluster. Directory utils contains some useful python scripts to watch the results of an inversion.

PREREQUISITES
=============

_C++/Fortran compilers : GNU (g++/gfortran) or Intel (icc/ifort) 
_For parallel usage : openmpi
_For the python scripts to analyse the results : python2 with modules numpy,matplotlib,argparse,glob,mpl_toolkits
(Normaly included with all Debian distributions)
_fftw3 http://www.fftw.org/download.html. This program is needed. It is easy to install (read INSTALL file). Just 
mind the place where you install it. By default it is installed in /usr/local/ (/usr/local/bin, /usr/local/lib, /usr/local/include and
/usr/local/share) but you can specified an other directory when configuring by using --prefix:
./configure --prefix=/absolute/path/to/the/directory/you/want
_gmp https://gmplib.org. Download it, go to parent directory and type: 
   ./configure --prefix=/usr
	 sudo make
	 sudo make check
   sudo make install
_THEN install MPFR http://www.mpfr.org/.Download it, go to parent directory and type:
   curl http://www.mpfr.org/mpfr-3.1.3/allpatches | patch -N -Z -p1
   ./configure --prefix=/usr
	 sudo make
	 sudo make check
   sudo make install
   
GETTING STARTED 
===============

1.Edit the Makefile:
     _FFTW3_INCLUDE=/path/to/fftw3/install/directory/include     # default /usr/local/include
     _FFTW3_LIB=/path/to/fftw3/install/directory/lib             # default /usr/local/lib
     _Mind linker flags to handle Fortran/C++ instructions : _For GNU MPI compilers -lgfortran
                                                             _For INTEL compilers -lifcore
                                                      
2.Go to examples/HomogeneousSynthetic

3.Edit configExample.cfg, this file contains all the information on this example run (have a look at it! convenient 
  highlighting : Ruby)
     _Change the line : DATA_DIRECTORY = /absolute/path/to/IMCMCrun/examples/HomogeneousSynthetic/dataExample/

4.Edit run_that_example.sh. Have a look to that script as well, it will compile the program and run the example.
     _MPI mode : MPI='yes' (otherwise 'no') and set the number of processes in NPROC=...

5. Execute the file run_that_example.sh (./run_that_example.sh). This will appears :

    Starting Program...

    ========================================================
                     IMPORTANCE RESAMPLING MCMC             
    ========================================================
    This is the run number 999                                  --> This code is generated randomly : remember it

    ...
       the program will run for a while...
    ...
    
6. At the end to see the results of the run (replace 999 by the code of your run) :
    ../../utils/watchResults.py ../../OUTPUT_FILES/999/ -a


MANUAL
======

This part just contains few information. Please read carefully the HomogeneousSynthetic/configExample.cfg as it contains details
on the different options.

Max and min values are computed from iteration TRESH=50 for now. This value can be modified in src/define.h

The wavelet transforms are performed on profiles of the same size than the first guess file(s)

Velocity model on file :     velocity model calculated by FTeik :

 z    velocity
                                  z=0  ------------
2.625   4125.32                            4125.32
                               z=5.25  ------------      
7.875   4112.56                            4112.56
                               z=10.5  ------------
13.125  4116.12     ------>                4116.12
                              z=15.75  ------------
18.375  4128.23                            4128.23
                               z=21.0  ------------
23.625  4124.78                            4124.78
    ...                       z=26.25  ------------
    ...                              ...
All z borders are automaticaly calculated by the program
    
As filtered profiles are of lower frequency, we can remove points before performing the eikonal without changing the result. 
This is done as below :  !! Warning in c++ indices start at 0 (to nz-1) !! 

  z[1]   -        zFilt[1]     -            ^
        dz                  dzFilt          |
  z[2]   -                                  |
                  zFilt[2]     -            |
  z[3]   -   --->                           |
                                            |
  z[4]   -        zFilt[3]     -            | zmax-zmin = (nz-1)*dz = (nzFilt-1)*dzFilt   =>   dzFilt = dz*(nz-1)/(nzFilt-1)
                                            |
  z[5]   -                                  |
                                            |
   ...                ...                   |
                                            |
  z[nz] -        zFilt[nzFilt] -            v
 
 
 The value of the velocity in each interval of the filtered profile is interpolated.

PARAMETERIZATION
================

The objective of the program is to explore the parameter space: {P=(p0,p1,...,pN)}. These parameters describing a velocity profile:

                                               velocity profile = parameterization(P)

There are two possibilities in this program.
  _If waveletParameterization is set to 0 the program uses a layer based parameterization:
      If SWAVES calculated:
        P=(vp0,vp1, ... ,vpNPU-1,vs0,vs1,...,vsNPU-1) --> Hence the dimension of the problem is: 2*NPU
      Otherwise:
        P=(vp0,vp1, ... ,vpNPU-1) --> Hence the dimension of the problem is: NPU
      Where NPU is given in the configuration file.

      The setting is as following:

      ZTOP ---------------  (ZTOP is given, it is not a parameter)
               vp0,vs0
       z1  ---------------
               vp1,vs1
       z2  ---------------
               vp2,vs2
       z3  ---------------
                 ...
    zNPU-1 ---------------
             vpNPU-1,vsNPU-1
   zBOTTOM ---------------  (ZBOTTOM > ZTOP is given, it is not a parameter)

   z1,z2... are calculated from ZTOP,ZBOTTOM and NPU

  _If waveletParameterization is set to 1 the program uses a parameterization based on wavelet transform. In this cases P is a set
   of wavelet coefficients, the profile is obtained by inverse wavelet transform:
      If SWAVES calculated:
        P=(coeffsP0,coeffsP1, ... ,coeffsPNPU-1,coeffsS0,coeffsS1, ... ,coeffsSNPU-1)  --> Hence the dimension of the problem is: 2*NPU
      Otherwise:
        P=(coeffsP0,coeffsP1, ... ,coeffsPNPU-1)  --> Hence the dimension of the problem is: NPU
   Each wavelet coefficients is associated with an index. These indices are determined by filtering the first guess velocity profiles.
   Then -> if KEEP_FIRST_VALUES = 1 we keep the indices of the NPU biggest coefficients
        -> if KEEP_FIRST_VALUES = 0 we keep the NPU first indices
 
FILES CREATED BY THE PROGRAM
============================
I is the index of the chain at temperature TI

_Prior features file : if minParameters[i] = maxParameters[i] for a given i the parameter will be considered as static. We do not
inverse it (it is not a real parameter actually)

_Times files * (the second column is needed if we calculate S waves (flag SWAVES)): 
  timeP(shot 1,receiver 1) timeS(shot 1,receiver 1)
  timeP(shot 1,receiver 2) timeS(shot 1,receiver 2)
  ...
  timeP(shot 1,receiver NR) timeS(shot 1,receiver NR)
  timeP(shot 2,receiver 1) timeS(shot 2,receiver 1)
  ...
  timeP(shot 2,receiver NR) timeS(shot 2,receiver NR)
  timeP(shot 3,receiver 1) timeS(shot 3,receiver 1)
  ...
  ...
  timeP(shot 1,receiver 1) timeS(shot 1,receiver 1) 
  
  ->Negative times are ignored

XXX are 3 digit that are different for each run. They are used distinguish files from different runs

_calculatedTimes.XXX.dat : Created by analytical run, contains the times calculated from the real model given (same structure than *)

_chainI.XXX.dat : param1 | param2 | param3 | ... | paramN | Energy   |
                  param1 | param2 | param3 | ... | paramN | Energy   | steps
                  param1 | param2 | param3 | ... | paramN | Energy   v
                                    ...
_statsI.XXX.dat : at | rt | od | ps | as | rs | acceptance probability | swapping probability
       at : Number of accepted MH transitions
       rt : Number of rejected MH transitions
       od : Number of "out of domain"s
       ps : Number of proposed swaps
       as : Number of accepted swaps
       rs : Number of rejected swaps

_ll.XXX.dat : N | i | sp
       i  : Index of the chain whose state has been swapped
       N  : Index of the state of the chain i that have been swapped 
       sp : Index of the state of the chain i+1 that has been chosen

_sci.XXX.dat : Importance Weights matrix 

_averagePI.XXX.dat : average P wave velocity with depth generated by chain I

_averageSI.XXX.dat : average S wave velocity with depth generated by chain I

_varPI.XXX.dat : variance of P wave velocity with depth for chain I

_varSI.XXX.dat : variance of S wave velocity with depth for chain I

_qInfPI.XXX.dat : inferior value of the quantile calculated for P wave velocities and chain I

_qInfSI.XXX.dat : inferior value of the quantile calculated for S wave velocities and chain I

_qSupPI.XXX.dat : superior value of the quantile calculated for P wave velocities and chain I

_qSupSI.XXX.dat : superior value of the quantile calculated for S wave velocities and chain I

_bestModelTimes.XXX.dat : times corresponding to the best model (the origin time of the shots are not taken into account)

TODO finish that

TIPS
====

_If you want to run the program again you don't need to compile it again, just edit the file run_that_example.sh : MAKEALL='no'
_To delete all file created but not OUTPUT_FILES : "make clean", to delete everything "make purge"
_If there is a bug the first thing to do is to set VERBOSE1=1 and VERBOSE2 = 1 in .cfg file
_If you stop a run before it ends you can still analyse the first results with utils/watchResults.py
_If you want to run EXACTLY the same example (to calculate more iterations for example) let say it is the run number XXX. Copy the
seed at the beginning of the file config.XXX.dat then use the same cfg file but turning USE_DEFAULT_SEED to 1 and pasting the seed
in DEFAULT_SEED. The parameters of the run must remain the same (but you can display more details, write files...)
                               
WARNINGS
========

_If you perform an ANALYTICAL_RUN, the file NAME_OF_REAL_PROFILE_FILE_P (and NAME_OF_REAL_PROFILE_FILE_S) must be of the form :
z | v(z) with a step fixed (i.e z[1]-z[0] = z[2]-z[1] = ...)
_Do not put source too close from z borders there is no security for now (this might cause segmentation faults in the eikonal solver)
_The grid is automaticaly set from the position of sources/receivers. The margin in driven by COORD_TOL.
_If by chance two runs have the same key XXX the last one will erase the previous one (1 chance over 1000 but it happens! Apply make purge often)
_Z must point downwards (...z[2]>z[1]>z[0])
_Coordinates must be >> TINYVAL (defined in src/define.h).
_In the .cfg file "NC = 4,4," for example -> will produce a bug (we would have to trim useless comas)
_We need a dyadic (2^N : 32,64,128,256...) number of points to describe the first guess or real velocity model curves 
 (to make the wavelet transform)

TODO
====

How to make this code better : _Improve this README file
                               _Add a comment line at the beginning of each file written explaining what it is
                               _Improve eik3d
                               _Put TRESH (files_and_control) in .cfg file
                               _NXREF NYREF for find optimum grid !!
                               _Remove system commands
                               _Replace exit commands by custom_exit (create that function taking MPI into account. It could create a file. Put the "terminating" messages on it)
                               _Add options in the Makefile and put *genmod* files in obj
                               _verify that dz is constant
                               _Continue the function : checkConfig (filesAndControl.cpp)
                               _Ctrl+f TODO in all files
                               _Parallelize the function createDataset (initializations.cpp)
                               _Make first guess curves optionnal
                               _Parallelize on chains
                               _Add an estimation of the final time (function printEvolution)
                               _While we give Configuration* to every function we should create a Singleton
                               _To avoid if(config->verbose1 && config->mpiConfig.rank == 0) everywhere we might redefine std::cout?
                               _Use FTeik2D for 2D problems and a simple calculation (t=d/v) for 1D problems.
                               _Write "Mean differences at the receivers between big run and that run for shot number "<< shotNumber << " : "  << averageDiffs[min_index] on a file
                               _Improve findOptimumGrid and buildPrior
                               _Test if first point of z is negative
                               _Complete config file       
                               _In build prior calculate Tmax (add an option: automaticaly calculate Tmax)
                               _Script redoThatRun.sh XXX -> copy files, par_file, seed...
                               _Write all files with c++ fashion (filesAndControl.cpp)
                               _Record min and max velocity values after a given iteration (not only from the beginning)
                               _Add a burn-in
                               _Implement the small algo of Thomas to avoid the out-of-domain, see utils (we simulate in a tore)
chaoshunh / imcmcrun Goto Github PK

imcmcrun's Introduction

imcmcrun's People

Contributors

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent