
pyfpt's Introduction


What is PyFPT?

PyFPT is a Python/Cython package for running first-passage time (FPT) simulations using importance sampling. An FPT problem asks how long a stochastic process takes to first cross some threshold.

This package will let you numerically investigate the tail of the probability density for first passage times for a general 1D Langevin equation.

The tail of the probability density is investigated using importance sampling: a bias increases the probability of large FPTs, producing a sample distribution whose realisations are then weighted to reproduce the rare events of the target distribution. This allows very rare events (which would normally need a supercomputer) to be simulated efficiently on just your laptop!
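As a rough illustration of the weighting idea described above, the following minimal sketch simulates a 1D Langevin equation with a constant bias added to the drift and accumulates the corresponding weight for each run. It is not PyFPT's implementation: the function name `biased_fpt`, the constant drift, diffusion and bias values, and the simple Euler-Maruyama stepping are all assumptions chosen for illustration.

```python
import numpy as np


def biased_fpt(x_in, x_end, drift, diffusion, bias, dt, rng, max_steps=1_000_000):
    """Return (first-passage time, weight) for one biased run.

    Hypothetical helper: x starts at x_in > x_end and the run stops once
    x <= x_end. drift, diffusion and bias are constants in this sketch.
    """
    x, t, log_w = x_in, 0.0, 0.0
    sqrt_dt = np.sqrt(dt)
    for _ in range(max_steps):
        if x <= x_end:
            # Weight = probability of this noise history under the unbiased
            # (target) dynamics divided by that under the biased dynamics.
            return t, np.exp(log_w)
        xi = rng.standard_normal()
        # Euler-Maruyama step of the biased (sample) process
        x += (drift + bias) * dt + diffusion * sqrt_dt * xi
        t += dt
        # Log of the per-step likelihood ratio for Gaussian noise
        log_w -= bias * sqrt_dt * xi / diffusion + 0.5 * bias**2 * dt / diffusion**2
    raise RuntimeError("threshold not crossed within max_steps")


rng = np.random.default_rng(seed=1)
# A positive bias opposes the negative drift, so long FPTs become more likely
runs = [biased_fpt(1.0, 0.0, -1.0, 0.3, 0.5, 0.01, rng) for _ in range(200)]
fpts, weights = np.array(runs).T
print("mean sampled FPT:", fpts.mean())
print("weighted mean FPT (target estimate):", np.sum(weights * fpts) / np.sum(weights))
```

With the bias set to zero every weight is exactly one and the sketch reduces to a direct simulation. A histogram of `fpts` weighted by `weights` would then estimate the target probability density, including its tail.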

Note, it was originally developed to find the local number of e-folds in slow-roll stochastic inflation. As such, analytical functionality is also included for this particular problem in the analytics module.


Documentation

You can find the latest documentation on PyFPT's ReadTheDocs page.

Requirements

Operating System

As PyFPT uses Cython to optimise the stochastic simulations, a C compiler is required for installation. Therefore, PyFPT does not currently run directly on Windows (future releases hope to address this issue). Windows users can either install PyFPT on a virtual machine or use a cloud-based service such as SciServer.

Mac and Linux users should be able to install PyFPT directly, as these operating systems have a C compiler. Do feel free to raise an issue or contact us if you have any problems.

Packages

The following packages are required to run PyFPT

Many of these are included in common Python distributions like Anaconda. You can check which packages you already have installed with pip list.


Getting Started

User Guide

The documentation contains a user guide, whose code you can run yourself by downloading the guides as interactive Jupyter notebooks.

Installation

The package can be installed by using the following command

pip install PyFPT

in the command line wherever you have Python installed.

You can also clone the PyFPT repository

git clone https://github.com/Jacks0nJ/PyFPT.git

to work on it locally. This would require compiling the Cython code (the .pyx files) locally as well.


Example Results

The PyFPT package can be used to investigate far into the tail of the probability density (down to 10^-34 and beyond!)

Or even deviations from Gaussianity!

In the above images, 'N' is the first-passage time in stochastic inflation.

See the user guides for details on how you can make these figures yourself!


Unit Testing

PyFPT uses the unittest module to maintain the code. Almost all functions have some form of basic unit testing, which hopefully will be further developed as the project continues. The tests can all be found in the tests folder.

If pytest is installed, the tests can be run locally using

pytest -v

This tests the functions which have been installed using pip. The easiest way to run the test suite on any modified functions is to push your branch to the repo, as the uploaded tests run on every commit.


Roadmap

  • Simulate first-passage times of slow-roll inflation
  • Use importance sampling to investigate rare realisations.
  • Make general, for any 1D Langevin equation
  • Add multi-dimensionality
    • Add the acceleration of the field
    • Add more sophisticated noise models

See the open issues for a full list of known issues.


Development branches

As different contributors continue to develop the code, they will do so in several different branches. Therefore, it cannot be guaranteed that any branch other than main will be fully functional at any one time. The main branch will be the current release of the code available on PyPI and what you will install using pip.

Contributing

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

And we will review your request!


Bugs

This is the initial release of PyFPT, so it is expected there will be some minor bugs. Feel free to report them by raising an issue on GitHub, emailing [email protected], or forking the repository with your fix.

Your feedback is very much appreciated!

License

Distributed under an Apache-2.0 License. See LICENSE.txt for more information.


Contact

Joe Jackson - [email protected]

Project Link: https://github.com/Jacks0nJ/PyFPT


Acknowledgments

We would like to thank the following contributors to PyFPT, be it through physical understanding of first-passage time processes or help developing the package

The Physics

  • David Wands
  • Vincent Vennin
  • Kazuya Koyama
  • Hooshyar Assadullahi

Package Development

  • Coleman Krawczyk
  • Ian Harry

Logo

  • Will Jackson

Resources

The following resources were instrumental in developing the project into a package usable by the community:


pyfpt's People

Contributors

danielskatz, github-actions[bot], jacks0nj


pyfpt's Issues

[JOSS Review] User Guide 2: Understanding the Data

Re JOSS #4509

Running the second user guide notebook, cell 9 aborts with

for i, bias_amp in enumerate(bias_amp_range):
    print('Now simulating for bias amplitude '+str(bias_amp))
    bin_centres, heights, errors =\
        fpt.numerics.is_simulation(drift_func, diffusion_func, phi_in, phi_end,
                                   num_runs, bias_amp, dN, save_data=True,
                                   estimator='lognormal', bins=num_bins)
    
    # Storing the results in NumPy arrays
    bin_centres_storage[:len(bin_centres),i] = bin_centres
    heights_storage[:len(heights),i] = heights
    # Remeber this is a 3D array
    errors_storage[i,:,:len(heights)] = errors
/opt/anaconda3/envs/fpt_review/lib/python3.10/site-packages/numpy/lib/function_base.py:2830: RuntimeWarning: invalid value encountered in true_divide
  c /= stddev[None, :]

Saved data to file IS_data_x_in_6.48_iterations_200000_bias_0.0.csv

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [9], in <cell line: 1>()
      1 for i, bias_amp in enumerate(bias_amp_range):
      2     print('Now simulating for bias amplitude '+str(bias_amp))
      3     bin_centres, heights, errors =\
----> 4         fpt.numerics.is_simulation(drift_func, diffusion_func, phi_in, phi_end,
      5                                    num_runs, bias_amp, dN, save_data=True,
      6                                    estimator='lognormal', bins=num_bins)
      8     # Storing the results in NumPy arrays
      9     bin_centres_storage[:len(bin_centres),i] = bin_centres

File /opt/anaconda3/envs/fpt_review/lib/python3.10/site-packages/pyfpt/numerics/is_simulation.py:229, in is_simulation(drift, diffusion, x_in, x_end, num_runs, bias, time_step, bins, min_bin_size, num_sub_samples, estimator, save_data, t_in, t_f, x_r)
    224         save_data_to_file(fpt_values, w_values, x_in, num_runs,
    225                           bias(x_in, 0), extra_label='_custom_bias')
    227 # Now analysisng the data to creating the histogram/PDF data
    228 bin_centres, heights, errors, num_runs_used, bin_edges_untruncated =\
--> 229     data_points_pdf(fpt_values, w_values, estimator, bins=bins,
    230                     min_bin_size=min_bin_size,
    231                     num_sub_samples=num_sub_samples)
    232 # Return data as lists
    233 return bin_centres.tolist(), heights.tolist(), errors.tolist()

File /opt/anaconda3/envs/fpt_review/lib/python3.10/site-packages/pyfpt/numerics/data_points_pdf.py:121, in data_points_pdf(data, weights, estimator, bins, min_bin_size, num_sub_samples)
    119 if estimator == 'naive':
    120     heights = heights_raw/histogram_norm
--> 121     errors = jackknife_errors(data, weights, bins, num_sub_samples)
    122     if isinstance(min_bin_size, int) is True:
    123         heights = heights[filled_bins]

File /opt/anaconda3/envs/fpt_review/lib/python3.10/site-packages/pyfpt/numerics/jackknife_errors.py:53, in jackknife_errors(data_input, weights_input, bins, num_sub_samps)
     49 height_array = np.zeros((num_bins, num_sub_samps))  # Storage
     51 # Next organise into subsamples
     52 data =\
---> 53     np.reshape(data, (int(data.shape[0]/num_sub_samps), num_sub_samps))
     54 weights =\
     55     np.reshape(weights, (int(weights.shape[0]/num_sub_samps),
     56                          num_sub_samps))
     58 # Find the heights of the histograms, for each sample

File <__array_function__ internals>:180, in reshape(*args, **kwargs)

File /opt/anaconda3/envs/fpt_review/lib/python3.10/site-packages/numpy/core/fromnumeric.py:298, in reshape(a, newshape, order)
    198 @array_function_dispatch(_reshape_dispatcher)
    199 def reshape(a, newshape, order='C'):
    200     """
    201     Gives a new shape to an array without changing its data.
    202 
   (...)
    296            [5, 6]])
    297     """
--> 298     return _wrapfunc(a, 'reshape', newshape, order=order)

File /opt/anaconda3/envs/fpt_review/lib/python3.10/site-packages/numpy/core/fromnumeric.py:57, in _wrapfunc(obj, method, *args, **kwds)
     54     return _wrapit(obj, method, *args, **kwds)
     56 try:
---> 57     return bound(*args, **kwds)
     58 except TypeError:
     59     # A TypeError occurs if the object does have such a method in its
     60     # class, but its signature is not identical to that of NumPy's. This
   (...)
     64     # Call _wrapit from within the except clause to ensure a potential
     65     # exception has a traceback chain.
     66     return _wrapit(obj, method, *args, **kwds)

ValueError: cannot reshape array of size 199968 into shape (9998,20)
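The reshape fails because the 199968 data points are not divisible by the 20 subsamples: int(199968/20) is 9998, and 9998 × 20 = 199960. One possible workaround, sketched below, is to drop the trailing remainder before reshaping. The helper name `split_into_subsamples` is hypothetical, and this is not necessarily the fix adopted in PyFPT.

```python
import numpy as np


def split_into_subsamples(values, num_sub_samps):
    """Reshape a 1D array into (rows, num_sub_samps), dropping any remainder.

    Hypothetical helper: the last len(values) % num_sub_samps entries are
    discarded so that the reshape is always valid.
    """
    values = np.asarray(values)
    rows = values.shape[0] // num_sub_samps
    usable = rows * num_sub_samps
    return values[:usable].reshape(rows, num_sub_samps)


data = np.arange(199968, dtype=float)
print(split_into_subsamples(data, 20).shape)
```

The data and the weights would need to be trimmed in the same way so that each first-passage time keeps its own weight.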

Small typo in the MD cell after input [5]: "Now do let's do a 2HD histogram as a colour map."

error in ws

When the bias is very large and the code probes the far tail, the weights are cast to complex numbers stored as strings for an unknown reason. There are also NaNs in the data set when this has occurred. Related?

The exact error was "float() argument must be a string or a real number, not 'complex'" for the heights storage.

Make a check to stop the data analysis when this occurs.
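A check of this kind could look like the minimal sketch below, which rejects weights that are not real numbers or that contain NaN or infinite values before any analysis is attempted. The function name `validate_weights` is hypothetical and is not part of PyFPT.

```python
import numpy as np


def validate_weights(weights):
    """Return the weights as a float array, or raise ValueError if unusable."""
    w = np.asarray(weights)
    # dtype kinds: 'f' float, 'i' signed int, 'u' unsigned int.
    # Complex ('c'), string ('U'/'S') and object ('O') arrays are rejected.
    if w.dtype.kind not in "fiu":
        raise ValueError(f"weights must be real numbers, got dtype {w.dtype}")
    if not np.all(np.isfinite(w)):
        raise ValueError("weights contain NaN or infinite values")
    return w.astype(float)


print(validate_weights([0.5, 1.0, 2.0]))
```

Raising early keeps the failure close to its cause, rather than surfacing later as a conversion error when the histogram heights are stored.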

See Parth's file.

[JOSS Review] Comments on User Guide 3 notebook

Re openjournals/joss-reviews#4607

This is a very nice notebook as it clearly demonstrates PyFPT's limitations and points to ways to overcome these. I found parts of it a bit hard to understand and noticed a number of typos, as listed below.

I am not making a pull request, as merging notebooks can be nasty.

  • "the code has flagged that using the $p$-values that the lognormal assumption maybe incorrect." -> the code has flagged the distribution as possibly not being lognormal according to the $p$ values.
  • "as we already have the data saved to" -> as we have already saved the data to
  • "re-analysis it" -> re-analyse it
  • "But one can find that the exponential tail shows to the next-to-leading order pole using the advanced root-finding and analysis techniques of" -> I find this sentence hard to understand. Is the "to" in "... shows to the ..." correct?
  • Can you provide a link to the mathematica notebook where this point is analysed? Some readers may be interested. Also, please give the version and a citation/reference link for mathematica.
  • "the scatter of data points is much greater than when bias_amp=0.8" -> the scatter of data points is much greater than with bias_amp=0.8
  • "Second, the 2D histogram shows the spread of N to larger values is much worse than bias=0.8" -> "Second, the 2D histogram shows the spread of N to larger values is much worse than with bias=0.8"
  • Mathmatica -> Mathematica (see also comment above)
  • "This is case suffers the" -> "This is case suffers from the"
  • Third, the variance of the weights $w$ within each bin increases with T also increases at an even greater rate than the finite $\phi_\mathrm{UV}$ case. -> Third, the variance of the weights $w$ within each bin increases with N at an even greater rate than the finite $\phi_\mathrm{UV}$ case.
  • editting -> editing the
  • is used a wrapper -> is used as a wrapper
  • But if one is interested in analysissing the tail of a particular inflation model, a clone of PyFPT can be made and edited it to contain drift and diffusion as ... -> But if one is interested in analysing the tail of a particular inflation model, the corresponding Cython functions can be added to fpt.numerics.importance_sampling_cython.pyx, in a cloned PyFPT repository (or a fork of the latter).
  • This change will to at least a 2x speed up -> This can speed up calculations by a factor 2 or more.
  • such that region of interest is investigated. -> such that the region of interest is investigated.
  • technocal -> technical

[JOSS Review] Testing

Re JOSS #4509

Testing

I could not find instructions for how to run tests. I found that running

$> pytest -v

in the root dir works, though.

tests/test_is_simulation.py::TestIS_Simulation::test_is_simulation
generates two matplotlib plot windows that need to be closed manually before the tests continue. I find that slightly annoying, as the tests cannot be run automatically. As I'd like to recommend adding the tests to a CI setup (e.g. GitHub Actions), this would have to be addressed. One solution could be to check whether the test is run interactively or within an automated CI setup, and have the test generate these plots or not accordingly.
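The suggestion above could be implemented along the following lines. This is only a sketch: the helper names are hypothetical, and it relies on the CI environment variable, which GitHub Actions sets to "true" (together with GITHUB_ACTIONS). Switching matplotlib to the non-interactive Agg backend means that figures are still created but no window is opened.

```python
import os


def running_in_ci(environ=os.environ):
    """Guess whether the code runs under CI from common environment variables."""
    return environ.get("CI", "").lower() == "true" or "GITHUB_ACTIONS" in environ


def select_backend(environ=os.environ):
    """Return 'Agg' (no windows) under CI, or None to keep the default backend."""
    return "Agg" if running_in_ci(environ) else None


def configure_matplotlib(environ=os.environ):
    """Apply the selected backend; call this before importing matplotlib.pyplot."""
    backend = select_backend(environ)
    if backend is not None:
        import matplotlib  # imported lazily so the helpers above stay dependency-free
        matplotlib.use(backend)


print(select_backend({"CI": "true"}))
```

Calling `configure_matplotlib()` at the top of the test module would let the same test run unattended in CI while still showing the plots during interactive runs.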

I suggest adding information about how to run tests to the README and documentation.

Also, there seems to be some coverage testing functionality, as indicated by lines 16 and 17 in setup.cfg. It would be great to learn how to run coverage testing and get a coverage report. I assume coverage is great (looking at the impressive number of test modules), but I'd like to see the actual numbers ;)

Error in User Guide 1

Hi @Jacks0nJ, I am one of the reviewers for the JOSS paper.

Working through User Guide 1 (https://pyfpt.readthedocs.io/en/latest/User_guide1.html), the following raises an error:

>>> gaussian_pdf = fpt.analytics.gaussian_pdf(V, V_dif, V_ddif, phi_in, phi_end)
>>> plt.errorbar(bin_centres, heights, yerr=errors, fmt=".", ms=7,
...              label='{0}'.format(r'Importance Sample'))
>>> plt.plot(bin_centres, gaussian_pdf(bin_centres),
...          label='{0}'.format('Gaussian'), color='k')
>>> # Need to use log scale to see data in the far tail
>>> plt.yscale('log')
>>> plt.xlabel(r'$\mathcal{N}$')
>>> plt.ylabel(r'$P(\mathcal{N})$')
>>> plt.legend()

Similarly with the Edgeworth series.

This is because the following doesn't work as bin_centres is a list:

>>> gaussian_pdf(bin_centres)

This is fixed by:

>>> [gaussian_pdf(b) for b in bin_centres]

or by:

>>> import numpy as np
>>> gaussian_pdf(np.array(bin_centres))

More checks

Add a check to make sure the returned heights, bin centres and errors are all of the same length.
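A minimal sketch of such a check is given below. The function name `check_consistent_lengths` is hypothetical. It assumes that the errors are either a 1D sequence or a 2D array of lower and upper errors whose last axis runs over the bins, as the 3D errors storage in the second user guide suggests.

```python
import numpy as np


def check_consistent_lengths(bin_centres, heights, errors):
    """Raise ValueError unless all three outputs describe the same bins."""
    num_bins = len(bin_centres)
    if len(heights) != num_bins:
        raise ValueError(
            f"{len(heights)} heights but {num_bins} bin centres")
    # The last axis is assumed to index the bins (1D or 2D errors)
    errors = np.asarray(errors)
    if errors.ndim == 0 or errors.shape[-1] != num_bins:
        raise ValueError(
            f"errors have shape {errors.shape} but there are {num_bins} bins")


check_consistent_lengths([1.0, 2.0], [0.3, 0.1], [[0.01, 0.02], [0.02, 0.03]])
print("lengths are consistent")
```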
