jason-ash / pyesg Goto Github PK

Economic scenario generator for python: simulate stocks, interest rates, and other stochastic processes.

License: MIT License

Python 100.00%

actuarial financial-modeling open-source python stochastic-processes

pyesg's Issues

validate dimension of input arrays to StochasticProcess methods

#44 removed the JointStochasticProcess class and consolidated everything into the StochasticProcess abstract base class. I added a dim parameter to this class that is currently unused. I think this parameter should be used (at least) to validate the x0 arrays that are passed to the class methods, and also to ensure the correlation matrix (if required) is the correct size.
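
A minimal sketch of what such a check could look like. The function name `validate_inputs` and the convention that the last axis of x0 indexes the process dimensions are assumptions made for illustration, not part of pyesg.

```python
import numpy as np


def validate_inputs(x0, dim, correlation=None):
    """Hypothetical helper: check x0 and an optional correlation matrix against dim."""
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))
    # assumption: the last axis of x0 indexes the dimensions of the process
    if x0.shape[-1] != dim:
        raise ValueError(f"x0 has last dimension {x0.shape[-1]}, expected {dim}")
    if correlation is not None:
        correlation = np.asarray(correlation, dtype=float)
        if correlation.shape != (dim, dim):
            raise ValueError(
                f"correlation has shape {correlation.shape}, expected {(dim, dim)}"
            )
    return x0


x0 = validate_inputs([100.0, 50.0], dim=2, correlation=np.eye(2))
```

A mismatched input such as `validate_inputs([1.0, 2.0, 3.0], dim=2)` raises a ValueError instead of failing later inside a numpy operation.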

Use xarray to store scenario output

Example:

>>> from scipy import stats
>>> import xarray as xr
>>> out = xr.DataArray(
...     data=stats.norm.rvs(size=(100, 30*12+1, 4), random_state=42),
...     coords=[range(100), range(361), ['SPX', 'RTY', 'NDX', 'EST']],
...     dims=['scenario', 'step', 'index']
... )
>>> idx = dict(scenario=0, index='SPX')
>>> out.loc[idx]
<xarray.DataArray (step: 361)>
array([ 0.496714, -0.234153, -0.469474, ...,  0.860473, -0.248691,  0.662881])
Coordinates:
    scenario  int32 0
  * step      (step) int32 0 1 2 3 4 5 6 7 8 ... 353 354 355 356 357 358 359 360
    index     <U3 'SPX'

Base class for DiffusionProcess

Most stochastic models will probably fall under a base class we could call DiffusionProcess. Would provide shared functionality for any stochastic model, such as Vasicek, CIR, Hull-White, etc.

class DiffusionProcess:
    """Base class for stochastic simulation models"""

Open question: will there need to be a distinction between models that have a defined transition density vs. ones that don't?

mypy type standardization

In #30 I've noticed that the generic Union[float, np.ndarray] is becoming less accurate. While most functions should be able to accept list-like or float inputs, I think all functions should probably return numpy arrays as a default.

Skewness feature?

I found this package when asking Cursor about ESG models that can generate stock returns with customized skewness. It responded:

from pyesg import ESG

params = {
    'n_scenarios': 1000,
    'n_years': 10,
    'dt': 1/12,
    'r0': 0.02,
    'vol': 0.01,
    'mean_reversion': 0.1,
    'lambda_': 0.1,
    'sigma': 0.01,
    'rho': 0.5,
    'skew': -0.5
}
esg = ESG('GBM', params)

It looks like some version of this package could generate GBM with customized skewness, but I could not find this in the current version.

Is there any solution?

Handle time-varying model coefficients

For example, in the Hull-White model.

input validation for StochasticProcess

StochasticProcess should accept x0 in many types: float, int, list[int], list[float], and np.ndarray. Add a method that handles all of these and returns an np.ndarray, so the methods can operate uniformly on numpy arrays and return numpy arrays as outputs.
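
One way the conversion could be sketched. The name `as_float_array` is hypothetical, and the choice to always return at least a 1-D float64 array is an assumption rather than the library's behavior.

```python
import numpy as np


def as_float_array(value):
    """Hypothetical helper: coerce scalars and sequences to a float ndarray (at least 1-D)."""
    if isinstance(value, (int, float, list, tuple, np.ndarray)):
        # np.asarray avoids a copy when value is already a float64 ndarray
        return np.atleast_1d(np.asarray(value, dtype=np.float64))
    raise TypeError(f"unsupported type for x0: {type(value).__name__}")


scalar = as_float_array(100)          # int scalar -> shape (1,)
vector = as_float_array([1, 2.5, 3])  # mixed list -> shape (3,)
```

Rejecting unknown types explicitly keeps the error close to the call site, so downstream methods can rely on receiving an ndarray.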

Accidental hard coded dt value

The dt argument in self.standard_deviation here is set to a fixed 1.0 rather than the variable dt. Simple fix. We should also add comments explaining exactly what this einsum does.

pyesg/pyesg/stochastic_process.py

Line 149 in 6310b3a

dx = np.einsum("ab,acb->ac", rvs, self.standard_deviation(x0, 1.0))
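
To illustrate what the subscripts "ab,acb->ac" compute, here is a small standalone sketch. The array shapes are inferred from the subscripts alone (a = scenario, b = random shock, c = output dimension) and are an assumption about how pyesg uses them.

```python
import numpy as np

rng = np.random.default_rng(0)
n_scenarios, dim = 5, 3

rvs = rng.standard_normal((n_scenarios, dim))       # axes "ab": scenario a, shock b
std = rng.standard_normal((n_scenarios, dim, dim))  # axes "acb": scenario a, output c, shock b

# for each scenario a: dx[a, c] = sum over b of rvs[a, b] * std[a, c, b]
dx = np.einsum("ab,acb->ac", rvs, std)

# the same contraction written as a batched matrix-vector product
dx_matmul = (std @ rvs[:, :, None])[:, :, 0]

print(np.allclose(dx, dx_matmul))
```

In other words, each scenario's vector of shocks is multiplied by that scenario's own standard-deviation matrix.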

Built-in plotting

Hey,

By 'built-in plotting', do you mean simply calling a method that plots the results obtained from a stochastic process (such as Geometric Brownian Motion)?

I have conducted extensive outlier analysis of stochastic simulations in life insurance, and I could imagine that, for instance, automatically plotting the mean, quantiles, and some other key information (or moments) would make the library more useful.

What do you think? I am new to contributing and I would love to be able to add something useful to this package.

Kind regards,
Andreas
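
A rough sketch of the summary statistics such a plot could be built on, leaving the plotting backend open. The helper `summarize_paths` and the (n_scenarios, n_steps) layout are assumptions for illustration only.

```python
import numpy as np


def summarize_paths(paths, quantiles=(0.05, 0.5, 0.95)):
    """Hypothetical helper: per-step mean and quantiles across scenarios.

    paths is assumed to have shape (n_scenarios, n_steps).
    """
    paths = np.asarray(paths, dtype=float)
    return {
        "mean": paths.mean(axis=0),
        # result has shape (len(quantiles), n_steps)
        "quantiles": np.quantile(paths, quantiles, axis=0),
    }


# toy random-walk paths standing in for simulated scenarios
rng = np.random.default_rng(42)
paths = 100 + rng.standard_normal((1000, 61)).cumsum(axis=1)
summary = summarize_paths(paths)
```

Each row of `summary["quantiles"]` could then be drawn as one line, or the outer rows used as a shaded band around the mean.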

doc for GBM

One question,

pyesg/pyesg/processes/geometric_brownian_motion.py

Line 10 in c364ebb

Geometric Brownian Motion: dX = X*exp((μ - δ - (1/2)*σ**2)dt + σdW)

It looks like you are trying to apply something like eq. (7) of http://www.frouah.com/finance%20notes/Euler%20and%20Milstein%20Discretization.pdf?
If so, the GBM formula (that you are using in the code) may be something like:
Geometric Brownian Motion: X_{k+1} = X_k*exp((μ - δ - (1/2)*σ**2)dt + σdW)?
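
For reference, the recursive form can be sketched as follows. This is a standalone illustration, not pyesg's implementation: it omits the δ term and assumes dW = sqrt(dt) * Z with Z standard normal.

```python
import numpy as np


def gbm_step(x, mu, sigma, dt, z):
    """One GBM step: X_{k+1} = X_k * exp((mu - sigma**2 / 2) * dt + sigma * sqrt(dt) * z)."""
    return x * np.exp((mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z)


rng = np.random.default_rng(123)
z = rng.standard_normal(200_000)
x1 = gbm_step(100.0, mu=0.05, sigma=0.2, dt=1.0, z=z)

# the sample mean should be close to the theoretical mean x0 * exp(mu * dt)
print(x1.mean(), 100.0 * np.exp(0.05))
```

The -(1/2)*σ**2 correction is what makes the expected value grow at rate μ rather than μ + σ**2/2.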

And do you have any suggested documentation for the Euler scheme you are using (like what is the apply function, etc)?

Thanks!

Results not being produced

Hi Jason - this is a newbie question. I'm trying to run the following code in Jupyter notebook but I'm not getting any output...it just sits there like I never pressed run.

import pyesg

model = pyesg.GeometricBrownianMotion(mu = 0.05, sigma = 0.2)

x0 = 100
dt = 1
n_scenarios = 1000
n_steps = 60
random_state = 123

results = model.scenarios(x0, dt, n_scenarios, n_steps, random_state)

I've tried other code to test there's nothing else wrong but I don't think there is. For example, when I run...

import pyesg

mu = 0.05
sigma = 0.2
x0 = 100
dt = 10

pyesg.GeometricBrownianMotion(mu, sigma).expectation(x0, dt)

...I get:
array([134.98588076])

Can you help?!

Wijdan

Error in 2019 US3M Treasuries

I noticed a decimal shift error in 2019 3-month US treasuries within ust_historical_csv. Updated these 12 values and added 2020 UST.
ust_historical.zip

Parallelize stochastic process

Currently the StochasticProcess classes process a single timestep at once. They should be able to parallelize these steps across any number of scenarios, so we can run N scenarios very quickly.

To implement this, we should find a way to accept an array of start values and process each of their steps simultaneously. This is similar to the way scipy.stats distributions can handle multidimensional arrays to produce distribution metrics.
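
A minimal sketch of the idea, using an Euler step of a Vasicek-style process as a stand-in. The function `simulate_vasicek` and its signature are hypothetical. Note that this is vectorization across scenarios with numpy, not multi-process parallelism.

```python
import numpy as np


def simulate_vasicek(x0, k, theta, sigma, dt, n_steps, random_state=None):
    """Hypothetical sketch: Euler steps of dX = k*(theta - X)dt + sigma*dW for all scenarios at once."""
    rng = np.random.default_rng(random_state)
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))  # one start value per scenario
    paths = np.empty((x0.shape[0], n_steps + 1))
    paths[:, 0] = x0
    for step in range(n_steps):
        z = rng.standard_normal(x0.shape[0])  # one shock per scenario
        x = paths[:, step]
        # the update is applied to every scenario in a single array operation
        paths[:, step + 1] = x + k * (theta - x) * dt + sigma * np.sqrt(dt) * z
    return paths


paths = simulate_vasicek(
    np.full(1000, 0.02), k=0.1, theta=0.04, sigma=0.01, dt=1 / 12, n_steps=60, random_state=0
)
```

Only the loop over time steps remains in Python; the work across the N scenarios happens inside numpy.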

README API/examples are out of date

Hey,

Is it possible that the PyPI version does not match the repo? The README.md example doesn't work, and the installed package (0.1.1) looks different.

Add coefficient validation method to each model

Populate this validation method for each model, checking

coefficient types
coefficient shapes (if arrays)
coefficient value ranges (e.g. positive-only, or relational constraints as in the CIR model.)
matrix properties (e.g. for correlation matrices)

Then ensure that _validate_coefs is called at the appropriate time, before other methods (e.g. sample) can be called.

pyesg/pyesg/diffusion_process.py

Lines 92 to 94 in ddde7ec

    def _validate_coefs(self) -> None:
        """Validates shape, type, and ranges of coefficients for the process"""
        raise NotImplementedError()
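
A sketch of what two such validators might check. Both function names are hypothetical. Reading the CIR "relational constraint" as the Feller condition 2*k*theta >= sigma**2 is an assumption, since the issue does not name the constraint.

```python
import numpy as np


def validate_cir_coefs(k, theta, sigma):
    """Hypothetical validator: types, positivity, and the Feller condition for CIR."""
    for name, value in {"k": k, "theta": theta, "sigma": sigma}.items():
        if not isinstance(value, (int, float)):
            raise TypeError(f"{name} must be a number, got {type(value).__name__}")
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value}")
    # assumed relational constraint: Feller condition keeps the process positive
    if 2 * k * theta < sigma**2:
        raise ValueError("Feller condition violated: 2*k*theta < sigma**2")


def validate_correlation(matrix):
    """Hypothetical validator: square, symmetric, unit diagonal, positive semi-definite."""
    m = np.asarray(matrix, dtype=float)
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        raise ValueError(f"correlation matrix must be square, got shape {m.shape}")
    if not np.allclose(m, m.T):
        raise ValueError("correlation matrix must be symmetric")
    if not np.allclose(np.diag(m), 1.0):
        raise ValueError("correlation matrix must have ones on the diagonal")
    if np.linalg.eigvalsh(m).min() < -1e-10:
        raise ValueError("correlation matrix must be positive semi-definite")
    return m


validate_cir_coefs(k=0.1, theta=0.04, sigma=0.05)
corr = validate_correlation([[1.0, 0.5], [0.5, 1.0]])
```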

Use Hypothesis library for better tests

Improve test coverage by using the hypothesis library.

Handle two-factor models

For example, the two-factor Vasicek model or the two-factor Hull-White model.

Add fit method to each model

Allow stochastic models to be fit with historical data. This seems like a BIG task, but important.

Ideas that may be relevant:

use MLE where appropriate, or possible
use the "Exact Algorithm"
add a _transition_density property to each model, where it exists

pyesg/pyesg/diffusion_process.py

Lines 96 to 113 in ddde7ec

    def fit(self, X: np.ndarray, y: np.ndarray):
        """
        Fits the parameters of the diffusion process based on historical data.

        The exact method of fitting should be defined at the subclass level, because the
        implementation can vary depending on the model.

        Parameters
        ----------
        X : np.ndarray, the indices of times/dates of the observed prices
        y : np.ndarray, the observed prices or values on the given dates. If multiple
            indices, then y will be a matrix, where the columns are the indices.

        Returns
        -------
        self
        """
        raise NotImplementedError()
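
As an illustration of the MLE idea for one model where the transition density is known in closed form, here is a sketch for geometric Brownian motion. The standalone function `fit_gbm` is hypothetical and assumes a single index observed at evenly spaced times.

```python
import numpy as np


def fit_gbm(X, y):
    """Hypothetical sketch: MLE of (mu, sigma) for GBM from evenly spaced observations."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    dt = np.diff(X)
    if not np.allclose(dt, dt[0]):
        raise ValueError("this sketch assumes evenly spaced observation times")
    # log returns are i.i.d. normal with mean (mu - sigma**2/2)*dt and variance sigma**2*dt
    log_returns = np.diff(np.log(y))
    sigma = np.sqrt(log_returns.var() / dt[0])  # var() uses ddof=0, the MLE
    mu = log_returns.mean() / dt[0] + 0.5 * sigma**2
    return mu, sigma


# simulate a GBM path with known parameters (mu=0.05, sigma=0.2) and recover them
rng = np.random.default_rng(0)
dt, n = 1 / 252, 50_000
times = np.arange(n + 1) * dt
increments = (0.05 - 0.5 * 0.2**2) * dt + 0.2 * np.sqrt(dt) * rng.standard_normal(n)
prices = 100 * np.exp(np.concatenate([[0.0], increments.cumsum()]))
mu_hat, sigma_hat = fit_gbm(times, prices)
```

The volatility estimate converges quickly as the number of observations grows, while the drift estimate depends mainly on the total time span and remains noisy even for long histories.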

Handle multiple indices for sampling

Extend the current implementation of a DiffusionProcess to be able to simulate multiple indices (e.g. S&P 500, Russell 2000, etc.).

This probably means the signature of the sampling algorithm might need to change to include an extra parameter: n_indices.
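
One common way to drive several indices at once is to draw correlated normal shocks through a Cholesky factor of the correlation matrix. The sketch below assumes that approach. The helper name and the example correlation of 0.8 are made up for illustration.

```python
import numpy as np


def correlated_normals(correlation, n_scenarios, random_state=None):
    """Hypothetical helper: standard normal shocks with the given correlation matrix."""
    rng = np.random.default_rng(random_state)
    corr = np.asarray(correlation, dtype=float)
    chol = np.linalg.cholesky(corr)  # lower-triangular, corr = chol @ chol.T
    z = rng.standard_normal((n_scenarios, corr.shape[0]))  # independent shocks
    return z @ chol.T  # each row now has covariance corr


corr = np.array([[1.0, 0.8], [0.8, 1.0]])  # e.g. two equity indices
shocks = correlated_normals(corr, n_scenarios=100_000, random_state=1)
print(np.corrcoef(shocks.T)[0, 1])
```

With this approach the number of indices follows from the shape of the correlation matrix, so a separate n_indices argument may not be strictly necessary.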

Model.scenarios not producing output

Hi Jason - this is a newbie question. I'm trying to run the following code in Jupyter notebook but I'm not getting any output...it just sits there like I never pressed run.

import pyesg

model = pyesg.GeometricBrownianMotion(mu = 0.05, sigma = 0.2)

x0 = 100
dt = 1
n_scenarios = 1000
n_steps = 60
random_state = 123

results = model.scenarios(x0, dt, n_scenarios, n_steps, random_state)

I've tried other code to test there's nothing else wrong but I don't think there is. For example, when I run...

import pyesg

mu = 0.05
sigma = 0.2
x0 = 100
dt = 10

pyesg.GeometricBrownianMotion(mu, sigma).expectation(x0, dt)

...I get:
array([134.98588076])

Can you help?! I'm quite new to Python so I'm pretty sure I'm just making an obvious mistake somewhere.

Wijdan

Incorrect application of min/max in AcademyRateProcess

pyesg/pyesg/processes/academy_rate_process.py

Lines 185 to 186 in 312c144

    out[0] = min(self.long_rate_max - x0[0], out[0])
    out[0] = max(self.long_rate_min - x0[0], out[0])

I think this is incorrect. out[0] here is the change in log value of x0[0], but then we apply the min and max as nominal rates. This doesn't feel right to me - should be corrected, and tests added to confirm the floors and caps work as expected.
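
A sketch of one possible correction, under the assumption that the state is the nominal long rate and that the increment is a change in its logarithm. The helper name and the example floor and cap values are hypothetical, not taken from the AcademyRateProcess code.

```python
import numpy as np


def clamp_log_change(d_log, rate, rate_min, rate_max):
    """Hypothetical helper: limit a change in log(rate) so the new nominal
    rate stays within [rate_min, rate_max]."""
    lower = np.log(rate_min / rate)  # largest allowed fall, in log space
    upper = np.log(rate_max / rate)  # largest allowed rise, in log space
    return min(max(d_log, lower), upper)


# a large upward shock from 17% is capped so the new rate lands on the 20% cap
capped = 0.17 * np.exp(clamp_log_change(0.5, rate=0.17, rate_min=0.01, rate_max=0.20))
# a large downward shock from 2% is floored at 1%
floored = 0.02 * np.exp(clamp_log_change(-2.0, rate=0.02, rate_min=0.01, rate_max=0.20))
```

The point is that the bounds and the increment are expressed in the same units (log space) before min and max are applied.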

Tutorial?

Hey,

would it potentially make sense to include a more thorough tutorial on how to make use of the existing stochastic processes that you have already defined?

Kind regards,
Andreas

Add plotting method to DiffusionProcess

Would be nice to be able to plot scenarios for a given model.

to_array should be able to convert tuple to numpy array

pyesg/pyesg/utils.py

Line 34 in 44cfded

def to_array(value: Array) -> np.ndarray:

It would probably make sense to be able to convert tuples to np.arrays as well.

Is the dataset value correct ?

The 3-month value seems to be 100x too large?

Implement Cox-Ingersoll-Ross model

Cox-Ingersoll-Ross

import numpy as np

def cir(r, k, theta, sigma, dt):
    # Euler step of dr = k*(theta - r)*dt + sigma*sqrt(r)*dW, with dW = sqrt(dt)*Z
    return r + k*(theta - r)*dt + sigma*r**0.5*dt**0.5*np.random.randn()

I tried to generate scenarios using GeometricBrownianMotion: Throws an Error

Hi Jason,
I am Shiva, and I am working on scenario generation for solar PV generation forecasts, to address the uncertainty between actual and predicted values.

I tried to use "pyesg" - Geometric Brownian Motion to generate scenarios.

I am using the Google Colab environment for my research work, and I installed the package using "!pip install pyesg".

I tried your example in my Google Colab space and it throws an error. Please advise me on how to solve the issue.

I have attached a screenshot below.
