aangelopoulos / conformal-time-series

Conformal prediction for time-series applications.

License: MIT License

conformal-time-series's Introduction

Conformal PID Control for Time-Series Prediction

This repository is about producing prediction sets for time series.

The methods here are guaranteed to achieve coverage for any sequence, possibly an adversarial one. We take a control-systems view of this task and introduce a method called Conformal PID Control.

Several methods are implemented herein, including online quantile regression (quantile tracking/P control), adaptive conformal prediction, and more!

This codebase makes it easy to extend the methods/add new datasets. We will describe how to do so below.

Getting Started

To reproduce the experiments in our paper, clone this repo and run the following code from the root directory.

conda create --name pid
# Activate the new environment so the following commands run inside it.
conda activate pid
pip install -r requirements.txt
cd tests
bash run_tests.sh
bash make_plots.sh

The one exception is the COVID experiment. For that experiment, you must first run the Jupyter notebook at conformal-time-series/tests/datasets/covid-ts-proc/statewide/death-forecasting-perstate-lasso-qr.ipynb. It requires the deaths.csv data file, which you can download from this Drive link.

Adding New Methods

Step 1: Defining the method.

The core/methods.py file contains all methods. Consider the following method header, for the P controller/quantile tracker, as an example:

def quantile(
    scores,
    alpha,
    lr,
    ahead,
    proportional_lr=True,
    *args,
    **kwargs
):

The first three arguments, scores, alpha, and lr, are required arguments for all methods. The first argument, scores, expects a numpy array of conformal scores. The second argument, alpha, is the desired miscoverage. Finally, the third argument, lr, is the learning rate. (In our paper, this is $\eta$, and in the language of control, this is $K_p$.)

The rest of the arguments listed are required arguments specific to the given method. The argument ahead determines how many steps ahead the prediction is made --- for example, if ahead=4, that means we are making 4-step-ahead predictions (one step is defined by the resolution of the input array scores). The function of *args and **kwargs is to allow methods to take arguments given in a dictionary form.

All methods should return a dictionary of results that includes the method name and the sequence of $q_{t}$. In the quantile example case, the dictionary should look like the following, where qs is a numpy array of quantiles the same length as scores:

results = {"method": "Quantile", "q": qs}

Methods that follow this formatting will be able to be processed automatically by our testing infrastructure.
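As a rough illustration of the interface described above, here is a minimal sketch of a method that follows it. The function name toy_quantile_tracker and the label "ToyQuantile" are hypothetical. The body uses a simplified quantile-tracking update, $q_{t+1} = q_t + \eta(\mathrm{err}_t - \alpha)$, and is not the repository's quantile implementation; in particular, it ignores ahead and proportional_lr.

```python
import numpy as np


def toy_quantile_tracker(scores, alpha, lr, *args, **kwargs):
    """Hypothetical example method (an assumption, not the repo's `quantile`)."""
    T = len(scores)
    qs = np.zeros(T)  # q_0 = 0 is an arbitrary starting threshold
    for t in range(T - 1):
        # err_t = 1 if the score at time t exceeded the current threshold q_t.
        err = float(scores[t] > qs[t])
        # Raise the threshold after a miss, lower it slightly after a cover.
        qs[t + 1] = qs[t] + lr * (err - alpha)
    # Same dictionary shape as described above: a method name and the q sequence.
    return {"method": "ToyQuantile", "q": qs}


results = toy_quantile_tracker(np.ones(5), alpha=0.1, lr=0.5)
```

The returned "q" array has the same length as scores, which is the property the README asks for.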

Step 2: Edit the config to include your method.

TL;DR: go to each config in tests/configs, and add a line under methods: for each method you want to run, along with the learning rates to test. The example below, from tests/configs/AMZN.yaml, asks the testing suite to run the quantile tracker on the Amazon stock price dataset with five different learning rates.

  Quantile:
    lrs:
      - 1
      - 0.5
      - 0.1
      - 0.05
      - 0

As background, this is part of our little testing infrastructure for online conformal. The infrastructure spawns a parallel process for every dataset, making it efficient to test one method on all datasets with only one command (the command to run the tests is bash run_tests.sh, and to plot the results is bash make_plots.sh).

The infrastructure works like this.

The user defines a file in tests/configs/ describing an experiment, i.e., a dataset name and a combination of methods and settings for each method to run.
The script tests/run_tests.sh calls tests/base_test.py on every .yaml file in the tests/configs directory.
The script tests/make_plots.sh calls inset_plot.py and base_plots.py.

Step 3: Edit base_test.py to include your method.

The code in line 5 of base_test.py imports all the methods --- import yours as well. Then add your method to the big if/else block starting on line 103.
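To make the relationship between the config and the if/else block concrete, here is a small sketch of config-driven dispatch. Everything in it is an assumption made for illustration: toy_tracker, run_methods, and the nested-dictionary layout standing in for a parsed config file are hypothetical, and the real base_test.py may be organized differently.

```python
import numpy as np


def toy_tracker(scores, alpha, lr, *args, **kwargs):
    """Hypothetical method following the required (scores, alpha, lr) header."""
    qs = np.zeros(len(scores))
    for t in range(len(scores) - 1):
        err = float(scores[t] > qs[t])
        qs[t + 1] = qs[t] + lr * (err - alpha)
    return {"method": "ToyTracker", "q": qs}


# Stand-in for a parsed config: one entry per method under "methods",
# each with the list of learning rates to test (layout is assumed).
config = {"methods": {"ToyTracker": {"lrs": [1, 0.5, 0.1, 0.05, 0]}}}


def run_methods(config, scores, alpha):
    """Run every (method, learning rate) pair listed in the config."""
    results = {}
    for name, settings in config["methods"].items():
        for lr in settings["lrs"]:
            # A new method needs its own branch here, mirroring the
            # if/else block that Step 3 asks you to extend.
            if name == "ToyTracker":
                out = toy_tracker(scores, alpha, lr)
            else:
                raise ValueError(f"unknown method: {name}")
            results[(name, lr)] = out["q"]
    return results


scores = np.abs(np.random.default_rng(0).normal(size=200))
results = run_methods(config, scores, alpha=0.1)
```

A method that is listed in the config but has no branch fails loudly here, which is one way to catch a forgotten Step 3.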

Adding New Datasets

Step 1: Load and preprocess the dataset.

First, download your dataset and put it in tests/datasets. Then, edit the tests/datasets.py file to add a name for your dataset and some processing code for it. The dataset should be a pandas dataframe with a valid datetime index (it has to be evenly spaced, and correctly formatted with no invalid values), and at least one column simply titled y. This column represents the target value.

Optionally, you can include a column titled forecasts or scorecasts; the infrastructure will then use these forecasts/scorecasts instead of the ones it would have produced on its own. This is useful if you have defined a good forecaster/scorecaster outside our framework, and you just want to use our code to run conformal on top of that. Extra columns can be used to add information for more complex forecasting/scorecasting strategies.
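The dataset requirements above can be checked before wiring a new dataset into tests/datasets.py. The sketch below builds a synthetic dataframe in the expected shape and validates it. The helper check_dataset is hypothetical and not part of the repository; it only encodes the requirements stated in this section (datetime index, even spacing, no invalid values, a column titled y).

```python
import numpy as np
import pandas as pd


def check_dataset(df):
    """Hypothetical sanity check for the dataframe format described above."""
    if not isinstance(df.index, pd.DatetimeIndex):
        raise ValueError("index must be a pandas DatetimeIndex")
    if df.index.hasnans or not df.index.is_unique:
        raise ValueError("index contains missing or duplicate timestamps")
    if not df.index.is_monotonic_increasing:
        raise ValueError("index must be sorted in increasing order")
    # Evenly spaced: every gap between consecutive timestamps is identical.
    gaps = df.index.to_series().diff().dropna()
    if gaps.nunique() > 1:
        raise ValueError("index is not evenly spaced")
    if "y" not in df.columns:
        raise ValueError("dataframe needs a target column titled 'y'")
    if df["y"].isna().any():
        raise ValueError("column 'y' contains invalid values")
    return True


# Synthetic daily series standing in for a real dataset.
index = pd.date_range("2020-01-01", periods=100, freq="D")
df = pd.DataFrame({"y": np.sin(np.arange(100) / 10.0)}, index=index)
check_dataset(df)
```

Dropping a single row from df creates an uneven gap, so check_dataset raises a ValueError in that case.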

Step 2: Create a config file for the dataset.

As mentioned above, a config file should be made for each dataset, describing what methods should be run with what parameters. The example of tests/configs/AMZN.yaml can be followed.

After executing these two steps, you should be able to run python base_test.py configs/your_dataset.yaml and the results will be computed! Alternatively, you can just execute the bash scripts.

Workarounds for Known Bugs

On M1/M2 Mac, in order to use Prophet, follow the instructions at this link: https://github.com/facebook/prophet/issues/2250.

conformal-time-series's Issues

ARIMA models specifically hitting with LU decomposition error using darts package!

Hi aangelopoulos,
I was testing my dataset, which includes exogenous variables, with this framework. With the ARIMA model from the darts package specifically, I hit an LU decomposition error partway through generating forecasts, even when using a univariate target variable. The rest of the darts forecasting models produce forecasts without problems.

Error code snippet:

Generating forecasts...
C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\statsmodels\base\model.py:607: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
warnings.warn("Maximum Likelihood optimization failed to "
19%|█████████████████████████▏ | 243/1266 [04:13<17:48, 1.04s/it]
Traceback (most recent call last):
File "C:\Users\aksho\anaconda3\conformal-time-series\tests\base_test.py", line 86, in <module>
data['forecasts'] = generate_forecasts(data, **args['sequences'][0])
File "C:\Users\aksho\anaconda3\conformal-time-series\tests\..\core\model_scores.py", line 90, in generate_forecasts
model_forecasts = model.historical_forecasts(y,future_covariates=exog,forecast_horizon=fit_every, retrain=retrain, verbose=True).values()[:,0].squeeze()
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\darts\utils\utils.py", line 143, in sanitized_method
return method_to_sanitize(self, *only_args.values(), **only_kwargs)
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\darts\models\forecasting\forecasting_model.py", line 1051, in historical_forecasts
model._fit_wrapper(
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\darts\models\forecasting\forecasting_model.py", line 386, in _fit_wrapper
self.fit(series=series, **add_kwargs, **kwargs)
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\darts\models\forecasting\forecasting_model.py", line 2788, in fit
return self._fit(series, future_covariates=future_covariates)
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\darts\models\forecasting\arima.py", line 167, in _fit
self.model = m.fit()
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\statsmodels\tsa\arima\model.py", line 395, in fit
res = super().fit(
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\statsmodels\tsa\statespace\mlemodel.py", line 703, in fit
mlefit = super().fit(start_params, method=method,
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\statsmodels\base\model.py", line 566, in fit
xopt, retvals, optim_settings = optimizer._fit(f, score, start_params,
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\statsmodels\base\optimizer.py", line 243, in _fit
xopt, retvals = func(objective, gradient, start_params, fargs, kwargs,
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\statsmodels\base\optimizer.py", line 660, in _fit_lbfgs
retvals = optimize.fmin_l_bfgs_b(func, start_params, maxiter=maxiter,
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\scipy\optimize\_lbfgsb_py.py", line 237, in fmin_l_bfgs_b
res = _minimize_lbfgsb(fun, x0, args=args, jac=jac, bounds=bounds,
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\scipy\optimize\_lbfgsb_py.py", line 407, in _minimize_lbfgsb
f, g = func_and_grad(x)
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\scipy\optimize\_differentiable_functions.py", line 296, in fun_and_grad
self._update_fun()
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\scipy\optimize\_differentiable_functions.py", line 262, in _update_fun
self._update_fun_impl()
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\scipy\optimize\_differentiable_functions.py", line 163, in update_fun
self.f = fun_wrapped(self.x)
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\scipy\optimize\_differentiable_functions.py", line 145, in fun_wrapped
fx = fun(np.copy(x), *args)
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\statsmodels\base\model.py", line 534, in f
return -self.loglike(params, *args) / nobs
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\statsmodels\tsa\statespace\mlemodel.py", line 938, in loglike
loglike = self.ssm.loglike(complex_step=complex_step, **kwargs)
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\statsmodels\tsa\statespace\kalman_filter.py", line 1001, in loglike
kfilter = self._filter(**kwargs)
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\statsmodels\tsa\statespace\kalman_filter.py", line 921, in _filter
self._initialize_state(prefix=prefix, complex_step=complex_step)
File "C:\Users\aksho\anaconda3\envs\qls\lib\site-packages\statsmodels\tsa\statespace\representation.py", line 1058, in _initialize_state
self._statespaces[prefix].initialize(self.initialization,
File "statsmodels\tsa\statespace\_representation.pyx", line 1373, in statsmodels.tsa.statespace._representation.dStatespace.initialize
File "statsmodels\tsa\statespace\_representation.pyx", line 1362, in statsmodels.tsa.statespace._representation.dStatespace.initialize
File "statsmodels\tsa\statespace\_initialization.pyx", line 288, in statsmodels.tsa.statespace._initialization.dInitialization.initialize
File "statsmodels\tsa\statespace\_initialization.pyx", line 406, in statsmodels.tsa.statespace._initialization.dInitialization.initialize_stationary_stationary_cov
File "statsmodels\tsa\statespace\_tools.pyx", line 1548, in statsmodels.tsa.statespace._tools._dsolve_discrete_lyapunov
numpy.linalg.LinAlgError: LU decomposition error.

Does this issue look familiar to you? Do you know why it happens specifically with ARIMA? I hit it consistently with the same model config using darts.models.forecasting.ARIMA.

It would be great if you have come across a solution, as I am concurrently trying to figure out the potential cause of this error.

ImportError: cannot import name 'Tensor' from 'torch' (unknown location)

Hi Aangelopoulos,
I was trying to reproduce your code at https://github.com/aangelopoulos/conformal-time-series, as suggested in the paper, following the steps you gave:
To reproduce the experiments in our paper, clone this repo and run the following code from the root directory.

conda create --name pid
pip install -r requirements.txt
cd tests
bash run_tests.sh
bash make_plots.sh

While running bash run_tests.sh, I encountered the following import error:

Error code snippet:

ImportError Traceback (most recent call last)
Cell In[106], line 7
5 from core import standard_weighted_quantile, trailing_window, aci, aci_clipped, quantile, quantile_integrator_log, quantile_integrator_log_scorecaster
6 from core.synthetic_scores import generate_scores
----> 7 from core.model_scores import generate_forecasts
8 from datasets import load_dataset
9 from darts import TimeSeries

File ~\Downloads\conformal-time-series\tests\..\core\model_scores.py:10
8 from darts.models.forecasting.arima import ARIMA
9 from darts.models.forecasting.theta import Theta
---> 10 from darts.models.forecasting.transformer_model import TransformerModel
11 from darts import TimeSeries
12 import pdb

File ~\anaconda3\lib\site-packages\darts\models\forecasting\transformer_model.py:13
10 import torch.nn as nn
12 from darts.logging import get_logger, raise_if, raise_if_not, raise_log
---> 13 from darts.models.components import glu_variants, layer_norm_variants
14 from darts.models.components.glu_variants import GLU_FFN
15 from darts.models.components.transformer import (
16 CustomFeedForwardDecoderLayer,
17 CustomFeedForwardEncoderLayer,
18 )

File ~\anaconda3\lib\site-packages\darts\models\__init__.py:32
29 from darts.models.utils import NotImportedModule
31 try:
---> 32 from darts.models.forecasting.block_rnn_model import BlockRNNModel
33 from darts.models.forecasting.dlinear import DLinearModel
34 from darts.models.forecasting.global_baseline_models import (
35 GlobalNaiveAggregate,
36 GlobalNaiveDrift,
37 GlobalNaiveSeasonal,
38 )

File ~\anaconda3\lib\site-packages\darts\models\forecasting\block_rnn_model.py:14
11 import torch.nn as nn
13 from darts.logging import get_logger, raise_log
---> 14 from darts.models.forecasting.pl_forecasting_module import (
15 PLPastCovariatesModule,
16 io_processor,
17 )
18 from darts.models.forecasting.torch_forecasting_model import PastCovariatesTorchModel
20 logger = get_logger(__name__)

File ~\anaconda3\lib\site-packages\darts\models\forecasting\pl_forecasting_module.py:9
6 from functools import wraps
7 from typing import Any, Dict, Optional, Sequence, Tuple, Union
----> 9 import pytorch_lightning as pl
10 import torch
11 import torch.nn as nn

File ~\anaconda3\lib\site-packages\pytorch_lightning\__init__.py:25
22 _logger.addHandler(logging.StreamHandler())
23 _logger.propagate = False
---> 25 from lightning_fabric.utilities.seed import seed_everything # noqa: E402
26 from lightning_fabric.utilities.warnings import disable_possible_user_warnings # noqa: E402
27 from pytorch_lightning.callbacks import Callback # noqa: E402

File ~\anaconda3\lib\site-packages\lightning_fabric\__init__.py:30
24 # In PyTorch 2.0+, setting this variable will force torch.cuda.is_available() and torch.cuda.device_count()
25 # to use an NVML-based implementation that doesn't poison forks.
26 # pytorch/pytorch#83973
27 os.environ["PYTORCH_NVML_BASED_CUDA_CHECK"] = "1"
---> 30 from lightning_fabric.fabric import Fabric # noqa: E402
31 from lightning_fabric.utilities.seed import seed_everything # noqa: E402
32 from lightning_fabric.utilities.warnings import disable_possible_user_warnings # noqa: E402

File ~\anaconda3\lib\site-packages\lightning_fabric\fabric.py:39
37 from lightning_utilities.core.apply_func import apply_to_collection
38 from lightning_utilities.core.overrides import is_overridden
---> 39 from torch import Tensor
40 from torch.optim import Optimizer
41 from torch.utils.data import BatchSampler, DataLoader, DistributedSampler, RandomSampler, SequentialSampler

ImportError: cannot import name 'Tensor' from 'torch' (unknown location)

I looked into similar issues on GitHub but did not find a useful workaround. Could you let me know what the potential cause might be in my case? I also checked that the installed version is the latest one:

conda list torch

packages in environment at C:\Users\aksho\anaconda3:

I tried downgrading PyTorch to see whether that would resolve the problem, but the issue persists. Could you check this and let me know whether it is an open issue?

Kind regards,
Akshobhya.

ModuleNotFoundError: No module named 'prophet'

When I followed the "Getting Started" instructions ("To reproduce the experiments in our paper, clone this repository and run the following code from the root directory."), I encountered the following error.

Traceback:
File "/home/conformal-time-series-main/tests/../core/model_scores.py", line 7, in <module>
from darts.models.forecasting.prophet_model import Prophet
File "/home/python3.8/site-packages/darts/models/forecasting/prophet_model.py", line 12, in <module>
import prophet
ModuleNotFoundError: No module named 'prophet'

What is the problem?

CSV file is missing some data

The cases CSV file is missing some data, such as 'forecast_0.1', 'forecast_0.5', and 'forecast_0.9'.