Comments (13)

IvanUkhov commented on June 15, 2024

I ended up writing a custom function for summarizing inferences, as I wanted medians and HDIs instead of means and quantile intervals. I will leave it here in case it is helpful at some point in the future:

from typing import Tuple

import numpy as np
import pandas as pd
import tensorflow_probability as tfp


def _summarize_original(
    data_before: pd.DataFrame,
    data_after: pd.DataFrame,
    posterior: tfp.distributions.Distribution,
    predictive: tfp.distributions.Distribution,
    standardization: Tuple[float, float],
    alpha: float = 0.05,
    draw_count: int = 1000,
    random_state: int = 42,
) -> pd.DataFrame:
    from causalimpact.inferences import build_cum_index
    from causalimpact.misc import maybe_unstandardize

    def _hdi(samples: np.ndarray) -> np.ndarray:
        from arviz.stats import hdi
        # Column-wise HDIs; transposing yields a (lower, upper) pair of rows.
        return np.array([hdi(column, hdi_prob=1 - alpha) for column in samples.T]).T

    # Sample trajectories for the pre and post periods and map them back to
    # the original scale.
    y_before = predictive.sample(draw_count, seed=random_state)
    y_before = maybe_unstandardize(np.squeeze(y_before.numpy()), standardization)
    y_after = posterior.sample(draw_count, seed=random_state)
    y_after = maybe_unstandardize(np.squeeze(y_after.numpy()), standardization)

    # Medians and HDIs per time step; the *_means names are kept so that the
    # output matches the columns the package itself produces.
    pre_preds_means = np.median(y_before, axis=0)
    pre_preds_lower, pre_preds_upper = _hdi(y_before)
    pre_preds_means = pd.Series(pre_preds_means, index=data_before.index)
    pre_preds_lower = pd.Series(pre_preds_lower, index=data_before.index)
    pre_preds_upper = pd.Series(pre_preds_upper, index=data_before.index)

    post_preds_means = np.median(y_after, axis=0)
    post_preds_lower, post_preds_upper = _hdi(y_after)
    post_preds_means = pd.Series(post_preds_means, index=data_after.index)
    post_preds_lower = pd.Series(post_preds_lower, index=data_after.index)
    post_preds_upper = pd.Series(post_preds_upper, index=data_after.index)

    complete_preds_means = pd.concat([pre_preds_means, post_preds_means])
    complete_preds_lower = pd.concat([pre_preds_lower, post_preds_lower])
    complete_preds_upper = pd.concat([pre_preds_upper, post_preds_upper])

    # Pointwise effects are observed minus predicted; the subtraction flips
    # the interval bounds.
    data = pd.concat([data_before, data_after])
    point_effects_means = data.iloc[:, 0] - complete_preds_means
    point_effects_upper = data.iloc[:, 0] - complete_preds_lower
    point_effects_lower = data.iloc[:, 0] - complete_preds_upper

    # Cumulative effects over the post period, prefixed with a zero so that
    # the series starts at the intervention point.
    z_after = np.cumsum(data_after.iloc[:, 0].values - y_after, axis=1)
    post_cum_effects_means = np.median(z_after, axis=0)
    post_cum_effects_lower, post_cum_effects_upper = _hdi(z_after)
    index = build_cum_index(data_before.index, data_after.index)
    post_cum_effects_lower = pd.Series(
        np.concatenate([[0], post_cum_effects_lower]),
        index=index,
    )
    post_cum_effects_means = pd.Series(
        np.concatenate([[0], post_cum_effects_means]),
        index=index,
    )
    post_cum_effects_upper = pd.Series(
        np.concatenate([[0], post_cum_effects_upper]),
        index=index,
    )

    data = dict(
        complete_preds_means=complete_preds_means,
        complete_preds_lower=complete_preds_lower,
        complete_preds_upper=complete_preds_upper,
        point_effects_means=point_effects_means,
        point_effects_lower=point_effects_lower,
        point_effects_upper=point_effects_upper,
        post_cum_effects_means=post_cum_effects_means,
        post_cum_effects_lower=post_cum_effects_lower,
        post_cum_effects_upper=post_cum_effects_upper,
    )
    return pd.DataFrame(data)

WillianFuks commented on June 15, 2024

Hi @IvanUkhov ,

The intervals computed in this package, like those in the original R one, are already credible intervals. In fact, the original paper even compares confidence intervals with its credible ones on page 265 (or page 19 of this PDF). Just before the last paragraph, it says: "The latter is expected, given that our intervals represent prediction intervals, not confidence intervals...".

What probably happened is that throughout the code I wrote "confidence" instead of "credible", which is really a typo. The tensorflow_probability code used in this package also returns credible intervals.

One point to consider, though, is that in this package I computed the lower and upper pointwise predictions using an approach similar to the one described in the TFP comment referenced above, i.e., using z-scores with the means and stddevs, whereas the original R package does so by selecting the proper percentiles from the posterior trajectory samples.
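To make the difference concrete, here is a minimal sketch (my own, not the package's code) of the two approaches applied to posterior trajectory samples of shape (draws, time steps); they agree when the draws are Gaussian and drift apart otherwise:

import numpy as np
import scipy.stats as st

alpha = 0.05
rng = np.random.default_rng(0)
samples = rng.normal(loc=1.0, scale=2.0, size=(1000, 50))  # draws x time

# z-score approach (this package): mean +/- z * stddev per time step.
z = st.norm.ppf(1 - alpha / 2)
mean, std = samples.mean(axis=0), samples.std(axis=0)
lower_z, upper_z = mean - z * std, mean + z * std

# Percentile approach (original R package): empirical quantiles per step.
lower_q, upper_q = np.percentile(
    samples, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0
)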

I'll see what happens if I do the same (I'm not expecting the results to change much, but it's probably worth trying).

As for mean vs. median: since the model is built upon Kalman filters, the posterior at each point t follows a normal distribution, so both would probably give the same results, IIUC.

IvanUkhov commented on June 15, 2024

Thank you for the extended answer! Yes, I was referring to the logic based on a critical value and a standard deviation, which gives the right interval only if the underlying distribution is Gaussian. There is a lot of Gaussianity everywhere, but the priors are non-Gaussian, which should lead to non-Gaussian posteriors in general. I suppose that, if fitted via variational inference, the resulting posterior is indeed a product of independent Gaussians, but with MCMC, that should not be the case. It seems to be an unnecessary assumption.

Please do not get me wrong. I am by no means criticizing. It is a great package, and I simply wanted to discuss to see if there is anything that can be made even better.

WillianFuks commented on June 15, 2024

Hi @IvanUkhov ,

Please feel free to ask as many questions as you like :)!

I think I better understand now what you mean. Notice that regardless of which distribution we choose for the initial state model (in fact, the default model of this package uses an inverse gamma for the local-level component), the probability of y' (the prediction of the observed time series) will always follow a normal distribution.

This is as per the implementation of the linear Gaussian state-space model, which holds for both the Python and R packages. You can see that in TFP the probability of the state at time t is already a normal, given samples taken from the chosen priors.

So extracting credible intervals either with z-scores or with percentiles can be considered a safe operation, as it's guaranteed they'll be applied to Gaussians, regardless of the posterior of the state model we choose.
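A minimal sketch of that claim (parameter values are mine, purely illustrative): conditional on fixed parameters, a local-level model is a linear Gaussian state-space model, so every per-time-step marginal of the observed series is exactly Gaussian.

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

local_level = tfd.LinearGaussianStateSpaceModel(
    num_timesteps=30,
    transition_matrix=tf.constant([[1.0]]),   # mu_{t+1} = mu_t + eta_t
    transition_noise=tfd.MultivariateNormalDiag(scale_diag=[0.1]),
    observation_matrix=tf.constant([[1.0]]),  # y_t = mu_t + eps_t
    observation_noise=tfd.MultivariateNormalDiag(scale_diag=[0.5]),
    initial_state_prior=tfd.MultivariateNormalDiag(scale_diag=[1.0]),
)

# Each draw is a trajectory of shape (30, 1); the marginal at every time
# step is Gaussian, so z-score and percentile intervals coincide here.
trajectories = local_level.sample(1000, seed=42)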

WillianFuks commented on June 15, 2024

Hi @IvanUkhov ,

The newest version of tfcausalimpact is already publicly available. If you run into any issues, please let me know :)

IvanUkhov commented on June 15, 2024

Hello, thank you for the response! I was hoping to get more answers before I share this, but my reasoning is as follows:

tensorflow/probability#1252

What are your thoughts? Would you agree that the predictive distribution is non-Gaussian?

WillianFuks commented on June 15, 2024

That's an interesting question; I'm not quite sure now what the answer is.

AFAIK, the predictive posterior follows something like the following (considering local level as an example):

[image: equations for the local-level predictive posterior]

(These equations were taken from Wei Yi's Medium post.)
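For readers without the image, the standard local-level equations (reconstructed from memory, so the notation may differ from the original figure) are:

% Local level model, conditional on the parameters \theta = (\sigma_\varepsilon, \sigma_\eta):
y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \sigma_\varepsilon^2)
\mu_{t+1} = \mu_t + \eta_t, \qquad \eta_t \sim N(0, \sigma_\eta^2)
% Kalman filtering then gives a Gaussian one-step-ahead predictive:
p(y_{t+1} \mid y_{1:t}, \theta) = N(\hat{\mu}_t, \hat{\sigma}_t^2)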

So the predictive posterior should follow a normal derived from the posterior of the computed states. But then, looking at the tfp.sts.forecast code as you mentioned, there is indeed a MixtureSameFamily being returned, which takes as input a state-space model already fitted with the posterior samples of the parameters. In the comments they state something like "build a batch over state space models (...)", so I'm not sure if this is some technique to improve precision (just guessing).

I'm also awaiting (and hoping) the TFP team will give us some help here :)

IvanUkhov commented on June 15, 2024

The above formula is for one particular assignment of the model parameters; it is simply the Kalman filter. But from the inference, we get a full posterior over the parameters in the form of samples. For each sample, we get the above expression, and the final distribution is a mixture of these Gaussians. Recall what distinguishes structural time series from Bayesian structural time series: one extra component, Bayesian model averaging.
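A small sketch of that point (values invented for illustration): mixing the per-draw Gaussian predictives with TFP's MixtureSameFamily gives a heavier-tailed, non-Gaussian distribution whenever the parameter draws differ.

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# Hypothetical posterior draws of the observation scale (e.g., from MCMC),
# spread widely on purpose; each draw defines one Gaussian predictive.
scales = tf.constant([0.5, 1.0, 3.0])

predictive = tfd.MixtureSameFamily(
    mixture_distribution=tfd.Categorical(probs=tf.fill([3], 1.0 / 3.0)),
    components_distribution=tfd.Normal(loc=tf.zeros_like(scales), scale=scales),
)

samples = predictive.sample(100_000, seed=42).numpy()
# Excess kurtosis of a Gaussian is 0; a scale mixture of Gaussians is
# heavier-tailed, so this prints a clearly positive number.
kurt = ((samples - samples.mean()) ** 4).mean() / samples.var() ** 2 - 3
print(f"excess kurtosis: {kurt:.2f}")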

But again, this is just my interpretation. Yes, let us hope someone will answer in tensorflow/probability#1252 🙂

WillianFuks commented on June 15, 2024

Maybe steve-the-bayesian can also help with this. He's the original creator of the R bsts package.

IvanUkhov commented on June 15, 2024

I have also asked on Stack Exchange:

https://stats.stackexchange.com/questions/512666/on-the-predictive-distribution-of-a-bayesian-structural-time-series-model

IvanUkhov commented on June 15, 2024

We have got an answer from the author of forecast:

tensorflow/probability#1252

The conclusion is that the predictive distribution is non-Gaussian in general.

WillianFuks commented on June 15, 2024

Nice! Now it makes sense why they have the MixtureSameFamily.

I don't quite understand why they create the state-space model twice, but this is probably related to the inner workings of their implementation.

IvanUkhov commented on June 15, 2024

I'll close this, then. I think switching to quantile intervals was a good move. Next time around, one could consider HDIs, but it would probably not make much of a difference unless very skewed distributions are expected.
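For the record, a quick sketch (mine, with an arbitrary skewed example) of when the two differ: on a lognormal, the HDI is shorter and shifted toward the mode, while on a symmetric distribution the two intervals nearly coincide.

import numpy as np
from arviz.stats import hdi

rng = np.random.default_rng(42)
skewed = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)

quantile_interval = np.percentile(skewed, [2.5, 97.5])
hdi_interval = hdi(skewed, hdi_prob=0.95)
print("quantile:", quantile_interval)  # wide, pushed into the right tail
print("hdi:     ", hdi_interval)       # shorter, shifted toward the mode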

Thank you, and sorry for all the noise here 🙂
