
wildboottest's Introduction

wildboottest


wildboottest implements multiple fast wild cluster bootstrap algorithms as developed in Roodman et al. (2019) and MacKinnon, Nielsen & Webb (2022).

It has similar, but more limited, functionality than Stata's boottest, R's fwildclusterboot, or Julia's WildBootTests.jl.

At the moment, wildboottest only computes wild cluster bootstrapped p-values, not confidence intervals.

Other features that are currently not supported:

  • The subcluster bootstrap (MacKinnon and Webb 2018).
  • Confidence intervals formed by inverting the test and iteratively searching for bounds.
  • Multiway clustering.

Direct support for statsmodels and linearmodels is work in progress.

If you'd like to collaborate, either send us an email or comment in the issues section!

Installation

You can install wildboottest from PyPI by running

pip install wildboottest

Example

import pandas as pd
import statsmodels.formula.api as sm
from wildboottest.wildboottest import wildboottest

df = pd.read_csv("https://raw.github.com/vincentarelbundock/Rdatasets/master/csv/sandwich/PetersenCL.csv")
model = sm.ols(formula='y ~ x', data=df)

wildboottest(model, param = "x", cluster = df.firm, B = 9999, bootstrap_type = '11')
# | param   |   statistic |   p-value |
# |:--------|------------:|----------:|
# | x       |      20.453 |     0.000 |

wildboottest(model, param = "x", cluster = df.firm, B = 9999, bootstrap_type = '31')
# | param   |   statistic |   p-value |
# |:--------|------------:|----------:|
# | x       |      30.993 |     0.000 |

# bootstrap inference for all coefficients
wildboottest(model, cluster = df.firm, B = 9999, bootstrap_type = '31')
# | param     |   statistic |   p-value |
# |:----------|------------:|----------:|
# | Intercept |       0.443 |     0.655 |
# | x         |      20.453 |     0.000 |

# non-clustered wild bootstrap inference
wildboottest(model, B = 9999, bootstrap_type = '11')
# | param     |   statistic |   p-value |
# |:----------|------------:|----------:|
# | Intercept |       1.047 |     0.295 |
# | x         |      36.448 |     0.000 |

wildboottest's People

Contributors

s3alfisc, amichuda, dependabot[bot]


wildboottest's Issues

Small sample corrections

  • For the x1 bootstraps, I will implement a ssc correction of $(N-1) / (N-k)$.
  • For the x3 bootstraps, $(G-1) / G$

Both are more or less the standard in the literature. For compatibility with choices made via statsmodels / linearmodels, add the option to overwrite these default values.

Note that for p-values, the choice of small sample correction has no impact: the ssc's cancel out when computing p-values by counting cases of ssc x t_stat < ssc x t_boot. They only affect the internally computed and reported non-bootstrapped test statistic.
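A quick illustration of the cancellation (all numbers hypothetical, the bootstrapped statistics replaced by random draws): scaling both sides of the counting comparison by the same positive correction leaves the p-value unchanged.

import numpy as np

rng = np.random.default_rng(0)
t_stat = 2.1
t_boot = rng.standard_normal(9999)   # stand-in for bootstrapped t-statistics

N, k = 500, 4
ssc = (N - 1) / (N - k)              # correction for the x1 bootstraps

p_plain = np.mean(np.abs(t_boot) >= np.abs(t_stat))
p_scaled = np.mean(np.abs(ssc * t_boot) >= np.abs(ssc * t_stat))
assert p_plain == p_scaled           # the positive ssc cancels on both sides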

wildboottest without clusters

Should we allow for no cluster input, in which case the procedure turns into a regular (heteroskedasticity-robust) wild bootstrap? Or should there be an error?
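If we go the first route, here is a minimal sketch of the fallback: a non-clustered wild bootstrap with Rademacher weights, without imposing the null (WCU-style). The helper name is hypothetical, and a classical variance is used for brevity where a robust one would be the natural choice in practice.

import numpy as np

def wild_bootstrap_pvalue(X, y, coef_idx, B=999, seed=42):
    """Non-clustered wild bootstrap p-value for one coefficient (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    def t_of(y_star, center):
        # classical variance for brevity; use a robust variance in practice
        b = XtX_inv @ X.T @ y_star
        u = y_star - X @ b
        se = np.sqrt(u @ u / (n - k) * XtX_inv[coef_idx, coef_idx])
        return (b[coef_idx] - center) / se

    t_stat = t_of(y, 0.0)
    t_boot = np.array([
        # Rademacher weights per observation; bootstrap stats centered at beta_hat
        t_of(X @ beta + resid * rng.choice([-1.0, 1.0], size=n), beta[coef_idx])
        for _ in range(B)
    ])
    return np.mean(np.abs(t_boot) >= np.abs(t_stat))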

[Bug] Fixed seed leads to different results

Currently, running wildboottest() twice under the same seed leads to different inferences. This is of course terrible for reproducibility and we should fix it.

Example:

import pandas as pd
import statsmodels.formula.api as sm
from wildboottest.wildboottest import wildboottest
import numpy as np

df = pd.read_csv("https://raw.github.com/vincentarelbundock/Rdatasets/master/csv/sandwich/PetersenCL.csv")
df['treat'] = np.random.choice([0, 1], size=df.shape[0], replace=True)
model = sm.ols(formula='y ~ treat', data=df)

wildboottest(model, param = "treat", cluster = df.year, B= 999, seed = 12)
# | param   |   statistic |   p-value |
# |:--------|------------:|----------:|
# | treat   |       1.129 |     0.255 |

wildboottest(model, param = "treat", cluster = df.year, B= 999, seed = 12)
# | param   |   statistic |   p-value |
# |:--------|------------:|----------:|
# | treat   |       1.129 |     0.289 |
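A likely culprit is shared random-number state that is advanced between calls. A minimal sketch of the intended behaviour, assuming the bootstrap weights are drawn from a NumPy Generator (function name hypothetical, not the current wildboottest internals):

import numpy as np

def draw_bootstrap_weights(n, B, seed=None):
    rng = np.random.default_rng(seed)   # fresh Generator per call: same seed -> same weights
    return rng.choice([-1.0, 1.0], size=(n, B))

w1 = draw_bootstrap_weights(10, 999, seed=12)
w2 = draw_bootstrap_weights(10, 999, seed=12)
assert (w1 == w2).all()                 # identical draws, hence identical inference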

Error in bootstrap algo for '11' types?

In a first test of Python vs R, it looks like the '31' bootstrap t-statistics match exactly under full enumeration (as they should), while the '11' do not:

See here

I suppose that the R version is correct, as it's tested against WildBootTests.jl and produces matching statistics.

>>> df (Python)
          WCR11         WCR31     WCU11     WCU31
0 -5.569437e-07  7.119511e-16 -0.547569  0.079819
1 -3.286834e+00 -3.286832e+00 -4.243206 -3.085644
2  1.715372e-02  1.715392e-02 -0.166832  0.053646
3 -2.881170e+00 -2.881171e+00 -1.780821 -3.045997
4  2.881170e+00  2.881171e+00  1.780821  3.045997
5 -1.715372e-02 -1.715392e-02  0.166832 -0.053646
6  3.286834e+00  3.286832e+00  4.243206  3.085644
7  5.569437e-07 -7.119511e-16  0.547569 -0.079819
>>> r_df (R)
   Unnamed: 0     WCR11         WCR31     WCU11     WCU31
0           1 -0.620824 -1.017073e-16 -0.547213  0.079819
1           2 -3.852912 -3.286832e+00 -4.198954 -3.085644
2           3 -0.196266  1.715392e-02 -0.171808  0.053646
3           4 -1.630894 -2.881171e+00 -1.782077 -3.045997
4           5  1.630894  2.881171e+00  1.782077  3.045997
5           6  0.196266 -1.715392e-02  0.171808 -0.053646
6           7  3.852912  3.286832e+00  4.198954  3.085644
7           8  0.620824  1.017073e-16  0.547213 -0.079819

Should we just assume pandas dataframes as input?

A lot of the data inputs in the boot_algo3 function in R assume a dataframe. Should we do the same and assume pandas DataFrames in the Python version? We can then turn them into numpy arrays inside the function for performance.
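If so, a minimal sketch of the boundary conversion (helper name hypothetical): accept pandas objects at the API surface, compute on NumPy internally.

import numpy as np
import pandas as pd

def as_matrix(data):
    """Accept pandas objects at the API boundary; convert once for performance."""
    if isinstance(data, (pd.DataFrame, pd.Series)):
        return data.to_numpy()
    return np.asarray(data)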

Fix Documentation

  • The example for the Wildboottest class does not run because R has the wrong dimension.
  • Some minor points in the docs became more visible after creating the mkdocs site.

`self.k` should be defined by `X` not `R`

@s3alfisc

Since self.k is used for creating matrices and such for X, should we define self.k = self.X.shape[1] rather than self.k = self.R.shape[0], and then raise an error if self.k != self.R.shape[0]?

I think this makes more sense, since the user can make a mistake in defining R which might propagate through the code and show up in places not related to conducting the statistical test.

Do you concur?
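A minimal sketch of the proposed check (helper name hypothetical):

import numpy as np

def infer_k(X: np.ndarray, R: np.ndarray) -> int:
    """Derive k from X and validate that R is conformable."""
    k = X.shape[1]
    if R.shape[0] != k:
        raise ValueError(f"R has length {R.shape[0]}, but X has {k} columns.")
    return k

# inside the class: self.k = infer_k(self.X, self.R)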

Repo Cleanup

  • Delete files that are not required, e.g.
  • ... any other files that are not required?
  • Check that package metadata is accurate etc.
  • ...

Allow scalar tests of multiple coefficients

E.g. allow tests of $R'\beta = r$, with $R$ a vector of length $K$ and $r$ a scalar (see the sketch after this list).

  • start with the heteroskedastic bootstrap (as it is less work)
  • continue with the wild cluster bootstrap
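For the heteroskedastic case, the non-bootstrapped statistic could look like this (hypothetical helper; an HC0 variance is assumed purely for illustration):

import numpy as np

def t_stat_scalar_restriction(X, y, R, r):
    """t-statistic for the scalar restriction R'beta = r."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    meat = X.T @ (X * u[:, None] ** 2)   # sum_i u_i^2 * x_i x_i'
    vcov = XtX_inv @ meat @ XtX_inv      # HC0 sandwich estimator
    return (R @ beta - r) / np.sqrt(R @ vcov @ R)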

API discussions

Some thoughts regarding a statsmodels API

Our point of departure is the following:

model.fit(
    cov_type='wildclusterbootstrap',
    cov_kwds={
        'cluster': cluster,
        'B': 9999,
        'weights_type': 'rademacher',
        'impose_null': True,
        'bootstrap_type': '11',
        'seed': None,
        'param': 'X1',
    }
).summary()

The internal function which is called, wildboottest(), returns a pvalue for a bootstrapped t-test of the null hypothesis $X_1 = 0$ vs $X_1 \neq 0$. Further, users can impose the null hypothesis $X_1 = 0$ on the bootstrap data generating process via the impose_null argument.

Consequently, we can report a full "regression table" as below by looping over all parameters [x1, ..., x10], imposing the null hypothesis for each hypothesis on the bootstrap dgp. For this use case, it would be nice if we could allow a 'param' value of e.g. 'ALL', which would loop 10 times over the internal wildboottest(), imposing the null whenever impose_null = True (as sketched below).
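A hypothetical sketch of what param='ALL' could do internally. It assumes wildboottest() returns a pandas DataFrame and accepts an impose_null keyword, which may differ from the actual signature:

import pandas as pd
from wildboottest.wildboottest import wildboottest

def wildboottest_all(model, cluster, B=9999, impose_null=True, **kwargs):
    rows = [
        wildboottest(model, param=p, cluster=cluster, B=B,
                     impose_null=impose_null, **kwargs)
        for p in model.exog_names  # loop over all parameters, re-imposing the null each time
    ]
    return pd.concat(rows, ignore_index=True)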

Second, if a user chooses to test only one parameter, e.g. X1, it would be great to simply output the column for X1. I suppose that most of the time, researchers are not interested in inference on the full set of regression coefficients, so computing all of them would be wasteful. This is how fwildclusterboot, boottest and WildBootTests.jl all operate, but not what statsmodels does. Maybe we should ask the statsmodels maintainers what they think about this?

Last, once the bootstrapped vcov matrix is available, we can compute an F-statistic, but we can no longer impose the null hypothesis on the bootstrap dgp, as we would use only one bootstrapped vcov matrix for all k coefficient tests. Consequently, specifying impose_null = True needs to raise an error. With a bootstrapped vcov, inference (both p-values and confidence intervals) can be based either on bootstrapped t-statistics (percentile-t, strongly preferable and what currently happens) or on asymptotic approximations (i.e. the t(G-1) distribution). In short, with a full vcov matrix we should be able to support all features of a regular statsmodels regression table, including standard errors, at the cost of no longer being able to impose the null hypothesis on the wild cluster bootstrap dgp.

All in all, providing a bootstrapped vcov and se's leads to deviations from fwildclusterboot, boottest and WildBootTests.jl, which does not necessarily make me happy. 😄

Still, I lean towards having both, but we need to make a few decisions, as sketched above.

I hope this is all understandable - let me know what you think @amichuda ! =)

                                 OLS Regression Results                                
=======================================================================================
Dep. Variable:                      y   R-squared (uncentered):                   0.737
Model:                            OLS   Adj. R-squared (uncentered):              0.735
Method:                 Least Squares   F-statistic:                              278.0
Date:                Mon, 17 Oct 2022   Prob (F-statistic):                   2.58e-279
Time:                        18:35:01   Log-Likelihood:                         -1769.6
No. Observations:                1000   AIC:                                      3559.
Df Residuals:                     990   BIC:                                      3608.
Df Model:                          10                                                  
Covariance Type:            nonrobust                                                  
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1            -0.0457      NaN    -1.016      0.310      -0.134       0.043
x2            -1.1441      NaN    -24.947      0.000      -1.234      -1.054
x3             0.3851      NaN      8.243      0.000       0.293       0.477
x4            -0.6970      NaN    -15.284      0.000      -0.786      -0.607
x5            -0.5754      NaN    -12.380      0.000      -0.667      -0.484
x6             0.0367      NaN      0.790      0.430      -0.054       0.128
x7             0.2766      NaN      5.957      0.000       0.185       0.368
x8            -1.1516      NaN    -25.925      0.000      -1.239      -1.064
x9             0.5564      NaN     12.022      0.000       0.466       0.647
x10           -1.2981      NaN    -28.412      0.000      -1.388      -1.208
==============================================================================
Omnibus:                        0.445   Durbin-Watson:                   0.955
Prob(Omnibus):                  0.800   Jarque-Bera (JB):                0.327
Skew:                          -0.007   Prob(JB):                        0.849
Kurtosis:                       3.087   Cond. No.                         1.18
==============================================================================

Notes:
[1] R² is computed without centering (uncentered) since the model does not contain a constant.
[2] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[3] Standard errors not available for wildclusterbootstrap

Implement Confidence Intervals (by test inversion)

  • easy for the WCU: simply take quantiles of the bootstrapped t-statistics; more work is needed for the WCR
  • allow hypothesis tests with different values of r as in $R \beta = r$
  • implement bisection algorithm

Overall, follow the exposition in MacKinnon (2022, Econometrics) and adjust for the new bootstrap types.
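A minimal sketch of the bisection step for one confidence bound: find r where the bootstrap p-value for H0: beta = r crosses alpha. Here pval is a hypothetical callable that re-runs the bootstrap test at a given r; inside must lie in the confidence set (p > alpha) and outside must not.

def invert_test(pval, inside, outside, alpha=0.05, tol=1e-6, max_iter=100):
    """Bisection for one bound of the test-inversion confidence interval."""
    for _ in range(max_iter):
        mid = 0.5 * (inside + outside)
        if pval(mid) > alpha:
            inside = mid      # mid still inside the confidence set: move outward
        else:
            outside = mid     # mid rejected: tighten from the outside
        if abs(outside - inside) < tol:
            break
    return 0.5 * (inside + outside)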

Implement Unit Tests

For features in version 0.1:

Test the bootstrap algos against R & Julia. Strategy: create data and run the algorithm in R, Julia and Python; save all scripts; hard-code the test values (or save them in a file) for the actual tests.
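A pytest-style sketch of one such test; py_t_boot and r_t_boot stand for the bootstrapped t-statistics produced by the Python and R scripts and loaded from the saved files (names hypothetical):

import numpy as np

def test_full_enumeration_matches_r(py_t_boot, r_t_boot):
    # under full enumeration the statistics should agree exactly, up to the
    # ordering of the sign combinations and floating point noise
    np.testing.assert_allclose(np.sort(py_t_boot), np.sort(r_t_boot), atol=1e-8)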

  • Potentially set up CI via github actions (?)
  • Potentially set up codecov (?)

Implement the following tests, each for different weights, bootstrap types, whether the null is imposed on the dgp, etc.

External Tests:

  • Using full enumeration, are bootstrapped t-statistics exactly identical between R, Julia, Python when using the same small sample corrections?
  • When full_enumeration = False, are bootstrapped p-values almost identical? Are non-bootstrapped t-stats exactly identical?

Internal Tests:
