py-econometrics / wildboottest
A Python module for wild cluster bootstrapping.
Home Page: https://py-econometrics.github.io/wildboottest/
License: MIT License
E.g. allow tests for R'beta = r, with R a vector of length K and r a scalar.
See the discussion here.
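For illustration, a t-statistic for a single linear restriction R'beta = r could be computed as below. This is a hedged sketch: `linear_restriction_tstat` and all inputs are hypothetical names for illustration, not part of the package.

```python
import numpy as np

def linear_restriction_tstat(beta_hat, vcov, R, r):
    """t-statistic for the null R'beta = r, where R is a length-K
    vector and r a scalar (hypothetical helper, not the package API)."""
    R = np.asarray(R, dtype=float)
    num = R @ beta_hat - r          # estimated restriction minus its null value
    denom = np.sqrt(R @ vcov @ R)   # standard error of R'beta under vcov
    return num / denom

# Example: test beta_2 - beta_3 = 0 in a 3-coefficient model
beta_hat = np.array([0.5, 1.2, 1.0])
vcov = np.diag([0.04, 0.09, 0.01])
t = linear_restriction_tstat(beta_hat, vcov, R=[0.0, 1.0, -1.0], r=0.0)
```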
`return np.nan` — right now we don't run an F-statistic for the full model, but the statsmodels table has one. Setting it to `np.nan` for now.
The arguments in the function reference `N_G_bootcluster`, but the function uses two arguments, `bootcluster` and `N_G`. Was the underscore a typo?
Rename the `Wildboottest` method to `Wildboottest_CR` and add `Wildboottest_HR`.
Make the `bootcluster` argument optional, setting it to `'cluster'` by default.
What do you think about publishing, so that statsmodels can add it to its dependencies?
Discuss design choices. Our point of departure is the following:
```python
model.fit(
    cov_type='wildclusterbootstrap',
    cov_kwds={
        'cluster': cluster,
        'B': 9999,
        'weights_type': 'rademacher',
        'impose_null': True,
        'bootstrap_type': '11',
        'seed': None,
        'param': 'X1',
    },
).summary()
```
The internal function that is called, `wildboottest()`, returns a p-value for a bootstrapped t-test, with the null hypothesis imposed on the bootstrap dgp (or not) according to the `impose_null` argument.
In consequence, we can report a full "regression table" as below by looping over all parameters `[x1, ..., x10]`, imposing the null hypothesis on the bootstrap dgp separately for each hypothesis. For this use case, it would be nice if we could allow a `param` value of e.g. `'ALL'`, which would loop 10 times over the internal `wildboottest()`, imposing the null if `impose_null = True`, or not otherwise.
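The `'ALL'` dispatch could look roughly like this — a minimal sketch in which `run_single_test` is a hypothetical stand-in for the internal single-parameter `wildboottest()` call, and `exog_names` follows statsmodels' attribute naming:

```python
import pandas as pd

def run_single_test(param):
    """Stand-in for one internal wildboottest() call (hypothetical)."""
    return {"param": param, "statistic": 0.0, "p-value": 1.0}

def run_tests(param, exog_names):
    # 'ALL' loops the single-parameter bootstrap over every regressor,
    # imposing the null separately for each hypothesis; otherwise we
    # test only the requested parameter.
    params = exog_names if param == "ALL" else [param]
    return pd.DataFrame([run_single_test(p) for p in params])

table = run_tests("ALL", exog_names=[f"x{i}" for i in range(1, 11)])
```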
Second, if a user chooses to test only one parameter, e.g. `X1`, it would be great if we could simply output the column for `X1`. I suppose that most of the time, researchers are not interested in inference on the full set of regression coefficients, so computing all of them would be wasteful. This is how fwildclusterboot, boottest and WildBootTests.jl all operate, but not what statsmodels does. Maybe we should ask the statsmodels maintainers what they think about this?
Last, once the bootstrapped vcov matrix is available, we can compute an F-statistic, but we can no longer impose the null hypothesis on the bootstrap dgp, as we would use only one bootstrapped vcov matrix for all k coefficient tests. In consequence, specifying `impose_null = True` needs to lead to an error. With a bootstrapped vcov, inference (both p-values and confidence intervals) can then be based on bootstrapped t-statistics (percentile-t, strongly preferable and what currently happens) or asymptotic approximations (i.e. the t(G-1) distribution). In short, with a full vcov matrix, we should be able to support all features of a regular statsmodels regression table, including standard errors, at the cost of no longer being able to impose the null hypothesis on the wild cluster bootstrap dgp.
All in all, providing a bootstrapped vcov and standard errors leads to deviations from fwildclusterboot, boottest and WildBootTests.jl, which does not necessarily make me happy. 😄
Still, I lean towards having both, but we need to make a few decisions as sketched above.
I hope this is all understandable - let me know what you think @amichuda ! =)
```
                             OLS Regression Results
=======================================================================================
Dep. Variable:              y          R-squared (uncentered):            0.737
Model:                      OLS        Adj. R-squared (uncentered):       0.735
Method:                     Least Squares   F-statistic:                  278.0
Date:                       Mon, 17 Oct 2022  Prob (F-statistic):         2.58e-279
Time:                       18:35:01   Log-Likelihood:                    -1769.6
No. Observations:           1000       AIC:                               3559.
Df Residuals:               990        BIC:                               3608.
Df Model:                   10
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1           -0.0457        NaN     -1.016      0.310      -0.134       0.043
x2           -1.1441        NaN    -24.947      0.000      -1.234      -1.054
x3            0.3851        NaN      8.243      0.000       0.293       0.477
x4           -0.6970        NaN    -15.284      0.000      -0.786      -0.607
x5           -0.5754        NaN    -12.380      0.000      -0.667      -0.484
x6            0.0367        NaN      0.790      0.430      -0.054       0.128
x7            0.2766        NaN      5.957      0.000       0.185       0.368
x8           -1.1516        NaN    -25.925      0.000      -1.239      -1.064
x9            0.5564        NaN     12.022      0.000       0.466       0.647
x10          -1.2981        NaN    -28.412      0.000      -1.388      -1.208
==============================================================================
Omnibus:             0.445    Durbin-Watson:        0.955
Prob(Omnibus):       0.800    Jarque-Bera (JB):     0.327
Skew:               -0.007    Prob(JB):             0.849
Kurtosis:            3.087    Cond. No.             1.18
==============================================================================

Notes:
[1] R² is computed without centering (uncentered) since the model does not contain a constant.
[2] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[3] Standard errors not available for wildclusterbootstrap
```
Currently, there are dimension errors in matrix multiplications, and likely other errors.
This code here uses a `bootcluster` variable that comes from the original R code, via a pre-processed object. According to the software website, it "allows the user to specify subclusters via the bootcluster argument". Can you explain that?
For examples, see WildBootTests.jl and fwildclusterboot's `boot_algo_julia.R` function.
Since `self.k` is used for creating matrices and such for `X`, should we redefine `self.k = self.X.shape[1]`, not `self.k = self.R.shape[0]`, and then raise an error if `self.k != self.R.shape[0]`?
I think this makes more sense, since the user can make a mistake in defining R, which might propagate through the code and show up in places not related to conducting the statistical test.
Do you concur?
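A minimal sketch of the proposed check; the class and attribute names are illustrative stand-ins, not the package's actual code:

```python
import numpy as np

class WildboottestSketch:
    """Hypothetical sketch of the proposed validation at construction time."""
    def __init__(self, X, R):
        self.X = np.asarray(X)
        self.R = np.asarray(R)
        # k is derived from the design matrix, not from R ...
        self.k = self.X.shape[1]
        # ... so a mis-specified R fails fast instead of propagating.
        if self.R.shape[0] != self.k:
            raise ValueError(
                f"R has length {self.R.shape[0]}, but X has {self.k} columns."
            )

X = np.ones((20, 3))
ok = WildboottestSketch(X, R=np.array([1.0, 0.0, 0.0]))  # valid
try:
    WildboottestSketch(X, R=np.array([1.0, 0.0]))        # wrong length
    raised = False
except ValueError:
    raised = True
```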
Is `fe` supposed to be a long variable indicating membership in a particular entity/group (like hhids), or are they already supposed to be dummy variables?
... else it is rather slow.
For features in version 0.1:
Test the bootstrap algos against R & Julia. Strategy: create data and run the algo in R, Julia and Python; save all scripts; hard-code test values (or save them in a file) for the actual tests.
Implement the following tests - each for different weights, bootstrap types, null imposed on the dgp, etc.
External tests: with `full_enumeration = False`, are bootstrapped p-values almost identical? Are non-bootstrapped t-stats exactly identical?
Internal tests: as soon as fwildclusterboot allows these variants (currently in the dev branch).
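The cross-implementation tests could be structured roughly like this, pytest style. The reference values below are placeholders standing in for hard-coded R/Julia output, not real results:

```python
import numpy as np

# Placeholder for values produced once in R/Julia and hard-coded here.
EXPECTED_FROM_R = {"statistic": 1.129, "p-value": 0.255}

def assert_matches_reference(result, expected, p_tol=0.01):
    # Non-bootstrapped t-stats should match exactly (up to float noise);
    # bootstrapped p-values only approximately, since they are random
    # unless full enumeration is used.
    assert np.isclose(result["statistic"], expected["statistic"], atol=1e-8)
    assert abs(result["p-value"] - expected["p-value"]) < p_tol

# In a real test, `result` would come from wildboottest() on the same data.
result = {"statistic": 1.129, "p-value": 0.255}
assert_matches_reference(result, EXPECTED_FROM_R)
```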
#TODO: Is this just a permutation function?
Add CI support.
`a = np.array([-1, 1]) * (np.sqrt(5) + np.array([-1, 1])) / 2`  # TODO: Should this divide the whole expression by 2, or just the second part?
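To answer the TODO: in Python, `*` and `/` have equal precedence and associate left-to-right, so the whole product is divided by 2, which yields the two points of the Mammen two-point weight distribution, (1 - √5)/2 and (1 + √5)/2:

```python
import numpy as np

# '*' and '/' have equal precedence and associate left-to-right,
# so this divides the whole product by 2 ...
a = np.array([-1, 1]) * (np.sqrt(5) + np.array([-1, 1])) / 2
# ... exactly as if written with explicit parentheses:
b = (np.array([-1, 1]) * (np.sqrt(5) + np.array([-1, 1]))) / 2
# Both give the Mammen points (1 - sqrt(5))/2 ~ -0.618 and (1 + sqrt(5))/2 ~ 1.618.
```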
Currently, running `wildboottest()` twice under the same seed leads to different inferences. This is of course terrible for reproducibility, and we should fix it.
Example:
```python
import pandas as pd
import statsmodels.formula.api as sm
from wildboottest.wildboottest import wildboottest
import numpy as np

df = pd.read_csv("https://raw.github.com/vincentarelbundock/Rdatasets/master/csv/sandwich/PetersenCL.csv")
df['treat'] = np.random.choice([0, 1], df.shape[0], True)
model = sm.ols(formula='y ~ treat', data=df)

wildboottest(model, param="treat", cluster=df.year, B=999, seed=12)
# | param   |   statistic |   p-value |
# |:--------|------------:|----------:|
# | treat   |       1.129 |     0.255 |

wildboottest(model, param="treat", cluster=df.year, B=999, seed=12)
# | param   |   statistic |   p-value |
# |:--------|------------:|----------:|
# | treat   |       1.129 |     0.289 |
```
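One plausible fix (a sketch, not the package's current implementation) is to construct a single `numpy.random.Generator` from the seed at the start of every call and draw all bootstrap weights from it, so that repeated calls with the same seed reproduce the same draws:

```python
import numpy as np

def draw_rademacher_weights(n_clusters, B, seed=None):
    """Draw all bootstrap weights from one Generator seeded per call,
    so repeated calls with the same seed give identical draws
    (hypothetical helper, not the package's current code)."""
    rng = np.random.default_rng(seed)
    return rng.choice([-1.0, 1.0], size=(n_clusters, B))

w1 = draw_rademacher_weights(10, 99, seed=12)
w2 = draw_rademacher_weights(10, 99, seed=12)
```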
Overall, follow exposition in MacKinnon (2022, Econometrics) - adjust for new bootstrap types.
Should we allow for no cluster input, and in which case, does it just turn into a regular wild bootstrap? Or should there be an error?
Both are more or less the standard in the literature. For compatibility with choices made via statsmodels / linearmodels, add the option to overwrite these default values.
Note that for p-values, the choice of small sample corrections does not have an impact (the ssc's cancel out when computing p-values by counting cases of ssc x t_stat < ssc x t_boot). They only affect the internally computed and reported non-bootstrapped test statistic.
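A quick numeric check of the cancellation claim — scaling both the original and the bootstrapped t-statistics by any positive constant leaves the counted p-value unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)
t_stat = 1.7                           # non-bootstrapped test statistic
t_boot = rng.standard_normal(9999)     # stand-in for bootstrapped t-statistics
ssc = 1.2345                           # any common small-sample correction factor

p_raw = np.mean(np.abs(t_boot) > np.abs(t_stat))
p_ssc = np.mean(np.abs(ssc * t_boot) > np.abs(ssc * t_stat))
# ssc multiplies both sides of the comparison, so it cancels.
```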
In `__init__`, you need to specify `B`, but in order to run the bootstrap, you need to run `get_weights`, which overrides the attribute. Should `B` not be specified then?
A lot of the data inputs you have in the `boot_algo3` function in R assume a dataframe. Should we just do the same and assume pandas DataFrames in the Python version? We can then turn them into numpy arrays inside the function for performance.
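A sketch of that boundary: accept pandas objects in the public API and convert once to NumPy arrays before the numerical core (the function name is illustrative):

```python
import numpy as np
import pandas as pd

def to_arrays(X, y, cluster):
    """Accept pandas objects at the API boundary, convert once to
    NumPy arrays for the numerical core (design sketch, not package code)."""
    X_arr = np.asarray(X, dtype=float)          # design matrix as 2-D float array
    y_arr = np.asarray(y, dtype=float).ravel()  # outcome as a flat vector
    cluster_arr = np.asarray(cluster)           # cluster ids, dtype preserved
    return X_arr, y_arr, cluster_arr

df = pd.DataFrame({"x1": [1.0, 2.0, 3.0], "y": [0.1, 0.2, 0.3], "g": [1, 1, 2]})
X, y, g = to_arrays(df[["x1"]], df["y"], df["g"])
```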
... as in classical AB test setups without covariates. The same problem exists for fwildclusterboot.
In a first test of Python vs R, it looks like the '31' bootstrap t-statistics match exactly under full enumeration (as they should), while the '11' do not:
See here
I suppose that the R version is correct, as it's tested against WildBootTests.jl and produces matching statistics.
```
>>> df (Python)
           WCR11         WCR31      WCU11     WCU31
0  -5.569437e-07  7.119511e-16  -0.547569  0.079819
1  -3.286834e+00 -3.286832e+00  -4.243206 -3.085644
2   1.715372e-02  1.715392e-02  -0.166832  0.053646
3  -2.881170e+00 -2.881171e+00  -1.780821 -3.045997
4   2.881170e+00  2.881171e+00   1.780821  3.045997
5  -1.715372e-02 -1.715392e-02   0.166832 -0.053646
6   3.286834e+00  3.286832e+00   4.243206  3.085644
7   5.569437e-07 -7.119511e-16   0.547569 -0.079819

>>> r_df (R)
   Unnamed: 0     WCR11          WCR31      WCU11     WCU31
0           1 -0.620824  -1.017073e-16  -0.547213  0.079819
1           2 -3.852912  -3.286832e+00  -4.198954 -3.085644
2           3 -0.196266   1.715392e-02  -0.171808  0.053646
3           4 -1.630894  -2.881171e+00  -1.782077 -3.045997
4           5  1.630894   2.881171e+00   1.782077  3.045997
5           6  0.196266  -1.715392e-02   0.171808 -0.053646
6           7  3.852912   3.286832e+00   4.198954  3.085644
7           8  0.620824   1.017073e-16   0.547213 -0.079819
```
Currently, we report t-stats with more than 10 digits after the decimal point:
`# 0 X1 [-1.0530803154504016] 0.308831`
#TODO: Is this just a permutation function?

```r
permutations(
  n = 2,
  r = N_G_bootcluster,
  v = c(1, -1),
  repeats.allowed = TRUE
)
```
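It effectively is: assuming this is `gtools::permutations`, the call enumerates all 2^G sign vectors over c(1, -1), i.e. the full enumeration of Rademacher weight draws. A Python equivalent (up to row ordering) via `itertools.product`:

```python
import itertools
import numpy as np

N_G_bootcluster = 3  # example number of bootstrap clusters

# All 2^G sign assignments over {1, -1} — the full enumeration of
# Rademacher weights, matching the quoted R call up to row ordering.
v = np.array(list(itertools.product([1, -1], repeat=N_G_bootcluster)))
```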
... as the package is now "live" on pypi!