Comments (11)
Thanks. statsmodels would definitely be interested in picking up formulaic (see statsmodels/statsmodels#6858). We have models that have formulas in R but that we can't support with patsy, which has proven hard to extend.
from formulaic.
Hi again @bashtage ,
The formula grammar should be largely identical, however there are transformations that I have not implemented. For example, I have not implemented support for arbitrary contrast matrices. Support can easily be added for these things, I just haven't needed them for my use-cases. I am not aware of any limitations from a framework perspective. If something in particular is valuable to you, I can definitely look at including it.
The API is different, obviously, so you won't be able to just import formulaic and have everything work transparently.
I put some time today into documentation. It doesn't render quite as well as it will in mkdocs, but you can see a comparison of the grammars here: https://github.com/matthewwardrop/formulaic/blob/a19ebfd77c8129fd5827218588053432b489e7a9/docs/basic/formulas.md .
Does formulaic avoid eval and allow pickling of formula information?
@josef-pkt eval is avoided when there are no Python transforms involved (e.g. https://github.com/matthewwardrop/formulaic/blob/master/formulaic/materializers/base.py#L283). As soon as Python transforms are involved, an eval will occur: the code is compiled into an abstract syntax tree, modified slightly, and evaluated with eval. Restricting what can be evaluated would be tricky if your transforms require nested Python objects, but it seems doable. Perhaps there is motivation enough for a 'safe mode'?
As for pickling... yes. Formulas and ModelSpec instances can be pickled.
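To make the 'safe mode' idea concrete, here is a minimal sketch of restricting what an evaluated expression may contain. It is not formulaic's implementation: the function name safe_eval, the node whitelist, and the choice to allow only bare-name function calls are all assumptions made for illustration, using only the standard-library ast module.

```python
import ast
import math

# Hypothetical whitelist of syntax permitted inside a transform expression.
_ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Call, ast.Name, ast.Load,
    ast.Constant, ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow,
    ast.USub, ast.UAdd,
)


def safe_eval(expr, data, functions):
    """Evaluate `expr` using only names in `data` and calls to `functions`."""
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        # Reject attribute access, subscripts, lambdas, keywords, etc.
        if not isinstance(node, _ALLOWED_NODES):
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
        # Calls must target a whitelisted function referenced by bare name.
        if isinstance(node, ast.Call):
            if not (isinstance(node.func, ast.Name) and node.func.id in functions):
                raise ValueError("disallowed function call")
        # Every name must come from the data or the function whitelist.
        if isinstance(node, ast.Name):
            if node.id not in data and node.id not in functions:
                raise ValueError(f"unknown name: {node.id}")
    code = compile(tree, "<formula>", "eval")
    # An empty __builtins__ keeps open(), __import__(), etc. out of reach.
    return eval(code, {"__builtins__": {}}, {**functions, **data})


result = safe_eval("log(x) + 1", {"x": 1.0}, {"log": math.log})
```

A whitelist like this rejects attribute access outright, which is the kind of limitation on nested Python objects that the comment above anticipates; a real safe mode would need to decide how much of that to permit.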
There was recently a Stack Overflow question, which I can't find anymore, asking for a safe eval in patsy for cases where users of a program provide the formula string.
I never really use eval, but something that restricts what can be run in formula strings would be good to have for users who get formula strings from someone else.
In general, I think avoiding eval or having a safe eval would be good to have, even if it doesn't allow all features.
(E.g., I think it's weird that R formulas take the data from the outer scope when users don't provide it explicitly. That's too much magic for my taste.)
Aye. This "magic" is used by formulaic in its model_matrix(..) sugar method, largely for compatibility with expectations from patsy, but also because it is sometimes convenient. If you use Formula(...).get_model_matrix(..), this magic does not occur.
I'll add an issue for later thought about adding a safe mode!
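For readers unfamiliar with the outer-scope "magic" discussed above, the following is a rough sketch of how a function can pick up variables from its caller when no explicit context is given. The lookup helper is hypothetical and is not formulaic's code; it relies on sys._getframe, which is a CPython implementation detail.

```python
import sys


def lookup(name, context=None):
    """Resolve `name` from an explicit context, else from the caller's scope."""
    if context is not None:
        # Explicit mode: only the mapping that was passed in is consulted.
        return context[name]
    # "Magic" mode: inspect the calling frame's local, then global, variables.
    frame = sys._getframe(1)
    if name in frame.f_locals:
        return frame.f_locals[name]
    return frame.f_globals[name]


def demo():
    scale = 10  # never passed to lookup(), yet found via the caller's frame
    return lookup("scale")
```

With an explicit context, a name missing from the mapping raises KeyError instead of silently resolving to whatever happens to be in scope, which is the more predictable behaviour being asked for in this thread.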
@bashtage I'm hoping to wrap up a new major release shortly :). Did you find any other issues in your testing?
I'm going to close this one out for now. Formulaic is heading toward stability now, and I'm hopeful a 1.0 release can be made before the end of the year (if not sooner). I'll create a separate issue to document the migration from patsy.
Not yet! I'll bump it up the priority list.