Comments (3)
This is related to an issue on patsy pydata/patsy#156, where @FerusAndBeyond points out that running:
from formulaic import model_matrix
model_matrix('sys.exit()', ...)
will kill the running Python service. Is there a way to protect formulaic from code injection?
from formulaic.
At the moment, formulae does not use eval() (see here), but it is still vulnerable to sys.exit() calls if you have sys loaded in the environment where design_matrices() is called.
I do think we could port something similar to Formulaic and attempt to ban people calling functions from certain modules.
EDIT
You could have the following scenarios:
- The sys module is not loaded
  model_matrix("sys.exit()")
  No problem here (I think?). It is going to raise an error because sys is not going to be found.
- The sys module is loaded
  model_matrix("sys.exit()")
  The Python service is killed.
- The sys module is loaded but the developer prevented it from being used
  formulaic.disable_modules(["sys"])
  model_matrix("sys.exit()")
  The Python service is not killed. An error is raised because you can't call anything from the sys module.
Does it make sense?
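To make the three scenarios concrete, here is a minimal stdlib-only sketch. It does not use formulaic at all: `evaluate`, `outcome`, and the `disabled_modules` argument are hypothetical names invented for illustration, and the plain `eval` call only stands in for whatever the formula engine does when it evaluates a term against a context.

```python
import sys
import types


def evaluate(expr, context, disabled_modules=()):
    """Evaluate expr against context, hiding modules whose name is disabled.

    Hypothetical helper: this only mimics the proposed behaviour and is not
    part of formulaic or formulae.
    """
    blocked = set(disabled_modules)
    visible = {
        name: value
        for name, value in context.items()
        if not (isinstance(value, types.ModuleType) and value.__name__ in blocked)
    }
    # Empty globals, so only the names in `visible` (plus builtins) resolve.
    return eval(expr, {}, visible)


def outcome(expr, context, disabled_modules=()):
    """Report what happened instead of letting SystemExit stop the process."""
    try:
        evaluate(expr, context, disabled_modules)
    except SystemExit:
        return "SystemExit"
    except NameError:
        return "NameError"
    return "ok"


# Scenario 1: sys is not in the evaluation context.
not_loaded = outcome("sys.exit()", {})
# Scenario 2: sys is in the context, so the call would end the process.
loaded = outcome("sys.exit()", {"sys": sys})
# Scenario 3: sys is in the context but has been disabled.
disabled = outcome("sys.exit()", {"sys": sys}, disabled_modules=["sys"])

print(not_loaded, loaded, disabled)
```

Note that filtering by module name is easy to sidestep in this sketch: a context that contains the function itself (for example the result of `from sys import exit`) would still pass the filter, so this kind of ban narrows the problem rather than closing it.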
from formulaic.
Yeah, that is correct!
It should also be noted that model_matrix is a user-convenience function. Software libraries should use Formula(...).get_model_matrix, which requires you to explicitly nominate the evaluation context and does not add the current evaluation context by default. You can simulate that in model_matrix by passing None or the evaluation context dictionary via model_matrix(context=...), which then replaces the automatically imputed context.
I don't see avoiding eval here to be that useful, since we're not actually adding any security if we will call the functions represented in the string anyway. I think the bigger thing is getting users familiar with the security risks, and using Formula().get_model_matrix() in library code.
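The difference between an automatically captured context and an explicitly nominated one can be sketched with the standard library alone. This is not formulaic's implementation: `convenience_eval` is a hypothetical stand-in, and treating an integer `context` as "how many frames up to look" is an assumption made for this example.

```python
import sys


def convenience_eval(expr, context=0):
    """Hypothetical stand-in for a convenience entry point.

    An integer context captures the namespace of the caller that many frames
    up; None or a dict means only the given names are visible.
    """
    if isinstance(context, int):
        frame = sys._getframe(context + 1)
        namespace = dict(frame.f_globals)
        namespace.update(frame.f_locals)
    else:
        namespace = dict(context or {})
    return eval(expr, {}, namespace)


def demo():
    secret = 41
    # Default: the caller's local variables leak into the evaluation.
    implicit = convenience_eval("secret + 1")
    # context=None: nothing from the caller is visible.
    try:
        explicit = convenience_eval("secret + 1", context=None)
    except NameError:
        explicit = "NameError"
    # Explicit dict: only the nominated names are visible.
    nominated = convenience_eval("secret + 1", context={"secret": 1})
    return implicit, explicit, nominated


print(demo())
```

With an explicit context, whatever happens to be imported in the calling module (such as sys) is not reachable from the evaluated string unless the caller chooses to pass it in.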
from formulaic.