vianneymi / monggregate
Library to make MongoDB aggregation framework and pipelines easy to use in Python.
Home Page: https://vianneymi.github.io/monggregate/
License: MIT License
Cf #68
This should only be required when db is set
Take into account that this is not an apples-to-apples comparison, as
SortByCount currently only takes a str (field path) when it could also take expressions.
Ex:
{ $sortByCount: { $mergeObjects: [ "$employee", "$business" ] } }
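A minimal sketch of how a stage builder could accept both forms. The function name `sort_by_count` is hypothetical and is not monggregate's API. It assumes that a bare field name should be prefixed with `$`, and that a dict is an operator expression to pass through unchanged. MongoDB's documentation states that `$sortByCount` accepts any expression except a document literal, so the usage example relies on an operator expression.

```python
def sort_by_count(by):
    """Build a $sortByCount stage from a field path or an expression (hypothetical helper)."""
    if isinstance(by, str):
        # Field paths must start with "$"; add the prefix when it is missing.
        path = by if by.startswith("$") else "$" + by
        return {"$sortByCount": path}
    if isinstance(by, dict):
        # Assumed to be an operator expression; it is passed through as-is.
        return {"$sortByCount": by}
    raise TypeError("by must be a field path (str) or an expression (dict)")


print(sort_by_count("genres"))
print(sort_by_count({"$toLower": "$genres"}))
```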
This breaks
pipeline = Pipeline(collection="movies")
pipeline.match(
    query={
        "title": "A Star Is Born"
    }
).sort(
    by="year",
    ascending=True
)
while it shouldn't.
This is because a branch is missing in validates_boolens:
elif descending is None and isinstance(ascending, bool):
    descending = not ascending
When passing a field name with by and a negative direction with descending=True, the sort is still performed in ascending order
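Both sort issues above come down to resolving the `ascending` and `descending` flags into a single direction. The sketch below shows one way to do that outside of any validator. The names `resolve_direction` and `sort_stage` are hypothetical, and the default of ascending when neither flag is given is an assumption.

```python
def resolve_direction(ascending=None, descending=None):
    """Return 1 (ascending) or -1 (descending) from two optional boolean flags."""
    if ascending is None and descending is None:
        ascending = True  # assumed default
    elif ascending is None:
        ascending = not descending
    elif descending is not None and ascending == descending:
        raise ValueError("ascending and descending cannot have the same value")
    return 1 if ascending else -1


def sort_stage(by, ascending=None, descending=None):
    """Build a $sort stage for a single field (hypothetical helper)."""
    return {"$sort": {by: resolve_direction(ascending, descending)}}


print(sort_stage("year", ascending=True))
print(sort_stage("year", descending=True))
```

Handling each flag on its own covers the case where only `ascending` is passed as well as the case where only `descending=True` is passed.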
# TODO: Eventually, add support for merging multiple collections at once with union_with, using the prototype below
# def union_with(
#     self,
#     *,
#     collection:str|None=None,
#     pipeline:list[dict]|None=None,
#     collections:list[str]|None=None,
#     pipelines:list[list[dict]]|None=None,
#     collection_pipeline_pairs:list[tuple[str, list[dict]]]|None=None,
# )
I'll use this issue to address multiple issues that will be referenced below.
pymongo should be required only for pipelines to be directly executable.
Currently, instantiating a pipeline in an environment that does not contain pymongo crashes.
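One common way to make a dependency optional is to import it lazily, only in the code path that needs it. The sketch below illustrates this with a stripped-down `Pipeline` class that is a hypothetical stand-in, not monggregate's actual class. The `uri` and `db` parameters and the `export` and `run` methods are assumptions made for the example.

```python
class Pipeline:
    """Simplified stand-in: building stages never touches pymongo."""

    def __init__(self, collection, uri=None, db=None):
        self.collection = collection
        self.uri = uri
        self.db = db
        self.stages = []

    def match(self, query):
        self.stages.append({"$match": query})
        return self

    def export(self):
        """Return the stages as plain dicts, usable with any driver."""
        return list(self.stages)

    def run(self):
        """Execute the pipeline; pymongo is imported only here."""
        if self.uri is None or self.db is None:
            raise ValueError("uri and db must be set to execute the pipeline")
        try:
            import pymongo
        except ImportError as exc:
            raise ImportError("pymongo is required to execute a pipeline") from exc
        client = pymongo.MongoClient(self.uri)
        return list(client[self.db][self.collection].aggregate(self.export()))


stages = Pipeline(collection="movies").match({"title": "A Star Is Born"}).export()
print(stages)
```

With this layout, instantiating and exporting a pipeline works without pymongo installed, and the import error only appears when `run` is called.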
This will make it possible to build more complex aggregations by referencing other interface objects that will automatically be translated into statements.
E.g. passing a pipeline instance to a lookup statement, or an accumulator operator to a group statement.
Make sure that the requirements really capture the package requirements and that the package can be built in new environments with packages respecting the ranges or pinned versions described in the requirements.
If some users need the package for older versions of Python, I might consider creating a backport. I just need to know if it is worth "the effort".
References
Implement a subset of operators to cover at least what is used in the tutorial below:
https://www.mongodb.com/developer/languages/python/python-quickstart-aggregation/#your-first-aggregation-pipeline
That is :
Currently, only the stage classes are tested but not their mirror methods in the pipeline class
For attributes that are typed as being expressions, we should provide an alternative type for simple cases and to enforce validation.
Ex:
class Limit(BaseModel):
    value : Expression = 0
cannot have a constraint on value because of the Expression type.
However, this should work:
class Limit(BaseModel):
    value : int | Expression = Field(gt=0)
The check on the combination of arguments is wrong:
# Check combination of arguments
if right and left_on and right_on and \
    not(let or pipeline is None):
    # in a subquery to select all on the foreign collection
    # pipeline can be an empty list which is falsy
    type_ = "simple"
elif let and pipeline is not None and not(left_on or right_on):
    type_ = "uncorrelated"
elif let and pipeline is None and left_on and right_on:
    type_ = "correlated"
else:
    raise TypeError("Incompatible combination of arguments")
return type_
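For comparison, here is a sketch of one possible reading of the intended check, based on the three $lookup forms in MongoDB's documentation: equality match (localField and foreignField only), pipeline subquery (pipeline with optional let), and concise correlated subquery (localField, foreignField and pipeline together). The function name `lookup_type` is hypothetical, and mapping these forms to the labels "simple", "uncorrelated" and "correlated" is an assumption, not the library's confirmed behavior.

```python
def lookup_type(right=None, left_on=None, right_on=None, let=None, pipeline=None):
    """Classify a combination of lookup arguments (illustrative sketch)."""
    has_keys = left_on is not None and right_on is not None
    has_partial_keys = (left_on is None) != (right_on is None)
    # An empty list is a valid pipeline, so compare against None, not truthiness.
    has_pipeline = pipeline is not None

    if has_partial_keys:
        raise TypeError("left_on and right_on must be provided together")
    if has_keys and not has_pipeline:
        if let or not right:
            raise TypeError("Incompatible combination of arguments")
        return "simple"
    if has_keys and has_pipeline:
        return "correlated"
    if has_pipeline:
        return "uncorrelated"
    raise TypeError("Incompatible combination of arguments")


print(lookup_type(right="comments", left_on="_id", right_on="movie_id"))
print(lookup_type(right="comments", pipeline=[]))
```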
The stage classes leveraging pydantic use field aliases for attribute names. We should also allow this in their mirror pipeline functions
This is for testing pull requests on GitHub
pydantic v2 broke autocompletion in the package because of the way BaseModel is imported now
Add tests to at least ensure that the statements are those expected