Comments (5)
Yeah, so if we encountered a SortExpr in expr_to_sql function, does that mean unexpected location ? I think for a entire query case, the code will probably never reach that.
I think so
But
col("a").sort(true, true)
will be only "a", which doesn't match the sql semantic? 🤔
Creating SQL from programatically created Expr
s is an interesting usecase.
it seems like the issue is that Expr
in datafusion can come from both ast::Expr
and ast::OrderByExpr
Currently we have
datafusion/datafusion/sql/src/unparser/expr.rs
Lines 47 to 50 in dd56837
Maybe we could change the signature to something like
/// DataFusion's Exprs can represent either an `Expr` or an `OrderByExpr`
enum Unparsed {
// SQL Expression
Expr(ast::Expr),
// SQL ORDER BY expression (e.g. `col ASC NULLS FIRST`)
OrderByExpr(ast::OrderByExpr),
}
pub fn expr_to_sql(expr: &Expr) -> Result<Unparsed> {
...
}
from arrow-datafusion.
@kevinmingtarja @alamb What do you think? 🤔
from arrow-datafusion.
I don't think we should change SQL parser
In general, ORDER BY ...
isn't really a general Expr (it can't appear in arbitrary locations in SQL, only specific ones)
For example, this isn't valid;
SELECT x ORDER BY y FROM foo;
ORDER BY can appear in certain places like
SELECT x FROM foo ORDER BY y
SELECT agg(x ORDER BY y) FROM foo;
SELECT agg(..) OVER (ORDER BY y) FROM foo
I think we should special case finding Expr::Sort
exprs in those locations.
Maybe the code would be clearer if we made the conversion code return an Error if it encountered a SortExpr in an unexpected location 🤔
from arrow-datafusion.
Yeah I was kinda wondering about this too last time (#9726 (comment)).
I was thinking we don't change the SQL parser too.
Instead, I think given that the "sources" of the conversion to Expr
is not only ast::Expr
objects, having its return type be only ast::Expr
is perhaps too restrictive? As we've seen in this case. If I understand correctly, I think another case is Expr::AggregateFunction
, where we can't really convert it back to ast::Function as it's also not an enum item in ast::Expr
.
So what if we extend the return type of expr_to_sql
to more than ast::Expr
? I think it makes sense to extend it to all the "sources" of conversions from ast::*
to Expr
.
For example in here, we know that we are converting a ast::OrderByExpr
to a Expr::Sort
, so I think it makes sense that ast::OrderByExpr
should also be a valid return type for expr_to_sql
, as that is really the "source" data model of Expr::Sort
.
datafusion/datafusion/sql/src/expr/order_by.rs
Lines 24 to 36 in f8c623f
from arrow-datafusion.
Thank you all for your reply!
when turning the expression back into an entire query
Yeah, I was just a little curious why for Expr::Sort we ignore the ORDER BY information. It seems that for the entire query, it could address the Sort.
datafusion/datafusion/sql/src/unparser/plan.rs
Lines 222 to 231 in f8c623f
Maybe the code would be clearer if we made the conversion code return an Error if it encountered a SortExpr in an unexpected location 🤔
Yeah, so if we encountered a SortExpr
in expr_to_sql
function, does that mean unexpected location ? I think for a entire query case, the code will probably never reach that.
I know a use case would be just convert a Expr
to sql. But col("a").sort(true, true)
will be only "a", which doesn't match the sql semantic? 🤔
If I understand correctly, I think another case is Expr::AggregateFunction, where we can't really convert it back to [ast::Function]
(https://docs.rs/sqlparser/latest/sqlparser/ast/struct.Function.html#) as it's also not an enum item in ast::Expr.
ast::Function is an enum item in ast::Expr, see
https://github.com/sqlparser-rs/sqlparser-rs/blob/0b5722afbfb60c3e3a354bc7c7dc02cb7803b807/src/ast/mod.rs#L683-L684
So maybe the return type of expr_to_sql being ast::Expr is enough, given the Sort
will be addressed in a special way?
from arrow-datafusion.
Related Issues (20)
- Stop copying LogicalPlan and Exprs in `SingleDistinctToGroupBy` HOT 4
- Stop copying LogicalPlan and Exprs in `EliminateNestedUnion` HOT 1
- Enable bloom filters by default on read HOT 5
- Create fixed size list table with syntax <type name>[<length>] HOT 6
- Using `date_bin` with a time zone in a time range that contains daylight savings does not work HOT 3
- binary_op should be handled by sql_expr_to_logical_expr.. This was likely caused by a bug in DataFusion's code HOT 4
- [Epic] A Collection of Sort Based Optimizations HOT 5
- Implement a way to preserve partitioning through `UnionExec` without losing ordering
- DataFusion should support casting strings such as "4e7" to decimal HOT 2
- Optimized version of `SortPreservingMerge` that doesn't actually compare sort keys of the key ranges are ordered HOT 15
- Regression with `coalesce` expr_fn function: can't take multiple arguments
- INSERT INTO SQL failing on CSV-backed table HOT 3
- Unify SQL planning for `ORDER BY`, `HAVING`, `DISTINCT`, etc
- Unable to perform lead/lag built in functions on List and Struct data types HOT 1
- Enable `split_file_groups_by_statistics` by default HOT 3
- Avoid inlining non deterministic CTE HOT 2
- Make all SchemaProvider trait APIs async HOT 4
- Document timezone semantics HOT 2
- Schema incorrect after select over aggregate function that returns a different type than the input HOT 5
- clippy failure in main HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from arrow-datafusion.