Comments (5)
Yes, return_type_from_exprs doesn't help.
I mostly got around it, with what I think is good performance, by rewriting the query when there's a cast, so:
select * from foo where json_get(attributes, 'bar')::string='ham'
Will be rewritten to:
select * from foo where json_get_str(attributes, 'bar')='ham'
The main requirement I have now is that the error when you do try to compare a union is more helpful and less ugly.
from arrow-datafusion.
I took a quick look and found that your return_type is Union, but if I understand correctly, you should compute the return type based on the args. For example, in your test json_get(json_data, 'foo'), if you compute the return_type based on foo, you return Int, and then you will not hit the error.
The error occurs because Union is not yet supported in comparison_coercion.
@jayzhan211 that doesn't work, since the argument types don't tell you what type will be returned.
e.g.:
- if the value in column foo is {"x": "abc"}, then json_get(foo, 'x') will return a string
- but if the value in column foo is {"x": 123}, then json_get(foo, 'x') will return an integer
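To make the point concrete, here is a minimal sketch in plain Rust. It is a toy model only: JsonValue, json_get and type_name below are hypothetical stand-ins, not the DataFusion or Arrow API. The call and its argument types are identical for both rows, yet the type of the result differs.

```rust
// Toy model, not the DataFusion API: a JSON value that is either an int or a string.
#[derive(Debug, Clone, PartialEq)]
enum JsonValue {
    Int(i64),
    Str(String),
}

// Hypothetical stand-in for json_get: look up `key` in one row's JSON object.
fn json_get<'a>(row: &'a [(&'a str, JsonValue)], key: &str) -> Option<&'a JsonValue> {
    row.iter().find(|(k, _)| *k == key).map(|(_, v)| v)
}

// Name of the runtime type of a value.
fn type_name(value: &JsonValue) -> &'static str {
    match value {
        JsonValue::Int(_) => "int",
        JsonValue::Str(_) => "string",
    }
}

fn main() {
    // Same column, same key, same argument types; only the stored data differs.
    let row1 = [("x", JsonValue::Str("abc".to_string()))];
    let row2 = [("x", JsonValue::Int(123))];

    for row in [&row1, &row2] {
        let value = json_get(row, "x").expect("key is present in both rows");
        println!("json_get(foo, 'x') returned a {}", type_name(value));
    }
}
```

Nothing available at planning time (the column type and the literal 'x') distinguishes the two rows, which is why the return type cannot be narrowed from the arguments alone.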
However, I think I have a workaround: I'm requiring a cast, so you have to do json_get(foo, 'x')::string or json_get(foo, 'x')::int, then I'm using a FunctionRewrite to rewrite the function from json_get to json_get_str or json_get_int.
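The shape of that rewrite can be sketched roughly as follows. This is a self-contained toy, assuming a made-up Expr tree and a hypothetical rewrite function; it does not use the real FunctionRewrite trait or DataFusion's Expr type, and only illustrates the idea of folding a cast around json_get into a typed function call.

```rust
// Toy expression tree, not DataFusion's Expr.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Type {
    Str,
    Int,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(String),
    Call { name: String, args: Vec<Expr> },
    Cast { expr: Box<Expr>, to: Type },
}

// Hypothetical rewrite: cast(json_get(..), T) becomes json_get_<T>(..).
// Every other expression is kept, with its children rewritten recursively.
fn rewrite(expr: Expr) -> Expr {
    match expr {
        Expr::Cast { expr, to } => {
            let inner = *expr;
            match inner {
                Expr::Call { name, args } if name == "json_get" => {
                    let typed = match to {
                        Type::Str => "json_get_str",
                        Type::Int => "json_get_int",
                    };
                    Expr::Call {
                        name: typed.to_string(),
                        args: args.into_iter().map(rewrite).collect(),
                    }
                }
                // A cast around anything else stays a cast.
                other => Expr::Cast { expr: Box::new(rewrite(other)), to },
            }
        }
        Expr::Call { name, args } => Expr::Call {
            name,
            args: args.into_iter().map(rewrite).collect(),
        },
        leaf => leaf,
    }
}

fn main() {
    // Models json_get(foo, 'x')::string
    let call = Expr::Call {
        name: "json_get".to_string(),
        args: vec![Expr::Column("foo".to_string()), Expr::Literal("x".to_string())],
    };
    let cast = Expr::Cast { expr: Box::new(call), to: Type::Str };
    println!("{:?}", rewrite(cast));
}
```

A json_get call without a surrounding cast is left untouched by this sketch, so it would still produce the union type and hit the comparison error.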
With that the only remaining issue is making the error less ugly and more informative.
@samuelcolvin Did you also consider return_type_from_exprs? If it does not work, I think we can either support Union in comparison_coercion or redesign ScalarUDFImpl to allow more customization of the return type.
> The main requirement I have now is that the error when you do try to compare a union is more helpful and less ugly.
Do you mean the error message in comparison_coercion?
called `Result::unwrap()` on an `Err` value: Plan("Cannot infer common argument type for comparison operation Union([(0, Field { name: \"null\", data_type: Boolean, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), (1, Field { name: \"bool\", data_type: Boolean, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }), (2, Field { name: \"int\", data_type: Int64, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }), (3, Field { name: \"float\", data_type: Float64, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }), (4, Field { name: \"string\", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }), (5, Field { name: \"array\", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }), (6, Field { name: \"object\", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} })], Sparse) = Int64")