Comments (14)
DataFusion 37.0.0 is released: #9682 🌮
from arrow-datafusion.
Really interested to see your thoughts on custom indices; it's something we're currently working on to improve the performance of listing tables.
from arrow-datafusion.
Aliases for `character_length` are defined in https://github.com/apache/arrow-datafusion/blob/4bd7c137e0e205140e273a7c25824c94b457c660/datafusion/functions/src/unicode/character_length.rs#L45 and are used during UDF registration at https://github.com/apache/arrow-datafusion/blob/4bd7c137e0e205140e273a7c25824c94b457c660/datafusion/core/src/execution/context/mod.rs#L2137
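To illustrate why a lookup for `LENGTH` can fail when only a function's primary name is registered, here is a minimal, self-contained sketch. It is not DataFusion's code: `Registry`, `ScalarFn`, `count_chars`, and both `register_*` methods are hypothetical names, and the alias list is an assumption made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a scalar UDF: a primary name plus aliases.
struct ScalarFn {
    name: &'static str,
    aliases: Vec<&'static str>,
    func: fn(&str) -> usize,
}

/// Hypothetical registry; NOT DataFusion's FunctionRegistry.
#[derive(Default)]
struct Registry {
    // Maps every registered name (primary or alias) to an index into `fns`.
    by_name: HashMap<String, usize>,
    fns: Vec<ScalarFn>,
}

impl Registry {
    /// Registers the function under its primary name only; aliases are skipped.
    fn register_primary_only(&mut self, f: ScalarFn) {
        let idx = self.fns.len();
        self.by_name.insert(f.name.to_string(), idx);
        self.fns.push(f);
    }

    /// Registers the function under its primary name and every alias.
    fn register_with_aliases(&mut self, f: ScalarFn) {
        let idx = self.fns.len();
        self.by_name.insert(f.name.to_string(), idx);
        for alias in &f.aliases {
            self.by_name.insert(alias.to_string(), idx);
        }
        self.fns.push(f);
    }

    /// Case-insensitive lookup, so `LENGTH` and `length` resolve alike.
    fn lookup(&self, name: &str) -> Option<&ScalarFn> {
        self.by_name.get(&name.to_lowercase()).map(|&i| &self.fns[i])
    }
}

/// Counts Unicode scalar values rather than bytes.
fn count_chars(s: &str) -> usize {
    s.chars().count()
}

/// Builds the illustrative function; the alias list here is an assumption.
fn character_length() -> ScalarFn {
    ScalarFn {
        name: "character_length",
        aliases: vec!["length", "char_length"],
        func: count_chars,
    }
}
```

With `register_primary_only`, `lookup("LENGTH")` returns `None`; with `register_with_aliases`, the same lookup resolves to `character_length`. A registration path that skips aliases would therefore make alias-based queries fail even though the function itself is present.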
from arrow-datafusion.
BTW something cool for next week -- @milenkovicm WASM UDFs #9326 / #9326 (comment) / https://github.com/milenkovicm/wasaffi
from arrow-datafusion.
Really interested to see your thoughts on custom indices; it's something we're currently working on to improve the performance of listing tables.
In case anyone else is following along, here are some items related to indexing:
While I had this on the brain, @matthewmturner, I also filed #9964, as I suspect others will be interested in helping make ListingTable faster too.
from arrow-datafusion.
Review queue
Arrow
- apache/arrow-rs#5525
- apache/arrow-rs#5557 (review)
- apache/arrow-rs#5578
- apache/arrow-rs#5575 (review)
- apache/arrow-rs#5566
- apache/arrow-rs#5554
DataFusion
from arrow-datafusion.
Arrow:
DataFusion
from arrow-datafusion.
Sorry for the location and possibly dumb question, but shouldn't all functions be exported here: https://github.com/apache/arrow-datafusion/blob/main/datafusion/functions/src/unicode/mod.rs#L138
I am probably using the library incorrectly but (for example) `LENGTH` will not be registered by the central context here: https://github.com/apache/arrow-datafusion/blob/2f550032140d42d1ee6d8ed86f7790766fa7302e/datafusion/core/src/execution/context/mod.rs#L1450
from arrow-datafusion.
Hi @seddonm1 👋
Sorry for the location and possibly dumb question, but shouldn't all functions be exported here: https://github.com/apache/arrow-datafusion/blob/main/datafusion/functions/src/unicode/mod.rs#L138
Yes, that is the intention.
I am probably using the library incorrectly but (for example) `LENGTH` will not be registered by the central context here:
I think `length` is registered as an alias of `character_length`.
from arrow-datafusion.
Thanks @alamb. If you look at the first link, I cannot see anywhere that the `length` alias is actually registered. My queries, which previously worked, started failing after this update. I haven't investigated whether other functions are also missing from registration.
from arrow-datafusion.
Arrow:
DataFusion
from arrow-datafusion.
Thanks @Omega359. I was thrown because the aliases are not registered here: https://github.com/apache/arrow-datafusion/blob/3ae029988754c3fd3eb000abd4b76e643b9cbc7b/datafusion/execution/src/registry.rs#L174
from arrow-datafusion.
Reviews:
@jayzhan211 is starting to pull aggregate functions out of the core: #9960
from arrow-datafusion.
Next week #10002
from arrow-datafusion.