Comments (8)
Glad to be able to help. One more pointer that may help you out, but note that this is my preferred direction that I'm trying to push (without much luck so far :) ). Despite your preference for polars, you should probably still check out the duckdb PR I linked above. The actual offline store implementation is written using ibis rather than duckdb directly. As ibis has a fairly good polars backend, you could easily reuse the same ibis implementation. In that case, the polars implementation might be just a single-line code change (probably not, but something close to that).
from feast.
@ion-elgreco Let me try to give you a quick rundown of how the integration might look. First of all, the concept closest to `backend` in feast is an `OfflineStore`, but offline store implementations don't just specify the sources and how they should be read; they also implement additional logic on top of that (a point-in-time join between the entity dataframe and the feature tables). That's why it's unlikely that we can have a `deltalake` offline store implementation, as there's no way to specify data transformations with `deltalake`. The closest thing to what you're looking for is probably a `polars` implementation (it's using delta-rs if I'm not mistaken, right?) or something like `duckdb` that can be extended to use delta-rs for working with delta tables (I already have a draft PR that adds duckdb minus delta, #3822).
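To make the point-in-time join mentioned above concrete, here is a minimal pure-Python sketch of the idea: for each row of the entity dataframe, pick the latest feature value whose timestamp is at or before the entity row's timestamp. The function name `point_in_time_join` and the tuple layouts are assumptions made for illustration, not Feast's API, and the sketch leaves out details a real offline store handles (such as TTL windows and multiple feature views).

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import datetime


def point_in_time_join(entity_rows, feature_rows):
    """Attach to each entity row the latest feature value at or before its timestamp.

    entity_rows:  list of (entity_id, event_timestamp)        (hypothetical layout)
    feature_rows: list of (entity_id, feature_timestamp, value)
    Returns a list of (entity_id, event_timestamp, value or None).
    """
    # Group feature rows per entity and sort each group by timestamp.
    by_entity = defaultdict(list)
    for entity_id, ts, value in feature_rows:
        by_entity[entity_id].append((ts, value))
    for rows in by_entity.values():
        rows.sort(key=lambda r: r[0])

    joined = []
    for entity_id, event_ts in entity_rows:
        rows = by_entity.get(entity_id, [])
        timestamps = [ts for ts, _ in rows]
        # Index of the first feature row strictly after event_ts.
        idx = bisect_right(timestamps, event_ts)
        # The row just before that index is the latest one not in the future.
        value = rows[idx - 1][1] if idx > 0 else None
        joined.append((entity_id, event_ts, value))
    return joined


entities = [("driver_1", datetime(2024, 1, 3)), ("driver_2", datetime(2024, 1, 1))]
features = [
    ("driver_1", datetime(2024, 1, 1), 0.5),
    ("driver_1", datetime(2024, 1, 2), 0.7),
    ("driver_1", datetime(2024, 1, 4), 0.9),  # in the future for driver_1's row
    ("driver_2", datetime(2024, 1, 2), 0.1),  # in the future for driver_2's row
]
result = point_in_time_join(entities, features)
```

This is the kind of logic a storage-only library cannot express on its own, which is why an engine such as polars or duckdb is needed on top of the delta reader.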
Feast has another concept called `DataSource`. This is how you specify the sources that offline stores will have to read later on.
The implementation you might be interested in is `FileSource`, as @sudohainguyen pointed out. It allows users to specify a file format, but currently only the parquet format is supported. So the first logical step should be to extend FileSource to allow users to specify delta as a file format. Once we have that, we can teach various offline store implementations (JVM-based or otherwise) how to read delta tables.
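As a rough sketch of that two-step plan (first let the source declare its format, then teach each store how to read it), the snippet below dispatches on a format field. Every name in it (`FileSourceSketch`, `register_reader`, `read_source`) is hypothetical and is not part of Feast; the readers only return a description string so the example stays dependency-free.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class FileSourceSketch:
    """Hypothetical stand-in for a file-based data source (not Feast's FileSource)."""
    path: str
    file_format: str = "parquet"


# Registry mapping a format name to the function that knows how to read it.
READERS: Dict[str, Callable[[str], str]] = {}


def register_reader(file_format: str):
    """Decorator that registers a reader function for one file format."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        READERS[file_format] = func
        return func
    return decorator


@register_reader("parquet")
def read_parquet(path: str) -> str:
    # A real store would load the parquet file here.
    return f"parquet scan of {path}"


@register_reader("delta")
def read_delta(path: str) -> str:
    # Assumption: a real store could call deltalake's DeltaTable(path) here.
    return f"delta scan of {path}"


def read_source(source: FileSourceSketch) -> str:
    """Pick the reader that matches the source's declared format."""
    try:
        reader = READERS[source.file_format]
    except KeyError:
        raise ValueError(f"unsupported file format: {source.file_format!r}")
    return reader(source.path)


plan = read_source(FileSourceSketch("s3://bucket/drivers", file_format="delta"))
```

With a layout like this, adding delta support to one more offline store would mean registering one more reader rather than changing the source definition.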
from feast.
As I understand it, you want to query a feature table stored in delta format; `spark` and `trino` can help.
Feast supports both of them.
from feast.
No, I would like to do this without a JVM application. The delta-rs Python bindings (deltalake) can be used to achieve this: https://github.com/delta-io/delta-rs
from feast.
Cool, we need some changes to extend FileSource to read delta tables. Do you mind contributing?
from feast.
Sure, if you can give me some pointers : )
from feast.
@tokoko gotcha, that helps! Since I mainly use Polars, I will look into adding that as an offline store and then add delta as an additional file source, using deltalake as a dependency.
Yup, Polars uses deltalake to read and write.
from feast.
Great explanation, @tokoko!
Looking forward to seeing the changes.
from feast.