Comments (20)

etirelli commented on August 26, 2024

@tokoko @franciscojavierarceo thank you for the write-up/spec. I have concerns around the overall vision for the project and our strategy to achieve that vision, in light of this RFE.
Kubeflow, as the open source MLOps platform, already supports and develops KFP, MR, and KServe. There is clearly a gap there for the feature store that Feast can fill. However, in that architecture, Feast would be a service to both KFP and KServe, while MR would be responsible for recording the model metadata necessary to connect serving/inference with the feature store.
Adding model metadata and an API into the feature store, as proposed here, seems to invert the dependency graph in that architecture and creates an overlap with the responsibilities of those other components. I am afraid that if we move forward with this RFC, it will cause confusion for users and make our integration with Kubeflow more complicated, creating friction from both a technical and an organizational perspective.
I would advise that we focus on the integration with these other components in the ecosystem instead.

tokoko commented on August 26, 2024

yup, an alias sounds fine. Another thing that came to mind: we will probably have to come up with an appropriate abstraction to put this functionality in (behind FeatureStore). I'm thinking of InferenceEngine or something similar, with LocalInferenceEngine and RemoteInferenceEngine as (probably the only) two implementations. That would also enable us to apply the security model to these methods. (fyi @dmartinol @redhatHameed)
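
A rough sketch of the shape I'm imagining (all class and method names here are hypothetical, just to make the idea concrete):

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class InferenceEngine(ABC):
    """Hypothetical abstraction behind FeatureStore.infer()/predict()."""

    @abstractmethod
    def infer(
        self, model_name: str, entity_rows: List[Dict[str, Any]]
    ) -> Dict[str, Any]:
        """Retrieve features for the entity rows and return predictions."""


class LocalInferenceEngine(InferenceEngine):
    """Would run the model in-process (e.g., a loaded onnx/sklearn artifact)."""

    def infer(self, model_name, entity_rows):
        raise NotImplementedError


class RemoteInferenceEngine(InferenceEngine):
    """Would delegate to a remote model endpoint (e.g., kserve/seldon)."""

    def infer(self, model_name, entity_rows):
        raise NotImplementedError
```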

tokoko commented on August 26, 2024

@dmartinol thanks for joining in. (This will probably be lengthy.) Let me start from your last point; it's probably crucial that we agree at a high level on what we're trying to build here. Yes, this is definitely going out of 'feature store' territory, and I'm also a bit uncomfortable about that. Having said that, if we find some utility in extending feast this way that's overlooked by other oss tools out there, we should still proceed. I know we might be committing a terrible case of feature creep here, happy to be called out if you think we are 😄.

The RFC linked above goes into this, but let me briefly recap here as well... I think there are two general ways to bridge feature retrieval and model execution:

  1. If a user already has a somewhat complex serving architecture in place (I'm thinking of a model mesh here, either kserve or seldon), then the best way to go about it would be to treat feature retrieval as a sort of transformer/preprocessor "model". The control plane of kserve (or seldon) would call feast first and the model endpoint after that, passing along the retrieved feature values. One idea I floated in the RFC to support this flow is to modify our http/grpc feature servers to make them OIP-compliant, effectively treating our FeatureService objects as models. If you have a feature service `example_service` in the registry, calling the `infer` endpoint of a feature server for the "model" `example_service` would return the necessary feature values (see the request sketch after this list). Since both kserve and seldon talk OIP, integration should be pretty simple. This flow is probably appropriate when an org has hundreds of models running and relying on kserve/seldon makes sense.

  2. This ticket comes into play for a simpler deployment flow, when you are able to spin up a model endpoint somewhere and also have a feast deployment in place, but have no easy way to bridge the gap between the two. feast itself takes on the role of a very simple control plane and tries to abstract model endpoints away from the caller. There are also some seemingly beneficial side-effects, mainly the ability to rely on a feast online store for the storage of precomputed (meaning batch) or cached (written out after online calls with some ttl) model results. Not sure if the same can be achieved with an external add-on, but that's also a possibility.
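
To illustrate (1): assuming a feature server on feast's default port 6566 and a feature service named `example_service` keyed by `driver_id` (both made up for this example), an OIP-style call might look like:

```python
import requests

# OIP (KServe V2) inference request against a feature server that treats
# the feature service "example_service" as a model. Entity keys go in as
# input tensors; feature values would come back as output tensors.
resp = requests.post(
    "http://localhost:6566/v2/models/example_service/infer",
    json={
        "inputs": [
            {
                "name": "driver_id",
                "shape": [2],
                "datatype": "INT64",
                "data": [1001, 1002],
            }
        ]
    },
)

# Hypothetical response shape, one output tensor per feature:
# {"model_name": "example_service",
#  "outputs": [{"name": "conv_rate", "shape": [2],
#               "datatype": "FP32", "data": [0.51, 0.87]}]}
print(resp.json())
```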

Regarding the model registry, if the registry indeed contains info about current deployment endpoints (not just trained artifacts), that might be a good idea, although I'm not sure where we would keep additional information regarding models (whether we want to enable caching or not, permissions, and so on) if we don't have individual Model objects in our registry and instead rely on a list of models supplied by the registry, IIUC.

tokoko commented on August 26, 2024

@franciscojavierarceo Your PR prompted me to draw this up. wdyt?

franciscojavierarceo commented on August 26, 2024

Yeah, I like this. I think we should expose both `infer` and `predict`, with `predict` just aliasing to `infer`. I think that will be intuitive for traditional MLEs working with tabular data.
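
Something as simple as this (hypothetical sketch) would do:

```python
class FeatureStore:
    def infer(self, feature_service: str, entity_rows: list) -> dict:
        """Retrieve features and run the model (sketch only)."""
        raise NotImplementedError

    # predict is just another name bound to the same method, so both
    # store.infer(...) and store.predict(...) behave identically.
    predict = infer
```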

franciscojavierarceo commented on August 26, 2024

Yeah, that makes sense.

We can use KServe's protobuf definition too.

franciscojavierarceo commented on August 26, 2024

FYI @rimolive

franciscojavierarceo commented on August 26, 2024

@HaoXuAI @shuchu

tokoko commented on August 26, 2024

> Yeah, that makes sense.
>
> We can use KServe's protobuf definition too.

I'm pretty sure that's the same thing as OIP; it just had a couple of name changes along the way. V2 Inference Protocol, KServe V2 Protocol, Open Inference Protocol... they're all the same thing.

franciscojavierarceo commented on August 26, 2024

Yup! I just wanted to provide the link to it.

dmartinol commented on August 26, 2024

@tokoko (thanks for sharing!) why don't you also look at the Model Registry capabilities?
This component provides a central repository to store and manage models, together with the serving endpoints for each deployment of a model artifact (I need to find the exact doc references for this).

In short, instead of abstracting an InferenceEngine, we could abstract a ModelRegistry and provide an initial implementation for this Kubeflow component, WDYT?
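
To make that concrete, the abstraction could be as thin as this sketch (names are hypothetical; a Kubeflow implementation would back it with the Model Registry REST API):

```python
from abc import ABC, abstractmethod
from typing import List


class ModelRegistry(ABC):
    """Hypothetical abstraction: resolves registered models to endpoints."""

    @abstractmethod
    def list_models(self) -> List[str]:
        """Return the names of registered models."""

    @abstractmethod
    def get_serving_endpoint(self, model_name: str) -> str:
        """Return the inference URL of the model's current deployment."""


class KubeflowModelRegistry(ModelRegistry):
    def __init__(self, base_url: str):
        self.base_url = base_url

    def list_models(self) -> List[str]:
        raise NotImplementedError  # would call the registry's REST API

    def get_serving_endpoint(self, model_name: str) -> str:
        raise NotImplementedError
```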

BTW: This requirement appears more aligned with an external add-on for handling both feature retrieval and model execution together, rather than a core component expected within the 'feature store'. Wouldn't this introduce too many (potentially unnecessary) dependencies?

franciscojavierarceo commented on August 26, 2024

> This ticket comes into play for a simpler deployment flow, when you are able to spin up a model endpoint somewhere and also have a feast deployment in place, but have no easy way to bridge the gap between the two. feast itself takes on the role of a very simple control plane and tries to abstract model endpoints away from the caller. There are also some seemingly beneficial side-effects, mainly the ability to rely on a feast online store for the storage of precomputed (meaning batch) or cached (written out after online calls with some ttl) model results. Not sure if the same can be achieved with an external add-on, but that's also a possibility.

This was one of my goals when drafting the RFC: (1) creating some standard abstractions that give guidance to the community, and (2) enabling faster retrieval of predictions.

franciscojavierarceo commented on August 26, 2024

FYI here's the link to the PR @tokoko mentioned: #4288

franciscojavierarceo commented on August 26, 2024

@etirelli thanks for the feedback here! Agreed, we shouldn't cause confusion, and we should keep responsibilities cleanly separated from the other components.

I don't want to invert the dependency graph, but I do want to enable faster retrieval of model predictions, and I think there's a way to accomplish both. I'll do more research and get back.

tokoko commented on August 26, 2024

Maybe the way to go without blurring the lines too much would be to somehow build this as a 3rd-party tool (inside the feast-dev organization, if possible???). That was actually my plan a while ago; I already have a repository named `hefeasto` internally 😄.

We could base that 3rd-party tool on the existing FeatureStore interface, or even expose some new abstractions if needed (for storing cached values in the online store, checking whether the cache is stale or not, etc...). Some of those new abstractions might even prove useful while integrating with kserve or seldon.

tokoko commented on August 26, 2024

To be clear, by better integration with kserve I mean that the flow on the kserve side, instead of looking like this: get_features_from_feast -> get_model_prediction_from_model_endpoint, might look more like this: check_if_feast_has_cached_result -(if not)-> get_features_from_feast -> get_model_prediction_from_model_endpoint -> update_feast_cache.
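
In illustrative Python (each stub below corresponds to a step named in the flow above; real implementations would hit the feast online store and the model endpoint):

```python
def check_if_feast_has_cached_result(entity_row):
    return None  # pretend cache miss; would read from the online store


def get_features_from_feast(entity_row):
    return {"conv_rate": 0.85}  # placeholder feature values


def get_model_prediction_from_model_endpoint(features):
    return 0.42  # placeholder; would call the kserve/seldon endpoint


def update_feast_cache(entity_row, prediction):
    pass  # would write the prediction back with a ttl


def infer(entity_row):
    cached = check_if_feast_has_cached_result(entity_row)
    if cached is not None:
        return cached
    features = get_features_from_feast(entity_row)
    prediction = get_model_prediction_from_model_endpoint(features)
    update_feast_cache(entity_row, prediction)
    return prediction


print(infer({"driver_id": 1001}))  # -> 0.42
```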

franciscojavierarceo commented on August 26, 2024

So I'm going to write a new spec that starts with two goals:

  1. Batch Models: Serving Pre-computed Predictions for some entity
  2. Cached Online Models: Caching predictions and updating them only when data sources change

This requires:

a. Retrieving a precomputed score from an online database (see the sketch after this list)
b. Recomputing the score somehow (e.g., calling a KServe endpoint) and retrieving the computed score from an online database
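
For (a), serving a precomputed score is just ordinary Feast retrieval, assuming a hypothetical feature view `model_predictions` that batch scores get materialized into:

```python
from feast import FeatureStore

store = FeatureStore(repo_path=".")

# Read the precomputed prediction for one entity from the online store.
result = store.get_online_features(
    features=["model_predictions:score"],
    entity_rows=[{"user_id": 1001}],
).to_dict()

print(result["score"])  # e.g. [0.73]
```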

I think that means we can do this in 3 ways:

  1. KServe orchestrates the call to Feast and the KServe endpoint
    • This is how the KServe/Feast demo operates today but it's not meant for Batch Models
      • Modifying it to support Batch Models is straightforward, but it'd basically just be a call to Feast
  2. Feast orchestrates
    • This is similar to my PR (if instead of an ODFV, we actually called KServe) and satisfies (1) and (2)
  3. New library to handle orchestration
    • This would be a lightweight library that would basically do what (2) does, but outside of Feast

There are pros and cons to each, and it'd be good to work through them with the Kubeflow community to come to a consensus.

I think the recommended solution may end up depending on the needs of the users. At Affirm, we wanted very low latency since we were serving a customer checkout, so an additional network hop was suboptimal and calling Feast first made more sense. Having KServe call Feast requires an additional RPC, but it's the more intuitive architecture and makes for a cleaner API call.

franciscojavierarceo commented on August 26, 2024

I'm going to create a new issue outlining my note above so we can discuss this with the Feast community there.

franciscojavierarceo commented on August 26, 2024

@tokoko this is a useful blog post https://medium.com/marvelous-mlops/getting-started-with-databricks-feature-serving-bf590b318c31

franciscojavierarceo commented on August 26, 2024

And it has this diagram:

[diagram from the blog post]
