Support easier feature serving and model serving with KServe

franciscojavierarceo commented on August 18, 2024

Support easier feature serving and model serving with KServe

from feast.

Comments (12)

franciscojavierarceo commented on August 18, 2024

I'm not particularly familiar with the second scenario you outlined, but I do agree that Feast should be a toolbox that can support both scenarios. That said, given my previous experience, I'm going to focus my contributions to the RFC on case (1), since it is not outlined today and it is the one I've encountered most frequently (my suspicion is that it's the pattern most practitioners actually need).

from feast.

franciscojavierarceo commented on August 18, 2024

Awesome! I'll review this in more detail later. Do you want to collaborate on the doc? Feel free to add your name and suggest some changes.

from feast.

tokoko commented on August 18, 2024

I started the solution section, please look through it when you get the chance, especially the part at the end under //TODO.

from feast.

shuchu commented on August 18, 2024

Thank you, @tokoko. I take back what I said before. I agree with you that it's better for us to keep Feast neutral.

Recently, I have been working on Kubeflow for other purposes. The Feast documentation on Kubeflow's website seems quite old. Let me see if I can update it. I will let you and @franciscojavierarceo know where things stand with Kubeflow.

from feast.

tokoko commented on August 18, 2024

@franciscojavierarceo thanks for kickstarting this. As I said on the call, I have some concrete ideas about changes in Feast that could make this possible, but before we go there, let me say a couple of things about the differences between the approaches (preprocessor vs transformer, if I'm getting it right). Although I probably lean towards your point of view, I don't really think we can recommend either approach over the other. The way I think about it, the approach taken usually depends on the user's existing ML serving practices and infrastructure. 1) When the infra mostly consists of one-off independent model services, it makes sense to simply add another layer in front that takes care of feature store communication. 2) Alternatively, if you're already heavily invested in a web of composable transformers and models depending on other models (kserve, seldon and so on), it makes a lot of sense to treat feature retrieval as just another transformer. I think we should try to create a toolbox in Feast that applies to both scenarios.

from feast.

tokoko commented on August 18, 2024

Sure, I'm right there with you, that's my experience as well. Another point is that I think we should aim to integrate with the Open Inference Protocol (OIP) rather than any particular inference server. My understanding is that it's based on the kserve v2 protocol and is the closest thing to an industry standard that there is. It's also pretty well-defined (both HTTP and gRPC) with a number of client implementations we could use; triton has a good set of HTTP and gRPC Python clients for it, for example.

from feast.

tokoko commented on August 18, 2024

@franciscojavierarceo I have been working on this internally and came up with a draft middleware implementation. Let me know if you're thinking along those lines as well.

For simpler model deployments (without a model mesh) we should have a way for users to deploy a middleware service that wraps both Feast and the actual model server. I think we can assume that the model server needs to be OIP-compatible, and we can go a step further and make our own middleware expose an OIP interface as well. The difference is that the OIP model server will expect actual features as inputs, while our middleware will only expect entity values and required request features (if any). Here's my very detailed uml for it :). Making the middleware server expose OIP will also simplify things for clients, as they won't need to use Feast directly and can rely on standard protocols or existing client libraries (tritonclient is the best one for OIP, I think).
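The flow described above could be sketched roughly as follows. This is not the draft implementation mentioned in this comment; it is a stand-alone illustration in which the feature lookup and the model call are injected as plain callables. `FeatureMiddleware`, `fake_lookup`, `fake_model`, the `ONLINE` table, and all feature names are hypothetical stand-ins for Feast's online store and a real OIP model server. Request features are left out, and every feature is assumed to be numeric.

```python
from typing import Any, Callable

# Stand-in signatures (assumptions): a real deployment would back these with
# Feast online retrieval and an OIP HTTP/gRPC client respectively.
FeatureLookup = Callable[[list[dict[str, Any]]], dict[str, list[Any]]]
ModelCall = Callable[[dict], dict]


class FeatureMiddleware:
    """Accepts OIP-shaped requests holding entity keys, forwards feature tensors."""

    def __init__(self, lookup: FeatureLookup, infer: ModelCall, feature_names: list[str]):
        self.lookup = lookup
        self.infer = infer
        self.feature_names = feature_names

    def predict(self, entity_request: dict) -> dict:
        # 1. Read entity columns from the incoming OIP-style request.
        entity_columns = {t["name"]: t["data"] for t in entity_request["inputs"]}
        n_rows = len(next(iter(entity_columns.values())))
        entity_rows = [
            {name: data[i] for name, data in entity_columns.items()} for i in range(n_rows)
        ]
        # 2. Fetch features for those entities.
        features = self.lookup(entity_rows)
        # 3. Build the request the model server expects: actual feature values.
        model_request = {
            "inputs": [
                {
                    "name": name,
                    "shape": [n_rows],
                    "datatype": "FP64",
                    "data": [float(v) for v in features[name]],
                }
                for name in self.feature_names
            ]
        }
        # 4. Call the model and hand its OIP response back unchanged.
        return self.infer(model_request)


# In-memory stand-in for an online store, keyed by a hypothetical driver_id.
ONLINE = {
    1001: {"conv_rate": 0.5, "acc_rate": 0.25},
    1002: {"conv_rate": 0.75, "acc_rate": 0.125},
}


def fake_lookup(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    return {
        name: [ONLINE[row["driver_id"]][name] for row in rows]
        for name in ("conv_rate", "acc_rate")
    }


def fake_model(request: dict) -> dict:
    # Toy "model": the score is the sum of all input features per row.
    columns = [tensor["data"] for tensor in request["inputs"]]
    scores = [sum(values) for values in zip(*columns)]
    return {
        "model_name": "demo",
        "outputs": [
            {"name": "score", "shape": [len(scores)], "datatype": "FP64", "data": scores}
        ],
    }


middleware = FeatureMiddleware(fake_lookup, fake_model, ["conv_rate", "acc_rate"])
response = middleware.predict(
    {"inputs": [{"name": "driver_id", "shape": [2], "datatype": "INT64", "data": [1001, 1002]}]}
)
print(response["outputs"][0]["data"])
```

Injecting the two callables keeps the sketch independent of any particular model server, so the same class could sit in front of anything that speaks OIP.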

For the Python version, we can even base our own server on existing frameworks; MLServer has a very easy way of implementing custom OIP services without worrying too much about the actual server plumbing.

Another point to note here is that users will be able to make these services part of the kserve (or probably seldon as well) model mesh, but I don't know if that's such a good idea. Since the middleware will make the OIP call to the underlying model server itself instead of relying on the mesh to do it for us, the mesh won't be able to track that bit of communication.

from feast.

franciscojavierarceo commented on August 18, 2024

Let's discuss more in the doc. I think what you're calling out completely makes sense.

I think exposing an OIP interface makes sense, the only detail being that (as I outlined in my doc) sometimes users will want to retrieve just features or (in the future) retrieve features as inference is being generated (to be discussed more later, probably).

from feast.

franciscojavierarceo commented on August 18, 2024

MLServer/Seldon seems really promising, as it supports PyTorch, TensorFlow, and XGBoost...

from feast.

tokoko commented on August 18, 2024

One problem there right now is that they are on pydantic<2 (we are on pydantic>2), so we can't add it to the project yet. They are planning an upgrade in the next release.

from feast.

shuchu commented on August 18, 2024

https://www.kubeflow.org/docs/external-add-ons/feature-store/

from feast.

tokoko commented on August 18, 2024

@shuchu MLServer is an open source project (and kserve supports deploying it as well afaik). You should probably take a look at the proposal so far; I'm only advocating for using MLServer (Apache 2.0) and tritonclient (BSD-3) as utilities that would help us build a server exposing an OIP interface.

As to the integration with the model server, I agree. I think we shouldn't have an integration that's coupled with either one of them. We should have a vendor-neutral integration with an abstract OIP model server, which the users can deploy and manage with both kserve and seldon.

from feast.
