
Comments (8)

ukclivecox commented on July 21, 2024

Hi @ankushagarwal

  • See here for the definition. You can send a Tensor, an NDArray, or a custom string or binary payload; NDArray seems to make the most sense for your case.
  • For example, see some of the notebooks, such as those in the kubeflow-seldon example, where you send something like
payload = {"data":{"ndarray":["the","cat","sat","on","the","mat"]}}
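As a sketch, that payload can be built and serialized in Python before POSTing it to the deployment's REST endpoint (the endpoint path shown in the comment is an assumption; check your Seldon deployment):

```python
import json

# Build the seldon-core NDArray payload for a tokenized sentence.
payload = {"data": {"ndarray": ["the", "cat", "sat", "on", "the", "mat"]}}
body = json.dumps(payload)

# With the `requests` library installed, the call would look roughly like:
#   requests.post("http://<seldon-host>/predict", json=payload)
print(body)
```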

from examples.

jlewi commented on July 21, 2024

/assign @ankushagarwal

Ankush, can you describe the problems you ran into converting the model into one that can be served with TF Serving?

Would it be easier to serve the model using Seldon?

ankushagarwal commented on July 21, 2024

The model used for issue summarization is very different from the examples that we've been using. For our image models, the model prediction looks something like this: output = model(input)

But for the issue summarization model, it looks something like this:

def generate_summary(input):
    output = '<START>'
    intermediate_result = encoder_model(input)
    while True:
        intermediate_result, next_char = decoder_model(intermediate_result, output)
        if next_char == '<STOP>':
            return output
        output += next_char
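To make that loop concrete, here is a runnable version with stub functions standing in for the real Keras encoder/decoder models (the stubs are assumptions for illustration only; the real models would return hidden states and predicted characters):

```python
def encoder_model(tokens):
    # Stub: the real encoder returns a hidden state for the input sequence.
    # Here we just return the sequence length.
    return len(tokens)

def decoder_model(state, generated):
    # Stub: the real decoder returns (new_state, next_char).
    # Here we emit one character per unit of "state", then stop.
    if len(generated) - len('<START>') >= state:
        return state, '<STOP>'
    return state, 'x'

def summarize(tokens):
    # The greedy decode loop from the comment above, wrapped in a function.
    output = '<START>'
    state = encoder_model(tokens)
    while True:
        state, next_char = decoder_model(state, output)
        if next_char == '<STOP>':
            return output
        output += next_char

print(summarize(['a', 'b', 'c']))  # → <START>xxx
```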

The first issue I had was exporting Keras models as TensorFlow models that can be used by TF Serving; this is mostly done.

The second challenge is understanding how TF Serving works with:

  1. multiple models (encoder_model and decoder_model)
  2. models with multiple inputs and multiple outputs (decoder_model)
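For point 1, TF Serving can load several models in one server via a model config file passed with the `--model_config_file` flag; each model is then addressed by name in the prediction request. A sketch, with assumed paths:

```
model_config_list {
  config {
    name: "encoder_model"
    base_path: "/models/encoder_model"
    model_platform: "tensorflow"
  }
  config {
    name: "decoder_model"
    base_path: "/models/decoder_model"
    model_platform: "tensorflow"
  }
}
```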

Would it be easier to serve the model using Seldon?
I am not familiar enough with Seldon...

ankushagarwal commented on July 21, 2024

I am having trouble importing the model exported from Keras into TF Serving.

I get this error when I send a prediction request to the TF Serving server:

AbortionError: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="Expected multiples argument to be a vector of length 3 but got length 2
[[Node: Encoder-Last-GRU_1/Tile = Tile[T=DT_FLOAT, Tmultiples=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Encoder-Last-GRU_1/ExpandDims, Encoder-Last-GRU_1/Tile/multiples)]]")

Could not find a workaround for this. Will give Seldon or Tornado a shot to serve this Keras model.

We can probably illustrate serving a model with TF Serving in another example that trains a TensorFlow model directly.

jlewi commented on July 21, 2024

@cliveseldon @gsunner Do you think we should try to use Seldon here?

Could we use the existing Seldon model server rather than creating our own Tornado stub?

Should we deploy the model using Seldon Core rather than deploying it directly with K8s resources?

ukclivecox commented on July 21, 2024

@jlewi For a sklearn model, seldon-core would seem to be a good choice.

@ankushagarwal You describe a seq-to-seq model, but does the external business app send the whole sequence of characters in a single request and get a sequence back? If so, that should fit fine into the seldon-core prediction payload using NDArray. Your prediction component would need to split the request and then proceed as in your pseudo-code above.
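A minimal sketch of such a component, assuming the seldon-core python wrapper's `predict(X, feature_names)` contract (the encode/decode methods here are stubs, not the real Keras models):

```python
class IssueSummarizer:
    # Sketch of a seldon-core python-wrapper component: the wrapper calls
    # predict(X, feature_names) with the NDArray payload as X.
    def predict(self, X, feature_names):
        tokens = list(X)             # split the incoming token array
        state = self.encode(tokens)  # stub for encoder_model
        return [self.decode(state)]  # stub for the decoder loop

    def encode(self, tokens):
        # Stub: the real encoder would return a hidden state.
        return len(tokens)

    def decode(self, state):
        # Stub: the real decoder would generate characters until <STOP>.
        return '<START>' + 'x' * state

print(IssueSummarizer().predict(["the", "cat", "sat"], None))
```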

Suggest you look at https://github.com/kubeflow/example-seldon which contains a sklearn model in the example code.

@jlewi Not sure I follow your last two questions. It would seem preferable to use the most appropriate existing serving solution (TF Serving or seldon-core) rather than starting to build a new one.

ankushagarwal commented on July 21, 2024

Hi @cliveseldon I have followed the instructions at https://github.com/SeldonIO/seldon-core/blob/master/docs/wrappers/python.md and wrapped my model into a docker image. I am able to run the image locally and it is serving a REST API server at port 5000.

My question is: what is the API for sending a prediction request to the server? I could not find docs on that.
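For reference, older seldon-core python wrapper docs describe the REST endpoint as POST /predict with a form-encoded field named json carrying the payload; this sketch builds that request body without sending it, since the exact API shape may differ by wrapper version:

```python
import json
import urllib.parse

# Payload in the seldon-core NDArray format.
payload = {"data": {"ndarray": ["the", "cat", "sat"]}}

# Form-encode it as the `json` field the wrapper's /predict endpoint reads.
body = urllib.parse.urlencode({"json": json.dumps(payload)})

# POST `body` to http://localhost:5000/predict with
# Content-Type: application/x-www-form-urlencoded.
print(body)
```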

ankushagarwal commented on July 21, 2024

Closing since we have a seldon model server.
