Missing request params, while using ensemble model

DaniilKurlovich commented on September 25, 2024

Missing request params, while using ensemble model


Comments (2)

indrajit96 commented on September 25, 2024

Hi @DaniilKurlovich,
It looks like something is missing in your setup.
I followed your exact reproduction steps and could not reproduce the problem. Your client has a few minor issues; a working client is attached below.

My logs

I0325 23:39:50.100145 1 model.py:11] Params: {"top_k":10}
I0325 23:39:50.154021 1 model.py:10] Params: {"top_k":5}
I0325 23:39:50.157574 1 model.py:11] Params: {"top_k":7}
I0325 23:39:50.158148 1 model.py:10] Params: {"top_k":7}
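For reference, the log lines above print the request parameters as a JSON object. The `model.py` that produced them is not shown in this thread, so the following is only a minimal sketch of how a model could read `top_k` from such a string. It assumes the parameters arrive as a JSON string (which, as far as I know, is what `request.parameters()` returns in the Python backend); the helper name `read_top_k` and the default value are hypothetical.

```python
import json


def read_top_k(parameters_json, default=1):
    """Return top_k from a JSON parameters string, or a default if absent.

    Hypothetical helper: assumes the request parameters are delivered as a
    JSON string such as '{"top_k":7}', matching the log lines above.
    """
    # An empty or missing string means no parameters were sent.
    params = json.loads(parameters_json) if parameters_json else {}
    return int(params.get("top_k", default))


if __name__ == "__main__":
    print(read_top_k('{"top_k":7}'))
```

Logging the raw string next to the parsed value, as in the logs above, makes it easy to see whether a parameter was dropped somewhere between the client and the model.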

My suggestions:

Use the latest Triton server version, 24.02
Modify your client

Client Code:

import asyncio

import numpy as np
import tritonclient.http.aio as tritonclient
from tritonclient.utils import np_to_triton_dtype

async def main():
    texts = ['Text1', 'Text2', 'Text3', 'Text4']

    # String inputs are sent as a NumPy object array of UTF-8 encoded bytes.
    preprocessed_data = np.array([e.encode('utf-8') for e in texts], dtype=np.object_)

    client = tritonclient.InferenceServerClient(url='localhost:8090')
    try:
        infer_input = tritonclient.InferInput('INPUT_NAMES',
                                              preprocessed_data.shape,
                                              np_to_triton_dtype(preprocessed_data.dtype))
        infer_input.set_data_from_numpy(preprocessed_data)

        # Call the first model directly, passing top_k as a request parameter.
        resp = await client.infer('stomp_1', [infer_input], outputs=[tritonclient.InferRequestedOutput('EXIT')],
                                  parameters={'top_k': 10})
        rand_matrix = resp.as_numpy('EXIT')
        print(rand_matrix)

        # Feed the first model's output into the second model.
        infer_input_2 = tritonclient.InferInput('INPUT', rand_matrix.shape,
                                                np_to_triton_dtype(rand_matrix.dtype))
        infer_input_2.set_data_from_numpy(rand_matrix)

        resp2 = await client.infer('stomp_2', [infer_input_2], outputs=[tritonclient.InferRequestedOutput('EXIT_')],
                                   parameters={'top_k': 5})
        print(resp2.as_numpy('EXIT_'))

        # Call the ensemble with the original input and its own top_k value.
        resp = await client.infer('stomp_ensemble', [infer_input], outputs=[tritonclient.InferRequestedOutput('ANSWER')],
                                  parameters={'top_k': 7})
        print(resp.as_numpy('ANSWER'))

    finally:
        # Always release the underlying HTTP session.
        await client.close()


if __name__ == '__main__':
    asyncio.run(main())
