
The most flexible way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Inference Graph/Pipelines, Compound AI systems, Multi-Modal, RAG as a Service, and more!

Home Page: https://bentoml.com

License: Apache License 2.0

Python 95.39% Shell 2.16% Makefile 0.11% Jinja 0.47% JavaScript 0.06% Dockerfile 0.30% CSS 0.05% Starlark 0.73% C++ 0.07% Go 0.04% Java 0.11% Kotlin 0.04% PHP 0.04% Swift 0.15% Rust 0.13% HTML 0.15%
model-serving model-deployment model-management ml-platform ai machine-learning mlops bentoml kubernetes deep-learning

bentoml's Introduction

BentoML: The Unified Model Serving Framework

BentoML is an open-source model serving library for building performant and scalable AI applications with Python. It comes with everything you need for serving optimization, model packaging, and production deployment.

πŸ‘‰ Join our Slack community!

Highlights

🍱 Bento is the container for AI apps

  • An open standard and SDK for AI apps: pack your code, inference pipelines, model files, dependencies, and runtime configurations into a Bento.
  • Auto-generate API servers, supporting REST API, gRPC, and long-running inference jobs.
  • Auto-generate Docker container images.

πŸ„ Freedom to build with any AI models

πŸ€–οΈ Inference optimization for AI applications

  • Integrate with high-performance runtimes such as ONNX Runtime and TorchScript to boost response time and throughput.
  • Support parallel processing of model inferences for improved speed and efficiency.
  • Implement adaptive batching, grouping concurrent requests server-side, to optimize throughput (see the sketch after this list).
  • Built-in optimization for specific model architectures (like OpenLLM for LLMs).
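
A minimal sketch of adaptive batching, assuming the batchable flag on bentoml.api (BentoML >= 1.2); the encoder model here is a placeholder for whatever you load in __init__:

import bentoml
import numpy as np


@bentoml.service
class Encoder:
    def __init__(self) -> None:
        self.model = ...  # placeholder: load your embedding model here

    # batchable=True lets BentoML group concurrent requests into one call;
    # the method receives a batch and must return results in the same order.
    @bentoml.api(batchable=True)
    def encode(self, texts: list[str]) -> np.ndarray:
        return self.model.encode(texts)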

🍭 Simplify modern AI application architecture

  • Python-first! Effortlessly scale complex AI workloads.
  • Enable GPU inference without the headache.
  • Compose multiple models to run concurrently or sequentially, over multiple GPUs or on a Kubernetes cluster (see the composition sketch below).
  • Natively integrates with MLflow, LangChain, Kubeflow, Triton, Spark, Ray, and many more to complete your production AI stack.
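
As a sketch of composing services, assuming the bentoml.depends API (BentoML >= 1.2); the two services here are illustrative:

import bentoml


@bentoml.service
class Preprocessor:
    @bentoml.api
    def clean(self, text: str) -> str:
        return text.strip().lower()


@bentoml.service
class Pipeline:
    # Declare a dependency; BentoML wires the call whether the dependent
    # service runs in-process or as a separately scaled deployment.
    preprocessor = bentoml.depends(Preprocessor)

    @bentoml.api
    async def run(self, text: str) -> str:
        # to_async exposes the dependency's endpoints as awaitable calls.
        return await self.preprocessor.to_async.clean(text)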

πŸš€ Deploy anywhere

  • One-click deployment to ☁️ BentoCloud, the Serverless platform made for hosting and operating AI apps.
  • Scalable BentoML deployment with πŸ¦„οΈ Yatai on Kubernetes.
  • Deploy auto-generated container images anywhere Docker runs.

Documentation

πŸ› οΈ What you can build with BentoML

  • OpenLLM - Run any open-source LLMs, such as Llama 2 and Mistral, as OpenAI-compatible API endpoints, locally and in the cloud.
  • BentoXTTS - Convert text to speech based on your custom audio data.
  • BentoSDXLTurbo - Create an image generation application and run inference with a single step.
  • BentoSD2Upscaler - Build an image generation application with upscaling capability.
  • BentoControlNet - Influence image composition, adjust specific elements, and ensure spatial consistency by integrating ControlNet with your image generation process.
  • BentoWhisperX - Convert spoken words into text for AI scenarios like virtual assistants, voice-controlled devices, and automated transcription services.
  • Sentence Transformer - Transform text into numerical vectors for a variety of natural language processing (NLP) tasks.
  • BentoCLIP - Build a CLIP (Contrastive Language-Image Pre-training) application for tasks like zero-shot learning, image classification, and image-text matching.
  • BentoBLIP - Leverage BLIP (Bootstrapping Language Image Pre-training) to improve the way AI models understand and process the relationship between images and textual descriptions.
  • BentoLCM - Deploy a REST API server for Stable Diffusion with Latent Consistency LoRAs.
  • BentoSVD - Create a video generation application powered by Stable Video Diffusion (SVD).
  • BentoVLLM - Accelerate your model inference and improve serving throughput by using vLLM as your LLM backend.
  • BentoBark - Generate highly realistic audio like music, background noise and simple sound effects with Bark.
  • BentoYolo - Build an object detection inference API server with YOLO.
  • RAG - Self-host a RAG web service with BentoML step by step, including an embedding model, a large language model, and a vector database.

Getting started

This example demonstrates how to serve and deploy a simple text summarization application.

Serving a model locally

Install dependencies:

pip install torch transformers "bentoml>=1.2.0a0"

Define the serving logic of your model in a service.py file.

from __future__ import annotations
import bentoml
from transformers import pipeline


@bentoml.service(
    resources={"cpu": "2"},
    traffic={"timeout": 10},
)
class Summarization:
    def __init__(self) -> None:
        # Load model into pipeline
        self.pipeline = pipeline('summarization')

    @bentoml.api
    def summarize(self, text: str) -> str:
        result = self.pipeline(text)
        return result[0]['summary_text']

Run this BentoML Service locally; it will be accessible at http://localhost:3000.

bentoml serve service:Summarization

Send a request to summarize a short news paragraph:

curl -X 'POST' \
  'http://localhost:3000/summarize' \
  -H 'accept: text/plain' \
  -H 'Content-Type: application/json' \
  -d '{
  "text": "Breaking News: In an astonishing turn of events, the small town of Willow Creek has been taken by storm as local resident Jerry Thompson'\''s cat, Whiskers, performed what witnesses are calling a '\''miraculous and gravity-defying leap.'\'' Eyewitnesses report that Whiskers, an otherwise unremarkable tabby cat, jumped a record-breaking 20 feet into the air to catch a fly. The event, which took place in Thompson'\''s backyard, is now being investigated by scientists for potential breaches in the laws of physics. Local authorities are considering a town festival to celebrate what is being hailed as '\''The Leap of the Century."
}'

Deployment

After your Service is ready, you can deploy it to BentoCloud or as a Docker image.

First, create a bentofile.yaml file for building a Bento.

service: "service:Summarization"
labels:
  owner: bentoml-team
  project: gallery
include:
  - "*.py"
python:
  packages:
  - torch
  - transformers

Then, choose one of the following ways for deployment:

BentoCloud

Make sure you have logged in to BentoCloud and then run the following command:

bentoml deploy .

Docker

Build a Bento to package necessary dependencies and components into a standard distribution format.

bentoml build

Containerize the Bento.

bentoml containerize summarization:latest

Run this image with Docker.

docker run --rm -p 3000:3000 summarization:latest

For detailed explanations, read Quickstart.


Community

BentoML supports billions of model runs per day and is used by thousands of organizations around the globe.

Join our Community Slack πŸ’¬, where thousands of AI application developers contribute to the project and help each other.

To report a bug or suggest a feature request, use GitHub Issues.

Contributing

There are many ways to contribute to the project:

  • Report bugs and "Thumbs up" on issues that are relevant to you.
  • Investigate issues and review other developers' pull requests.
  • Contribute code or documentation to the project by submitting a GitHub pull request.
  • Check out the Contributing Guide and Development Guide to learn more.
  • Share your feedback and discuss roadmap plans in the #bentoml-contributors channel here.

Thanks to all of our amazing contributors!


Usage Reporting

BentoML collects usage data that helps our team improve the product. Only BentoML's internal API calls are reported. We strip out as much potentially sensitive information as possible, and we never collect user code, model data, model names, or stack traces. Here's the code for usage tracking. You can opt out of usage tracking with the --do-not-track CLI option:

bentoml [command] --do-not-track

Or by setting the environment variable BENTOML_DO_NOT_TRACK=True:

export BENTOML_DO_NOT_TRACK=True

License

Apache License 2.0

Citation

If you use BentoML in your research, please cite using the following citation:

@software{Yang_BentoML_The_framework,
author = {Yang, Chaoyu and Sheng, Sean and Pham, Aaron and Zhao, Shenyang and Lee, Sauyon and Jiang, Bo and Dong, Fog and Guan, Xipeng and Ming, Frost},
license = {Apache-2.0},
title = {{BentoML: The framework for building reliable, scalable and cost-efficient AI application}},
url = {https://github.com/bentoml/bentoml}
}

bentoml's People

Contributors

1e0ng, aarnphm, akainth015, bojiang, dependabot[bot], fogdong, frostming, haivilo, jackyzha0, jianshen92, jinyang1994, jjmachan, judahrand, kimsoungryoul, korusuke, larme, lintingzhen, mayurnewase, mingliangdai, parano, pre-commit-ci[bot], qu8n, sauyon, sherlock113, ssheng, tashakim, timliubentoml, xianml, yetone, yubozhao

bentoml's Issues

Add support for H2O.ai models

The H2O.ai implementations of linear models, tree-based models and monotonically constrained models are really good. Moreover, H2O.ai supports AutoML as well.

Since BentoML has support for Scikit-Learn, Keras, TensorFlow and so on, I thought it might be a good idea to add H2O.ai models to this list.

Nevertheless, you guys are doing a pretty awesome job, and I am preparing a demo notebook to get myself accustomed to the workflow of BentoML. Will share once completed.
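
A purely hypothetical sketch of what H2O.ai support could look like, mirroring the existing artifact pattern; H2oModelArtifact and the service below are illustrative, not an existing API:

import h2o
from bentoml import api, artifacts, BentoService
from bentoml.artifact import H2oModelArtifact  # hypothetical artifact type
from bentoml.handlers import DataframeHandler


@artifacts([H2oModelArtifact('model')])
class LoanDefaultPredictor(BentoService):

    @api(DataframeHandler)
    def predict(self, df):
        # Convert the request dataframe to an H2OFrame and predict.
        hf = h2o.H2OFrame(df)
        return self.artifacts.model.predict(hf).as_data_frame()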

Add more info logs to BentoService#save

  • The find-module call is slow, and not printing anything is pretty bad UX.
  • It's better to print out what the library is doing under the hood for the user, e.g. what python source code files are being copied, what artifacts are being saved etc.

Assigning to BentoService field

Is your feature request related to a problem? Please describe.
I need to create a service that is able to handle new objects/fields added during its lifetime. One of my methods gets called with data that should then be added to make querying it possible.

Describe alternatives you've considered
I tried assigning to a field in the method that does the setup. This doesn't work; afterwards, other methods don't see that the field got assigned.
I also tried using artifacts: I packed the needed fields with Nones and then tried changing them. This also doesn't work; the corresponding fields seem unchanged.

Additional context
I'm trying to build a service that wraps a kNN engine around a model that does feature extraction (say we want to search text/images and we have an appropriate neural network that does feature extraction; we then want to perform kNN on its output representation). The service is supposed to be built with a model. It can then receive appropriate data and will build a kNN model that enables searching it.
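
A hypothetical sketch of the pattern described above, in the old BentoService API; SearchService and build_knn_index are illustrative names, and the field assignment on self is exactly what does not persist across calls:

from bentoml import api, BentoService
from bentoml.handlers import JsonHandler


class SearchService(BentoService):

    @api(JsonHandler)
    def add_documents(self, parsed_json):
        # Assign the built index to a field during the service's lifetime...
        self.index = build_knn_index(parsed_json['docs'])  # hypothetical helper
        return {'status': 'ok'}

    @api(JsonHandler)
    def query(self, parsed_json):
        # ...but other methods do not see the assignment afterwards.
        return self.index.search(parsed_json['query'])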

error in iris classifier example

Describe the bug
A clear and concise description of what the bug is.

To Reproduce
Steps to reproduce the behavior:

  1. start jupyter notebook locally on Mac
  2. follow the example here and it will fail, as shown in the screenshot

Expected behavior
it shouldn't error out

Screenshots/Logs
If applicable, add screenshots, logs or error outputs to help explain your problem.
[screenshot]

Environment:

  • Python/BentoML Version [e.g. Python 2.7, BentoML-0.2.0]

Additional context
Add any other context about the problem here.

GPU support

  • GPU docker image base for use in API server (see the sketch after this list)
  • GPU support in CLI mode
  • GPU support when used as python package
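
As a sketch of the docker-image piece, assuming a CUDA-enabled base image and Docker 19.03+ with the NVIDIA container toolkit installed; the image name is illustrative:

docker run --gpus all -p 5000:5000 my-bento-image:latest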

Add deployment guide on Heroku

Hi @parano, @yubozhao. I tried to set up a deployment pipeline for a simple API endpoint (created using BentoML) on Heroku. I am assuming that the bento-archive that gets generated by Bento is the main directory required in order to proceed with the deployment. Here's what that directory looks like:

[screenshot]

Also, for your convenience, here's the GitHub repo with the files of the archive uploaded to it. I modified the requirements.txt accordingly, as Heroku needs that in order to manage the dependencies. I thought it might be a good idea to suggest including a guide for Heroku deployments as well, since Heroku is very popular among API developers. Unfortunately, the build keeps failing on Heroku (note that the module class has been uploaded to PyPI as well; here's the link). Here's the build log:

-----> Python app detected
-----> Installing python-3.6.8
-----> Installing pip
-----> Installing SQLite3
-----> Installing requirements with pip
       Obtaining file:///tmp/build_0cb7bbf6d47f1bcdd63e482cc791f0f9 (from -r /tmp/build_0cb7bbf6d47f1bcdd63e482cc791f0f9/requirements.txt (line 1))
       Collecting None (from PassengerClassifier==0.0.1->-r /tmp/build_0cb7bbf6d47f1bcdd63e482cc791f0f9/requirements.txt (line 1))
         Could not find a version that satisfies the requirement None (from PassengerClassifier==0.0.1->-r /tmp/build_0cb7bbf6d47f1bcdd63e482cc791f0f9/requirements.txt (line 1)) (from versions: )
       No matching distribution found for None (from PassengerClassifier==0.0.1->-r /tmp/build_0cb7bbf6d47f1bcdd63e482cc791f0f9/requirements.txt (line 1))
 !     Push rejected, failed to compile Python app.
 !     Push failed

It might help further if you consider including a guide on Heroku deployments; it might give people a direction. Looking forward to hearing from you.

Add BentoService debug/development mode and workflow examples

Allow running a REST server or CLI command with

  1. a python module with a custom BentoService definition
  2. a directory containing saved artifacts required for the BentoService

This way users don't need to go through .pack and .save every time they change something in their API function or preprocessing code. (A hypothetical CLI shape is sketched below.)
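
A purely hypothetical CLI shape for this workflow, not an existing command; the subcommand and flag names are illustrative:

bentoml serve-dev my_service.py --artifacts-dir ./saved_artifacts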

Cannot use list in ImageHandler's input_name

Describe the bug
I need to create a model that compares two images. In the ImageHandler documentation there is mention of an input_name parameter which is supposedly of string[] type, but when a list argument is passed, it fails.

To Reproduce


@env(conda_pip_dependencies=['scikit-image'])
@ver(major=1, minor=0)
class ImageSimilarityScorer(BentoService):

    @api(ImageHandler, input_name=['original', 'compared'], accept_multiple_files=True)
    def predict(self, original, compared):
        return original[0, 0] == compared[0, 0]

BTW, I also tried passing input_name as @api's keyword argument. It also fails when I use a tuple instead of a list.

Screenshots/Logs
When running bentoml serve on my model I get:

Traceback (most recent call last):
  File "/home/kuba/Projects/venvs/ml_server/lib/python3.6/site-packages/flask/app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/kuba/Projects/venvs/ml_server/lib/python3.6/site-packages/flask/app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/kuba/Projects/venvs/ml_server/lib/python3.6/site-packages/flask/app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/kuba/Projects/venvs/ml_server/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/kuba/Projects/venvs/ml_server/lib/python3.6/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/kuba/Projects/venvs/ml_server/lib/python3.6/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/kuba/Projects/venvs/ml_server/lib/python3.6/site-packages/bentoml/server/bento_api_server.py", line 152, in docs_view_func
    docs = get_docs(bento_service)
  File "/home/kuba/Projects/venvs/ml_server/lib/python3.6/site-packages/bentoml/server/bento_api_server.py", line 142, in get_docs
    requestBody=OrderedDict(required=True, content=api.request_schema),
  File "/home/kuba/Projects/venvs/ml_server/lib/python3.6/site-packages/bentoml/service.py", line 97, in request_schema
    return self.handler.request_schema
  File "/home/kuba/Projects/venvs/ml_server/lib/python3.6/site-packages/bentoml/handlers/image_handler.py", line 91, in request_schema
    self.input_name: {"type": "string", "format": "binary"}
TypeError: unhashable type: 'list'

Environment:

  • Python 3.6.8, BentoML-0.3.4

Handling image requests

I am referring to this tutorial on FashionMNIST classification with tf-keras.

I think the prediction API, as currently coded, can only handle raw image arrays. I was trying to serve it as a REST API endpoint, running with bentoml serve. However, I wrote the prediction in a way that it could handle images from the API payload directly. Here's the snippet:

%%writefile tf_keras_fashion_mnist.py

import bentoml
from bentoml import api, artifacts, env, BentoService
from bentoml.artifact import TfKerasModelArtifact
from bentoml.handlers import TensorflowTensorHandler
from io import BytesIO
from scipy.misc import imresize, imread

@bentoml.env(conda_dependencies=['tensorflow', 'numpy'])
@bentoml.artifacts([TfKerasModelArtifact('classifier')])
class TfKerasFashionMnistModel(bentoml.BentoService):
    
    @bentoml.api(TensorflowTensorHandler)
    def predict(self, request):
        class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
        
        input_file = request.files.get('file')
        if not input_file:
            return BadRequest("File is not present in the request")
        if input_file.filename == '':
            return BadRequest("Filename is not present in the request")
        if not input_file.filename.lower().endswith(('.jpg', '.jpeg', '.png')):
            return BadRequest("Invalid file type")

        input_buffer = BytesIO()
        input_file.save(input_buffer)
        image_array = imread(input_buffer, mode='L')
        
        image_array_re = imresize(image_array, (28, 28))
        image_array_re = image_array_re.reshape(1, 28, 28, 1)
        preds = model.predict(image_array_re)
        category_int = np.argmax(preds, axis=1)
        return class_names[int(category_int)]

However, when I hit it via Postman I get the following error (with full trace):

Traceback (most recent call last):
  File "/miniconda3/lib/python3.7/site-packages/flask/app.py", line 2292, in wsgi_app
    response = self.full_dispatch_request()
  File "/miniconda3/lib/python3.7/site-packages/flask/app.py", line 1815, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/miniconda3/lib/python3.7/site-packages/flask/app.py", line 1718, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/miniconda3/lib/python3.7/site-packages/flask/_compat.py", line 35, in reraise
    raise value
  File "/miniconda3/lib/python3.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/miniconda3/lib/python3.7/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/miniconda3/lib/python3.7/site-packages/bentoml/server/bento_api_server.py", line 116, in wrapper
    response = api.handle_request(request)
  File "/miniconda3/lib/python3.7/site-packages/bentoml/service.py", line 91, in handle_request
    return self.handler.handle_request(request, self.func)
  File "/miniconda3/lib/python3.7/site-packages/bentoml/handlers/tensorflow_tensor_handler.py", line 30, in handle_request
    raise NotImplementedError
NotImplementedError

Any help in this regard would be great. Here's my notebook for your perusal.

Add Tensorflow SavedModel Artifact

Currently there is only an artifact type for tf.keras models; this is for adding a new artifact type to work with the TensorFlow SavedModel API.

Problem accessing gunicorn server from kubernets ingress

Describe the bug
I created a deployment, service, and ingress following the example in BentoML/examples/deploy-with-kubernetes/
(by the way, the service-with-ingress.yaml contains an error: the service type is ClusterPort but should be ClusterIP)

The problem is that connections to the ingress are refused.

Even when trying to make requests directly to the pod on port 5000, they get refused.

If I do a kubectl port-forward it works from localhost

I suppose that the problem is with the gunicorn bindings

this is the gunicorn log:
[2019-05-08 14:46:18 +0000] [1] [INFO] Starting gunicorn 19.9.0
[2019-05-08 14:46:18 +0000] [1] [INFO] Listening at: http://127.0.0.1:5000 (1)

and in the BentoML/bentoml/server/gunicorn_server.py file I see this:

def __init__(self, app, port, workers):
    self.options = {'workers': workers, 'bind': '%s:%s' % ('127.0.0.1', port)}
    self.application = app
    super(GunicornApplication, self).__init__()

with 127.0.0.1 hardcoded in the code.

But I'm not completely sure that changing it to 0.0.0.0 would solve the problem.
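
A minimal sketch of the suspected fix, assuming the bind address is the only issue; binding to 0.0.0.0 makes gunicorn listen on all interfaces, so the pod becomes reachable through the Service and ingress:

def __init__(self, app, port, workers):
    # Bind to all interfaces instead of loopback only.
    self.options = {'workers': workers, 'bind': '%s:%s' % ('0.0.0.0', port)}
    self.application = app
    super(GunicornApplication, self).__init__()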

AWS Lambda Deployment: error creating bentoml home dir and config file

Thanks @Mu7ammad for reporting this bug here: #235, creating a new issue for tracking:

Follow up: So I tried version 0.3.4 with same example and environment, and it indeed seems to package the dependencies this time, but then when I run the lambda function I get internal server error, and from the logs it says:

"Error creating bentoml home dir '/home/sbx_user1051/bentoml': No such file or directory"

It's thrown from executing bentoml/config/__init__.py, line 61, with the exception: "Error creating bentoml home dir"

It seems it can't find the home directory in the sandbox, so what do you think is the reason for that?

Thanks

[screenshot]

This is because AWS Lambda limits disk access to the '/tmp' directory: https://docs.aws.amazon.com/lambda/latest/dg/limits.html

Proposed solution: set bentoml home dir to '/tmp/bentoml' in AWS lambda deployment
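
A minimal sketch of that proposed solution, assuming BentoML reads its home directory from the BENTOML_HOME environment variable; set it before bentoml is imported in the Lambda handler:

import os

# /tmp is the only writable path in the Lambda sandbox.
os.environ['BENTOML_HOME'] = '/tmp/bentoml'

import bentoml  # imported after the env var so the config picks it up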

Composing handlers

I'd like to handle multiple content types at once.
For example, while uploading an image I'd like to also be able to provide some metadata, e.g. in JSON format.

For now this is impossible, as handlers use custom logic inside and it's impossible to combine them.

Alternatives
Subclassing specific handlers every time there is a need for specific behavior. This is pretty cumbersome and bug-prone.
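
A purely hypothetical shape for the requested feature, for illustration only; MultipartHandler does not exist in BentoML:

# Hypothetical: one handler accepting an image part plus JSON metadata.
@api(MultipartHandler(image=ImageHandler(), metadata=JsonHandler()))
def predict(self, image, metadata):
    return self.score(image, metadata)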

Error in Serverless deployment with AWS Lambda

Describe the bug
While running:
!bentoml deploy ./model --platform aws-lambda --region us-west-2

this error appears:
[2019-08-27 17:25:40,854] INFO - Using user AWS region: us-west-2
[2019-08-27 17:25:40,855] INFO - Using AWS stage: dev
Encounter error when deploying to aws-lambda
Error: 'Service Information' is not in list

To Reproduce
sudo npm install [email protected] --global (tried [email protected]; it didn't work for the example below either.)

  1. Go to BentoML/examples/deploy-with-serverless/deploy-with-serverless.ipynb
  2. Run all the code cells until !bentoml deploy ./model --platform aws-lambda --region us-west-2
  3. See error

Expected behavior
Successful deployment

Environment:

  • MacOS 10.13.6
  • serverless 1.49.0
  • Python 3.7.3
  • BentoML 0.3.4
  • ipython 7.6.1

Failing to create an API on Windows

Describe the bug
Failing to create an API on Windows when running:
bentoml serve {saved_path}

To Reproduce
Steps to reproduce the behavior:

  1. Fails with an error that fcntl was not found.
  2. Fixed it by creating a fcntl.py stub with the following no-op functions:
def fcntl(fd, op, arg=0):
    return 0

def ioctl(fd, op, arg=0, mutable_flag=True):
    if mutable_flag:
        return 0
    else:
        return ""

def flock(fd, op):
    return

def lockf(fd, operation, length=0, start=0, whence=0):
    return
  3. Then the import of pwd started failing, for which I don't have any workaround so far.

Environment:

  • OS: Windows 10
  • Python/BentoML Version: Python 3.6

Additional context
Add any other context about the problem here.

AWS Lambda Deployment: Internal Server Error: No module named BentoML

Describe the bug
Running through the Scikit-learn demo "IrisClassifier" and deploying to AWS Lambda. The deployment is a success and it shows in the Functions dashboard in AWS. But using Postman to test it returns an internal server error, and using the online test events in AWS Lambda I can see the error message: "Unable to import module 'handler': No module named 'bentoml'".

To Reproduce
Installation of BentoML 0.3.1 / Serverless 1.48.4 is done on macOS Mojave with Python 3.7.
Running the IrisClassifier demo in the quick-start-guide.

Expected behavior
Running the quick-start-guide successfully

Screenshots/Logs
[Screenshot 2019-07-30 6:28:42 AM]

Environment:

  • OS: MacOS 10.14.
  • Python/BentoML Version [Python 3.7.3, BentoML-0.3.1]

Thanks!

TypeError: 'TfKerasModelArtifact' object is not iterable

Hi there. Here's my packaging code:

%%writefile text_classification_service.py
import pandas as pd
from tensorflow import keras
from sklearn.preprocessing import LabelEncoder
from sklearn.feature_extraction.text import CountVectorizer
from string import digits
from bentoml import api, env, BentoService, artifacts
from bentoml.artifact import TfKerasModelArtifact
from bentoml.handlers import JsonHandler

@artifacts([TfKerasModelArtifact('model')])
@env(conda_dependencies=['tensorflow', 'pandas', 'scikit-learn'])
class TextClassificationService(BentoService):
    
    def vectorizer():
        vectorizer = CountVectorizer(stop_words=None, lowercase=True,
                             ngram_range=(1, 1), min_df=2, binary=True)
        
        train = pd.read_csv('https://raw.githubusercontent.com/Nilabhra/kolkata_nlp_workshop_2019/master/data/train.csv')
        vectorizer.fit_transform(train['text'])
        return vectorizer
    
    def remove_digits(s):
        remove_digits = str.maketrans('', '', digits)
        res = s.translate(remove_digits)
        return res
    
    @api(JsonHandler)
    def predict(self, parsed_json):
        text = parsed_json['text']
        text = remove_digits(text)
        vectorizer = vectorizer()
        text = vectorizer.transform(text)
        prediction =  self.artifacts.model.predict_classes(text)
        response = {'Sentiment': prediction}
        return response

And when I build the archive using:

from text_classification_service import TextClassificationService

svc = TextClassificationService.pack(model=model)
saved_path = svc.save('/tmp/bento')
print(saved_path)

It gives me the following error (with trace):

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-28-39d1075ec116> in <module>
      1 from text_classification_service import TextClassificationService
      2 
----> 3 svc = TextClassificationService.pack(model=model)
      4 saved_path = svc.save('/tmp/bento')
      5 print(saved_path)

/miniconda3/lib/python3.7/site-packages/bentoml/service.py in pack(cls, *args, **kwargs)
    320         artifacts = ArtifactCollection()
    321 
--> 322         for artifact_spec in cls._artifacts_spec:
    323             if artifact_spec.name in kwargs:
    324                 artifact_instance = artifact_spec.pack(kwargs[artifact_spec.name])

TypeError: 'TfKerasModelArtifact' object is not iterable

Here is the code for model building, compilation and fitting:

model = keras.Sequential()

model.add(Dropout(rate=0.2, input_shape=features.shape[1:]))
for _ in range(2):
        model.add(Dense(units=64, activation='relu'))
        model.add(Dropout(rate=0.2))
model.add(Dense(units=1, activation='sigmoid'))

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['acc'])

es_cb = keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)

model.fit(features,
                    labels,
                    epochs=15,
                    batch_size=512,
                    validation_data=(test_features, test_labels),
                    callbacks=[es_cb],
                    verbose=1)

Is there any artifact I am missing here?
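
A hedged guess at the cause, assuming the file actually executed differs from the listing above: pack() iterates cls._artifacts_spec, so @artifacts must receive an iterable of artifact specs. A minimal sketch of the failing versus working form:

from bentoml import artifacts, BentoService
from bentoml.artifact import TfKerasModelArtifact

# Fails in pack() with "'TfKerasModelArtifact' object is not iterable":
# @artifacts(TfKerasModelArtifact('model'))

# Works: pass a list of artifact specs, as the traceback's loop expects.
@artifacts([TfKerasModelArtifact('model')])
class TextClassificationService(BentoService):
    pass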

Changing the name of the API handler

Hi @parano and @yubozhao. I hope you are doing well.

I was wondering if you have planned to enable users to customize the names of API handlers. As far as I know, the current version allows /predict only. Is there a way I can change this? And the allowed HTTP method as well?
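
A purely hypothetical shape for the request, for illustration; api_name and methods are not confirmed parameters of the @api decorator here:

# Hypothetical: custom endpoint name and HTTP method on the @api decorator.
@api(JsonHandler, api_name='classify', methods=['POST'])
def classify(self, parsed_json):
    ...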

Multiple images in ImageHandler

Is your feature request related to a problem? Please describe.
ImageHandler doesn't support handling multiple images:

return Response(response="Only support single file input", status=400)
