seldonio / seldon-core
An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models
Home Page: https://www.seldon.io/tech/products/core/
License: Other
Look at integrating manually or automatically for releases/pull requests
I was looking at deploying one of the examples (iris) using Python 3, but I don't think it's possible at the moment.
I changed the base image using --base-image=python:3, but because of one of the requirements in seldon_requirements.txt (more specifically grpc), the image cannot be built, since that library only works on Python 2.
Step 14/19 : RUN cd /tmp && pip install --no-cache-dir -r seldon_requirements.txt && pip install --no-cache-dir -r requirements.txt
---> Running in ee334e1dbc4f
Collecting numpy==1.11.2 (from -r seldon_requirements.txt (line 1))
Downloading numpy-1.11.2.tar.gz (4.2MB)
Collecting pandas==0.18.1 (from -r seldon_requirements.txt (line 2))
Downloading pandas-0.18.1.tar.gz (7.3MB)
Collecting grpc==0.3.post19 (from -r seldon_requirements.txt (line 3))
Downloading grpc-0.3-19.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-build-he7l6xxx/grpc/setup.py", line 7, in <module>
version_tuple = __import__('grpc').VERSION
File "/tmp/pip-build-he7l6xxx/grpc/grpc/__init__.py", line 6, in <module>
from .rpc import *
File "/tmp/pip-build-he7l6xxx/grpc/grpc/rpc.py", line 141
except OSError, ex:
^
SyntaxError: invalid syntax
I was also wondering whether this requirement might be easy to remove, since it is a different RPC library and not grpc.io, which is also in the list in seldon_requirements.txt.
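For reference, the SyntaxError in the traceback above comes from the Python 2-only form `except OSError, ex:`. A minimal sketch of the difference follows; the helper name `is_valid_python3` is hypothetical and only there for illustration, and the snippets are stand-ins rather than the actual contents of that library's rpc.py.

```python
def is_valid_python3(source):
    """Return True if `source` compiles under the running Python 3 interpreter."""
    try:
        compile(source, "<snippet>", "exec")
    except SyntaxError:
        return False
    return True


# Python 2-only spelling, as seen in the traceback.
py2_style = "try:\n    pass\nexcept OSError, ex:\n    pass\n"

# Spelling accepted by Python 2.6+ and Python 3.
py3_style = "try:\n    pass\nexcept OSError as ex:\n    pass\n"

print(is_valid_python3(py2_style))  # False
print(is_valid_python3(py3_style))  # True
```

Because the failure happens while pip runs the package's setup.py, there is no way to work around it from the wrapper side other than dropping or replacing the requirement.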
load tests to show
The graph can be found in notebooks/resources/epsilon_greedy.json
When the graph has just been deployed and a first request is sent, it takes a long time and sometimes fails. The second request onward works fine.
When looking into the microservice logs it appears that the request was sent four times to the router and successive models. The issue can be reproduced by following the epsilon greedy example notebook:
notebooks/epsilon_greedy.ipynb
There are issues to create wrappers for R and Spark, so adding PyTorch here too, as it was raised in the community Slack earlier today. This would be a great first PR for anyone looking to contribute!
presently these are part of the SeldonDeployment resource manifest. Is this sufficient with RBAC or should secrets be used?
The actual behaviour should be:
If implementation is specified, ignore the rest.
If not, implementation defaults to UNKNOWN_IMPLEMENTATION, and:
    If type is specified, ignore the rest.
    If not, type defaults to UNKNOWN_TYPE, and:
        If methods is not specified, raise an error (this is the case where none of implementation, type, or methods has been specified).
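The precedence described above could be sketched roughly as follows. This is an illustration only, not the cluster manager's actual code: the function name `resolve_unit` and the dict-based representation of a graph node are assumptions, and treating an explicit UNKNOWN_IMPLEMENTATION or UNKNOWN_TYPE the same as an omitted field is also an assumption.

```python
UNKNOWN_IMPLEMENTATION = "UNKNOWN_IMPLEMENTATION"
UNKNOWN_TYPE = "UNKNOWN_TYPE"


def resolve_unit(unit):
    """Decide which of implementation, type, or methods drives a graph node.

    `unit` is a plain dict standing in for one node of the predictor graph
    (hypothetical representation). Returns a (field, value) pair.
    """
    # 1. An explicit implementation wins and the rest is ignored.
    implementation = unit.get("implementation", UNKNOWN_IMPLEMENTATION)
    if implementation != UNKNOWN_IMPLEMENTATION:
        return ("implementation", implementation)

    # 2. Otherwise an explicit type wins and methods is ignored.
    unit_type = unit.get("type", UNKNOWN_TYPE)
    if unit_type != UNKNOWN_TYPE:
        return ("type", unit_type)

    # 3. Otherwise methods must be present.
    methods = unit.get("methods")
    if not methods:
        raise ValueError(
            "graph node %r specifies none of implementation, type, methods"
            % unit.get("name")
        )
    return ("methods", list(methods))


print(resolve_unit({"name": "random-abtest",
                    "implementation": "RANDOM_ABTEST",
                    "type": "UNKNOWN_TYPE"}))
# ('implementation', 'RANDOM_ABTEST')
```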
expose R models with a thin REST server or gRPC server that respects the internal model API
Build Docker image
The APIs are presently described as gRPC/proto files.
Translating automatically to OpenAPI does not seem possible, see issue on OpenAPI project
There is an OpenAPI to proto Converter project, so one possibility is to write an OpenAPI spec of the current gRPC API by hand, run it through the converter, and see how closely the output recreates the existing gRPC spec. If the result were identical, we could think of keeping just the OpenAPI spec and generating the proto file from it.
When applying the seldon deployment below on a GCP cluster, the main pod stays in Pending status (the canary works) and shows the following warning:
Warning FailedScheduling No nodes are available that match all of the predicates: Insufficient cpu (5).
The cluster has 5 nodes.
seldon deployment JSON:
{
"apiVersion": "machinelearning.seldon.io/v1alpha1",
"kind": "SeldonDeployment",
"metadata": {
"labels": {
"app": "seldon"
},
"name": "seldon-deployment-example"
},
"spec": {
"annotations": {
"project-name":"FX Market Prediction",
"deployment_version": "v1"
},
"name": "test-deployment-complex",
"oauth_key": "oauth-key",
"oauth_secret": "oauth-secret",
"predictors": [
{
"componentSpec": {
"spec": {
"containers": [
{
"image": "seldonio/mean_classifier:0.6",
"imagePullPolicy": "IfNotPresent",
"name": "mean-classifier-1",
"resources": {
"requests": {
"memory": "1Mi",
"cpu": "0.1"
}
}
},
{
"image": "seldonio/mean_classifier:0.6",
"imagePullPolicy": "IfNotPresent",
"name": "mean-classifier-2",
"resources": {
"requests": {
"memory": "1Mi",
"cpu": "0.1"
}
}
},
{
"image": "seldonio/mean_classifier:0.6",
"imagePullPolicy": "IfNotPresent",
"name": "mean-classifier-3",
"resources": {
"requests": {
"memory": "1Mi",
"cpu": "0.1"
}
}
},
{
"image": "seldonio/mock_outlier_detector:1.0",
"imagePullPolicy": "IfNotPresent",
"name": "outlier-detector",
"resources": {
"requests": {
"memory": "1Mi",
"cpu": "0.1"
}
}
},
{
"image": "seldonio/mock_transformer:1.0",
"imagePullPolicy": "IfNotPresent",
"name": "mean-transformer",
"resources": {
"requests": {
"memory": "1Mi",
"cpu": "0.1"
}
}
}
],
"terminationGracePeriodSeconds": 20
}
},
"name": "fx-market-predictor",
"replicas": 1,
"annotations": {
"predictor_version": "v1"
},
"graph": {
"name": "outlier-detector",
"type": "TRANSFORMER",
"endpoint": {
"type": "REST"
},
"children": [
{
"name": "random-abtest",
"implementation": "RANDOM_ABTEST",
"type": "UNKNOWN_TYPE",
"children": [
{
"name": "mean-transformer",
"type": "TRANSFORMER",
"endpoint": {
"type": "REST"
},
"children": [
{
"name": "mean-classifier-1",
"type": "MODEL",
"endpoint": {
"type": "REST"
}
}
]
},
{
"name": "ensemble",
"type": "UNKNOWN_TYPE",
"implementation": "AVERAGE_COMBINER",
"children": [
{
"name": "mean-classifier-2",
"type": "MODEL",
"endpoint": {
"type": "REST"
}
},
{
"name": "mean-classifier-3",
"type": "MODEL",
"endpoint": {
"type": "REST"
}
}
]
}
]
}
]
}
},
{
"componentSpec": {
"spec": {
"containers": [
{
"image": "seldonio/mean_classifier:0.6",
"imagePullPolicy": "IfNotPresent",
"name": "mean-classifier",
"resources": {
"requests": {
"memory": "1Mi",
"cpu": "0.1"
}
}
},
{
"image": "seldonio/mock_transformer:1.0",
"imagePullPolicy": "IfNotPresent",
"name": "mean-transformer",
"resources": {
"requests": {
"memory": "1Mi",
"cpu": "0.1"
}
}
}
],
"terminationGracePeriodSeconds": 20
}
},
"name": "fx-market-predictor-canary",
"replicas": 1,
"annotations": {
"predictor_version": "v1"
},
"graph": {
"name": "mean-transformer",
"type": "TRANSFORMER",
"endpoint": {
"type": "REST"
},
"children": [
{
"name": "mean-classifier",
"endpoint": {
"type": "REST"
},
"type": "MODEL"
}
]
}
}
]
}
}
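When debugging a scheduling warning like this, it can help to total the CPU that each predictor's pod requests, since all containers of a pod have to fit on a single node. A rough sketch, assuming the manifest has been loaded into a dict; the helper names are hypothetical, the trimmed `example` only mirrors the container names and requests from the JSON above, and the totals do not include any containers Seldon adds to the pod or anything already running on the nodes.

```python
def cpu_to_millicores(quantity):
    """Convert a CPU quantity such as "0.1" or "100m" to millicores."""
    quantity = str(quantity).strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))


def requested_cpu_per_predictor(deployment):
    """Sum the CPU requests of the containers listed under each predictor."""
    totals = {}
    for predictor in deployment["spec"]["predictors"]:
        containers = predictor["componentSpec"]["spec"]["containers"]
        totals[predictor["name"]] = sum(
            cpu_to_millicores(
                c.get("resources", {}).get("requests", {}).get("cpu", "0")
            )
            for c in containers
        )
    return totals


def _container(name):
    # Same requests as every container in the manifest above.
    return {"name": name, "resources": {"requests": {"memory": "1Mi", "cpu": "0.1"}}}


example = {"spec": {"predictors": [
    {"name": "fx-market-predictor",
     "componentSpec": {"spec": {"containers": [
         _container(n) for n in (
             "mean-classifier-1", "mean-classifier-2", "mean-classifier-3",
             "outlier-detector", "mean-transformer")]}}},
    {"name": "fx-market-predictor-canary",
     "componentSpec": {"spec": {"containers": [
         _container(n) for n in ("mean-classifier", "mean-transformer")]}}},
]}}

totals = requested_cpu_per_predictor(example)
print(totals)
# {'fx-market-predictor': 500, 'fx-market-predictor-canary': 200}
```

Comparing these figures with the allocatable CPU left on each node (for example from `kubectl describe nodes`) shows whether the main predictor's pod can fit anywhere.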
Full end to end training and wrapping and deployment using GitOps techniques in a CI/CD pipeline.
API front end must be caching the gRPC endpoint and not removing it when the custom resource is deleted.
It must also not be updating it when a new custom resource is created with same oauth keys.
The real identifier for a Seldon Deployment is in the metadata. The spec.name attribute is redundant.
The cluster manager should not use it anymore.
The image is missing binary files for the nd4j library
Allow the Prometheus Stack to deploy into a separate cluster from seldon-core
Possibly part of a refactoring of the routing information in the response metadata
expose Spark standalone runtime models with a thin REST server or gRPC server that respects the internal model API
Build Docker image
Needs to be updated for the latest protos. The graph section is incorrect and still has the subType field.
@errordeveloper submitted a PR #51 that avoids local dependencies for this particular example.
seldon-core/examples/models/sklearn_iris_docker
Update the docs to reflect this
Useful for things like explainers
This will be first step to allow us to easily integrate into KubeFlow, see kubeflow/kubeflow#159
For example the diagram in:
https://github.com/SeldonIO/seldon-core/blob/master/docs/reference/prediction.md
is deprecated
The plugin would respect the Combiner internal API spec and add metadata to state whether the input was an outlier or not.
The plugin would respect the Combiner internal API spec and add metadata to state when it determines concept drift is occurring.
Will implement LIME, plus other techniques, and modify the metadata for the result.
Add top level state to show when seldonDeployment is running as normal? replicasAvailable = replicas for each predictor?
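One way to read this proposal, as a sketch only: the deployment counts as running normally when every predictor reports as many available replicas as it asked for. The `replicas` and `replicasAvailable` names come from the question above; the list-of-dicts shape and the function name are assumptions.

```python
def deployment_is_running(predictor_statuses):
    """Return True when every predictor has all of its replicas available.

    `predictor_statuses` is assumed to be a list of dicts, one per predictor,
    each carrying `replicas` and `replicasAvailable` (hypothetical shape).
    An empty list is treated as not running.
    """
    return bool(predictor_statuses) and all(
        p.get("replicasAvailable", 0) == p["replicas"]
        for p in predictor_statuses
    )


print(deployment_is_running([
    {"name": "fx-market-predictor", "replicas": 1, "replicasAvailable": 0},
    {"name": "fx-market-predictor-canary", "replicas": 1, "replicasAvailable": 1},
]))  # False
```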
The MAB plugins would respect the internal Router API.
Epsilon greedy could be the initial algorithm, followed by more complex ones.
Redis could be used for state.
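To make the idea concrete, here is a minimal epsilon-greedy sketch. It is not the internal Router API: the class and method names are hypothetical, and state is held in memory where the proposal above suggests Redis.

```python
import random


class EpsilonGreedyRouter:
    """Pick a child branch: explore with probability epsilon, else exploit."""

    def __init__(self, n_branches, epsilon=0.1, seed=None):
        self.n_branches = n_branches
        self.epsilon = epsilon
        self.counts = [0] * n_branches    # feedback events seen per branch
        self.values = [0.0] * n_branches  # running mean reward per branch
        self.rng = random.Random(seed)

    def route(self):
        """Return the index of the branch that should serve the next request."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_branches)  # explore
        # Exploit: branch with the highest mean reward (lowest index on ties).
        return max(range(self.n_branches), key=lambda i: self.values[i])

    def send_feedback(self, branch, reward):
        """Fold one reward into the running mean for `branch`."""
        self.counts[branch] += 1
        self.values[branch] += (reward - self.values[branch]) / self.counts[branch]


router = EpsilonGreedyRouter(n_branches=2, epsilon=0.0)
router.send_feedback(1, 1.0)
print(router.route())  # 1, since epsilon=0.0 never explores
```

Moving `counts` and `values` into an external store would let several router replicas share what they have learned, which is presumably the motivation for Redis here.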
To reproduce:
requests: {
"cpu": 0.1
}
(0.1 should be a string not a float)
Delete the seldon deployment. Now the cluster-manager is broken and logs the following error repeatedly:
2018-01-17 15:56:24.385 ERROR 5 --- [pool-1-thread-1] o.s.s.s.TaskUtils$LoggingErrorHandler : Unexpected error occurred in scheduled task.
com.google.protobuf.InvalidProtocolBufferException: Can't decode io.kubernetes.client.proto.resource.Quantity from 0.1
at io.seldon.clustermanager.pb.QuantityUtils$QuantityParser.merge(QuantityUtils.java:63) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.merge(JsonFormat.java:1241) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.parseFieldValue(JsonFormat.java:1797) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.mergeMapField(JsonFormat.java:1484) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.mergeField(JsonFormat.java:1458) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.mergeMessage(JsonFormat.java:1294) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.merge(JsonFormat.java:1252) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.parseFieldValue(JsonFormat.java:1797) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.mergeField(JsonFormat.java:1462) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.mergeMessage(JsonFormat.java:1294) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.merge(JsonFormat.java:1252) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.parseFieldValue(JsonFormat.java:1797) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.mergeRepeatedField(JsonFormat.java:1541) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.mergeField(JsonFormat.java:1460) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.mergeMessage(JsonFormat.java:1294) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.merge(JsonFormat.java:1252) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.parseFieldValue(JsonFormat.java:1797) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.mergeField(JsonFormat.java:1462) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.mergeMessage(JsonFormat.java:1294) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.merge(JsonFormat.java:1252) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.parseFieldValue(JsonFormat.java:1797) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.mergeField(JsonFormat.java:1462) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.mergeMessage(JsonFormat.java:1294) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.merge(JsonFormat.java:1252) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.parseFieldValue(JsonFormat.java:1797) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.mergeRepeatedField(JsonFormat.java:1541) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.mergeField(JsonFormat.java:1460) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.mergeMessage(JsonFormat.java:1294) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.merge(JsonFormat.java:1252) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.parseFieldValue(JsonFormat.java:1797) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.mergeField(JsonFormat.java:1462) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.mergeMessage(JsonFormat.java:1294) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.merge(JsonFormat.java:1252) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$ParserImpl.merge(JsonFormat.java:1126) ~[classes!/:0.3.1]
at io.seldon.clustermanager.pb.JsonFormat$Parser.merge(JsonFormat.java:305) ~[classes!/:0.3.1]
at io.seldon.clustermanager.k8s.SeldonDeploymentUtils.jsonToSeldonDeployment(SeldonDeploymentUtils.java:44) ~[classes!/:0.3.1]
at io.seldon.clustermanager.k8s.SeldonDeploymentWatcher.watchSeldonMLDeployments(SeldonDeploymentWatcher.java:123) ~[classes!/:0.3.1]
at io.seldon.clustermanager.k8s.SeldonDeploymentWatcher.watch(SeldonDeploymentWatcher.java:146) ~[classes!/:0.3.1]
at sun.reflect.GeneratedMethodAccessor112.invoke(Unknown Source) ~[na:na]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_131]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_131]
at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:65) ~[spring-context-4.3.6.RELEASE.jar!/:4.3.6.RELEASE]
at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54) ~[spring-context-4.3.6.RELEASE.jar!/:4.3.6.RELEASE]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_131]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_131]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_131]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]
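Until the parser handles this case, a pre-flight check on the manifest can catch numeric quantities before the resource is created. A sketch under the assumption that the manifest has been loaded into Python objects (for example with `json.load`); the function name is hypothetical, and it only looks at `requests` and `limits` maps.

```python
def find_non_string_quantities(node, path="$"):
    """Return (path, value) pairs for resource quantities that are not strings."""
    problems = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = "%s.%s" % (path, key)
            if key in ("requests", "limits") and isinstance(value, dict):
                # Every quantity in a requests/limits map should be a string.
                for resource, quantity in value.items():
                    if not isinstance(quantity, str):
                        problems.append(("%s.%s" % (child, resource), quantity))
            else:
                problems.extend(find_non_string_quantities(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            problems.extend(
                find_non_string_quantities(item, "%s[%d]" % (path, index))
            )
    return problems


bad = {"containers": [{"resources": {"requests": {"cpu": 0.1, "memory": "1Mi"}}}]}
print(find_non_string_quantities(bad))
# [('$.containers[0].resources.requests.cpu', 0.1)]
```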
RBAC is active by default for clusters running Kubernetes 1.8 or later.
A service account will probably be needed but also use of GCPAuthenticator
https://github.com/errordeveloper/seldon-gitops
The setup steps haven't been documented in the repo yet; happy to walk through them.
One thing to note is that I've done some more work extending what I have already contributed via #51. Namely, I added GCB config and optimised the build to use cache properly.