- An OpenShift 4.11+ cluster
- The Pipelines Operator
- The `oc` and `tkn` command line tools (see the question mark menu in the OpenShift UI)
- An OpenShift project to work with
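To confirm the command line tools are installed and can reach the cluster:

```bash
oc version
tkn version
```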
├── src               Python source for data ingestion and model training
├── pipelines         Tekton pipeline and tasks
├── data              Sample data
├── notebooks         Jupyter experimentation
└── requirements.txt  Python dependencies
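To experiment with the notebooks locally, the Python dependencies can be installed first; the use of `venv` here is an assumption, and any Python 3 environment works:

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```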
cd pipelines
Apply the custom tasks and pipeline resources.
oc apply -f 01-ingest-train-task.yaml
oc apply -f 02-ingest-train-pipeline.yaml
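The new resources can be listed to confirm they were created:

```bash
tkn task list
tkn pipeline list
```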
Use the OpenShift UI to manually create a persistent volume claim (PVC) and pass its name in when starting the pipeline below.
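Equivalently, the PVC can be created with `oc apply`. A minimal sketch; the claim name matches the `claimName` passed to the pipeline below, while the access mode and storage size are assumptions:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pipeline-claim-01
spec:
  accessModes:
    - ReadWriteOnce   # assumption; match your cluster's storage capabilities
  resources:
    requests:
      storage: 1Gi    # assumption; size to fit the ingested data and model
```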
tkn pipeline start ingest-and-train \
  -w name=shared-workspace,claimName=my-pipeline-claim-01 \
  -p deployment-name=ingest-and-train \
  -p git-url=https://github.com/bkoz/stock.git \
  -p IMAGE='image-registry.openshift-image-registry.svc:5000/$(context.pipelineRun.namespace)/ingest-and-train' \
  --use-param-defaults
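Once the run starts, its logs can be followed from the command line:

```bash
tkn pipelinerun logs -f --last
```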
Alternatively, Tekton can create the PVC automatically from a volume claim template when the pipeline starts. This approach requires further investigation, as the PVCs do not get deleted when the pipeline gets deleted.
tkn pipeline start ingest-and-train \
  -w name=shared-workspace,volumeClaimTemplateFile=00-persistent-volume-claim.yaml \
  -p deployment-name=ingest-and-train \
  -p git-url=https://github.com/bkoz/stock.git \
  -p IMAGE='image-registry.openshift-image-registry.svc:5000/$(context.pipelineRun.namespace)/ingest-and-train' \
  --use-param-defaults
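Until that cleanup behavior is sorted out, leftover claims can be listed and removed by hand; the PVC name is whatever the template generated:

```bash
oc get pvc
oc delete pvc <pvc-name>
```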
- Integrate S3 storage into the `ingest` and `training` tasks (see the sketch after this list):
  - Ingest the CSV file from S3 instead of the Yahoo Finance service.
  - Save the trained model artifact to S3 storage so the Triton server can find it.
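A minimal sketch of what that S3 integration might look like inside the Python tasks, using `boto3`; the endpoint variable, bucket names, and object keys are all assumptions, and the upload path follows Triton's model repository layout of `<model-name>/<version>/<artifact>`:

```python
import os

import boto3

# Endpoint, credentials, bucket names, and keys below are assumptions.
s3 = boto3.client(
    "s3",
    endpoint_url=os.environ["S3_ENDPOINT"],
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
)

# Ingest task: pull the CSV from S3 instead of calling the Yahoo Finance service.
s3.download_file("stock-data", "raw/stock.csv", "/tmp/stock.csv")

# Training task: after training, publish the model artifact where the
# Triton server's model repository expects it (<model-name>/<version>/...).
s3.upload_file("/tmp/model.onnx", "models", "stock/1/model.onnx")
```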