aisingapore / kapitan-hull
The one-stop shop of a Cookiecutter template to spin up a working AISG project in minutes.
Home Page: https://aisingapore.github.io/kapitan-hull/
Python
The section on batch inferencing shows that the hydra.job.chdir=True parameter could be used to write batch-infer-res.jsonl into the outputs folder, but this doesn't seem to happen.
Run the following snippet from https://aisingapore.github.io/kapitan-hull/guide-for-user/09-batch-inferencing/ (with the modification specified at the end of the page):

```shell
python src/batch_inferencing.py \
    hydra.job.chdir=True \
    batch_infer.model_path=$PRED_MODEL_PATH \
    batch_infer.input_data_dir="$PWD/data/batched-mnist-input-data"
```
Expected: batch-infer-res.jsonl written in the outputs/<date>/<time>/ folder
Actual: batch-infer-res.jsonl not being saved anywhere that I know of
```
❯ python src/batch_inferencing.py hydra.job.chdir=True batch_infer.model_path=$(pwd)/models/model.pt
[2024-01-17 14:18:52,269][__main__][INFO] - Setting up logging configuration.
{"asctime": "2024-01-17T14:18:52+0800", "process": 3240, "name": "__main__", "levelname": "INFO", "message": "Loading the model..."}
{"asctime": "2024-01-17T14:18:52+0800", "process": 3240, "name": "__main__", "levelname": "INFO", "message": "Conducting inferencing on image files..."}
{"asctime": "2024-01-17T14:18:52+0800", "process": 3240, "name": "__main__", "levelname": "INFO", "message": "Batch inferencing has completed."}
{"asctime": "2024-01-17T14:18:52+0800", "process": 3240, "name": "__main__", "levelname": "INFO", "message": "Output result location: /home/nus/Codebase/aiap/aiap-15-test/aiap-dsp-mlops/outputs/2024-01-17/14-18-52/batch-infer-res.jsonl"}
```
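One possible workaround, sketched outside the template (the `write_results` function and the record fields are hypothetical, not Kapitan Hull's API): resolve the output directory to an absolute path before anything changes the working directory, so `batch-infer-res.jsonl` lands in a predictable location regardless of whether `hydra.job.chdir` takes effect.

```python
import json
import os
import tempfile
from pathlib import Path

def write_results(records, out_dir):
    """Write batch-inference records as JSON Lines to an absolute path,
    so the result location does not depend on the process's cwd."""
    out_dir = Path(out_dir).resolve()
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / "batch-infer-res.jsonl"
    with out_file.open("w") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")
    return out_file

# Resolve the target BEFORE any chdir, then emulate a working-directory
# change (what hydra.job.chdir=True does) and write anyway.
target = Path(tempfile.mkdtemp()) / "outputs" / "run"
os.chdir(tempfile.mkdtemp())
result = write_results([{"filename": "0.png", "pred": 7}], target)
print(result)
```

The file ends up under the pre-resolved `outputs/run/` directory even though the process's working directory changed in between.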
GitLab
None
Implementing the DAG pipelines in the template broke the ability to run pipelines manually.
Steps: Trigger a pipeline from the web UI in GitLab
Expected: Pipeline runs successfully
Actual: Error in running the pipeline due to the use of needs: subsequent jobs would not run if a job they depend on is not required to execute during that pipeline run.
No response
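A sketch of one possible fix, assuming GitLab 13.10+ (the stage and script lines are placeholders; the job names are taken from elsewhere in this tracker): marking a need as optional lets the dependent job run even when the needed job is excluded from a particular pipeline, such as a manually triggered web UI run.

```yaml
# Hypothetical fragment of .gitlab-ci.yml
pylint-pytest:
  stage: test
  needs:
    - job: test:conda-build
      optional: true
  script:
    - pytest src/tests
```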
Add a code profiler to log the speed of every process, for transparency's sake.
Example of a Python profiler:
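A minimal sketch using the standard library's cProfile, wrapping a stand-in pipeline step (the `train_step` name is hypothetical):

```python
import cProfile
import io
import pstats

def train_step():
    """Stand-in for an actual pipeline step (hypothetical)."""
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
train_step()
profiler.disable()

# Render the slowest calls, sorted by cumulative time, into a string
# that can be handed to the project's logger.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The rendered report can then be emitted through the template's existing structured logging instead of printed.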
It has been tested that YAML files can also be used as Cookiecutter replay files, so this will be refactored in 0.4.0 as a cosmetic change, bringing it in line with the other configuration files in this repository.
Currently, mlflow_test.py is written independently of the rest of the scripts. This refactoring would use the general_utils.mlflow_init function, testing that function on top of testing the connection to the MLflow server that was created or given.
While experienced users can infer the commands to run the scripts locally, this makes the guide more explicit about it.
Exploring various technologies to incorporate into the Kapitan Hull stack:
Currently, the test:conda-build section only saves the conda environment as artifacts, which persist only within the same pipeline. However, the environment doesn't need to change unless the conda YAML file changes as well. Thus, we will test whether using cache instead of artifacts is better suited to storing the environment, so that we don't have to rebuild it on every pipeline.
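A sketch of what the cache-based variant might look like (the environment file path and cache paths are assumptions, not the template's actual layout): `cache:key:files` rebuilds the cache only when the listed file changes, and unlike artifacts, a cache can be reused across pipelines.

```yaml
# Hypothetical fragment of .gitlab-ci.yml
test:conda-build:
  stage: test
  cache:
    key:
      files:
        - conda-env.yml        # assumed env file path
    paths:
      - conda/envs/            # assumed environment location
  script:
    - conda env create -f conda-env.yml -p conda/envs/project || true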
CI/CD pipeline being triggered upon push of modified files: there is no need for test-conda to run if the conda YAML file is unmodified; however, test-conda currently has to run, otherwise the pylint-pytest job will fail.
To look into alternatives for .ipynb files, as notebook changes often introduce merge headaches.
Considerations
Kubernetes
To implement the Polyaxon section for legacy/open-source purposes
Gitlab
None
Having the current issue template as the default in GitLab breaks some conventions regarding how the issue tracker is used across different teams. The issue template should therefore be an option instead of the default.
Expected: Empty template
Actual: Filled template, with it being the default
The cv problem template is meant to be used as an example for AIAP MLOps Week, and within that week, GPUs are not provided, so this issue would fly under the radar during the session. However, if GPUs are used for the guide, just removing the cpuonly package doesn't enable them: the PyTorch version installed uses a CPU-only build even if a GPU and CUDA are available within the container. Thus, the current Dockerfile used to create GPU images is redundant and bloated, since the PyTorch package wouldn't use the GPUs attached to the container.
Steps: Build *-gpu.Dockerfile with the cv problem template, then run `import torch; torch.cuda.is_available()`
Expected: true
Actual: false
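For reference, a hypothetical conda environment fragment that would pull a CUDA-enabled PyTorch build instead of the CPU-only one (channels and version pin are illustrative, not the template's actual configuration):

```yaml
channels:
  - pytorch
  - nvidia
dependencies:
  - pytorch
  - pytorch-cuda=12.1   # replaces cpuonly; selects the CUDA build
```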
To align and integrate pre-commit hooks as part of the base Kapitan Hull template.
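A minimal sketch of what such a .pre-commit-config.yaml might contain (hook selection and pinned revisions are illustrative, not a decided list):

```yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
  - repo: https://github.com/psf/black
    rev: 24.1.1
    hooks:
      - id: black
```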
ml-project-cookiecutter-gcp
has various features that the current Kapitan Hull doesn't have:
Thus, there's still room to reach parity with prior prototype versions of Kapitan Hull.
To create a problem template specifically for time series problems, reducing the time taken to generate a working pipeline.
The PyTorch code would be pulled out of the codebase and downloaded separately instead of being produced during template generation. There should not be any errors if the prompts that are not to be personalised (Docker registry name, author name, project name, etc.) are filled in correctly. This makes the template package-agnostic, and hopefully reduces confusion when following the guide.
post-gen-project hook to include source code for example problem(s)
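A hedged sketch of what such a hook could look like (the examples/ layout, function name, and problem-template argument are all assumptions, demonstrated here on a throwaway directory rather than a real generated project):

```python
import shutil
import tempfile
from pathlib import Path

def copy_example_sources(examples_dir: Path, problem_template: str, project_dir: Path) -> None:
    """Copy the chosen problem template's source code into the project's src/ folder."""
    src = examples_dir / problem_template
    if src.is_dir():
        shutil.copytree(src, project_dir / "src", dirs_exist_ok=True)

# Demo on a throwaway layout (assumption: examples/<template>/ holds the sources).
base = Path(tempfile.mkdtemp())
(base / "examples" / "cv").mkdir(parents=True)
(base / "examples" / "cv" / "train.py").write_text("# example trainer\n")
project = base / "my-project"
project.mkdir()
copy_example_sources(base / "examples", "cv", project)
print(sorted(p.name for p in (project / "src").iterdir()))  # ['train.py']
```

In a real hook, cookiecutter runs post-gen scripts with the working directory set to the generated project, so the project path would come from the hook's own cwd.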
Moved to GitLab: https://gitlab.aisingapore.net/mlops/kapitan-hull/-/issues/5