Git Product home page Git Product logo

senniors-brain-services's Introduction

Brain-services

This repo is the main brain for the Senniors test.

The previous exploration of the data was made in the Senniors-data-explorer repo, where you can check some visualizations of the provided data.

Assumptions from exploration

From the exploration we may make some assumptions in order to clean and to train our models.

  • Value of age can not be lower than 0, if it is the case it will be swapper to 0
  • Dates of appointment and scheduled can be swapped if it has not sensed, time travelling for customers not allowed (yet :D)
  • Due to the small amount of data we will suppose that months are not relevant
  • Due to the small amount of data we will suppose we do not know all the possible neighbours
  • All Nan data will be dropped
  • We suppose only 2 possible genders

Due to the small amount of data we do not want to fix from the beginning of the project the relevant variables, so we will use PCA optimizations with different values of relevant components (the number can be updated in the settings of celery_brain_services module).

Also, the model for the classificator will not be fixed from the beginning of the project, so we will train different models and check each time which one has better results.

The models to use are the following:

  • DecisionTreeClassifier
  • GaussianNB
  • KNeighborsClassifier
  • RandomForestClassifier

The pipeline with the best result will be stored in the TRAINED_FILE_PATH path (you can update this value in .env file)

Requirements

The following requirements are needed:

  • docker
  • docker-compose

But in order to perform debug tasks and also to simplify the work is highly recommended to have also these dependencies:

  • python >=3.9
  • make
  • pipenv
  • poetry

Installation

If you have the minimum requirements, run the project is a piece of cake, just run the following:

docker-compose up -d

And if you have make, is even easier:

make

And voila!, your project will be running in a docker container

How it works?

All the project is dockerized, and it has not extra requirements, so it can be deployed easily anywhere.

The main architecture of the service is as following:

The pipeline to use is not just one. The main process to train pipelines is in src/celery_brain_services/tasks/train_and_select_pipelines.py

It will use different models and different PCA optimizations, and check which has the best predictions on the test data, so, maybe when the data grow up, the best pipeline can change.

In order to optimize resources if a model with fewer components has the same scoring rounded to the second decimal, it will be the "winner".

All the pipelines must pass through a data clean transformations. You can check all these transformers in the module brain_services.transformers

Tests

The project only has unit tests, and to check that all tests are ok just exec the following:

make test

or

docker exec senniors_brain_services python -m pytest --log-cli-level=ERROR --disable-pytest-warnings

Swagger

You can check the api documentation in the following endpoints:

  • /senniors_brain_services_documentation
  • /senniors_brain_services_redoc

You can change the endpoints updating the following env variables:

  • DOCS_URL
  • REDOC_URL

If you set the env values to 'None', the documentation wil be hidden (this is recommended for production environments)

Next steps

  • First we should create a CI/CD, maybe with jenkins and a Jenkinsfile, that pass all the tests if needed and deploy the project in different environments.
  • Then we need to create E2E test cases, maybe with postman and a newman container.
  • Also, it must be integrated with the backend databases. Instead of using a csv it should extract the information from a database shared with the backend.
  • Instead of saving the pipelines in local, it should be uploaded to an S3 instance.

senniors-brain-services's People

Contributors

martingaldeca avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.