
nd0821-c3-starter-code's Introduction

Working in a command line environment is recommended for ease of use with git and dvc. If on Windows, WSL1 or 2 is recommended.

Environment Set up

  • Download and install conda if you don’t have it already.
    • Use the supplied requirements file to create a new environment, or
    • conda create -n [envname] "python=3.8" scikit-learn pandas numpy pytest jupyter jupyterlab fastapi uvicorn -c conda-forge
    • Install git either through conda (“conda install git”) or through your CLI, e.g. sudo apt-get install git.

Repositories

  • Create a directory for the project and initialize git.
    • As you work on the code, continually commit changes. Trained models you want to use in production must be committed to GitHub.
  • Connect your local git repo to GitHub.
  • Set up GitHub Actions on your repo. You can use one of the pre-made GitHub Actions, provided that at a minimum it runs pytest and flake8 on push and requires both to pass without error.
    • Make sure you set up the GitHub Action to have the same version of Python as you used in development.

Data

  • Download census.csv and commit it to dvc.
  • This data is messy; try to open it in pandas and see what you get.
  • To clean it, use your favorite text editor to remove all spaces.
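If you would rather script the cleanup than do it in a text editor, the same "remove all spaces" step can be sketched in Python. This is only an illustration: the sample header and row below are made up for the demo and assume the file uses hyphenated column names, and the helper name remove_spaces is hypothetical.

```python
import csv
import io


def remove_spaces(raw_text: str) -> str:
    """Return the CSV text with every space character removed."""
    return raw_text.replace(" ", "")


# Made-up sample that mimics the stray spaces after each comma.
sample = "age, workclass, education-num\n39, State-gov, 13\n"
cleaned = remove_spaces(sample)

# Parse the cleaned text to confirm the values no longer carry padding.
rows = list(csv.reader(io.StringIO(cleaned)))
```

To apply this to the real file, read census.csv as text, pass the contents through remove_spaces, and write the result to a new file so the raw download stays untouched.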

Model

  • Using the starter code, write a machine learning model that trains on the clean data and saves the model. Complete any function that has been started.
  • Write unit tests for at least 3 functions in the model code.
  • Write a function that outputs the performance of the model on slices of the data.
    • Suggestion: for simplicity, the function can just output the performance on slices of just the categorical features.
  • Write a model card using the provided template.
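One possible shape for the slice-performance function is sketched below in plain Python, so it runs without the starter code. The name slice_metrics, the choice of precision and recall as metrics, and the convention of reporting 1.0 when a denominator is zero are all assumptions for illustration, not part of the starter code.

```python
from collections import defaultdict
from typing import Dict, Sequence


def slice_metrics(
    feature_values: Sequence[str],
    y_true: Sequence[int],
    y_pred: Sequence[int],
) -> Dict[str, dict]:
    """Compute precision and recall for each distinct value of one categorical feature."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "n": 0})
    for value, truth, pred in zip(feature_values, y_true, y_pred):
        c = counts[value]
        c["n"] += 1
        if pred == 1 and truth == 1:
            c["tp"] += 1
        elif pred == 1 and truth == 0:
            c["fp"] += 1
        elif pred == 0 and truth == 1:
            c["fn"] += 1

    results = {}
    for value, c in counts.items():
        predicted_pos = c["tp"] + c["fp"]
        actual_pos = c["tp"] + c["fn"]
        # Assumption: report 1.0 when the denominator is zero.
        precision = c["tp"] / predicted_pos if predicted_pos else 1.0
        recall = c["tp"] / actual_pos if actual_pos else 1.0
        results[value] = {"n": c["n"], "precision": precision, "recall": recall}
    return results


# Tiny made-up example: four rows sliced on an "education" column.
education = ["Bachelors", "Bachelors", "HS-grad", "HS-grad"]
metrics = slice_metrics(education, y_true=[1, 0, 1, 0], y_pred=[1, 1, 0, 0])
```

Calling the function once per categorical feature and writing the results to a text file gives a simple per-slice report.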

API Creation

  • Create a RESTful API using FastAPI. It must implement:
    • GET on the root giving a welcome message.
    • POST that does model inference.
    • Type hinting must be used.
    • Use a Pydantic model to ingest the body from POST. This model should contain an example.
    • Hint: the data has names with hyphens and Python does not allow those as variable names. Do not modify the column names in the csv and instead use the functionality of FastAPI/Pydantic/etc to deal with this.
  • Write 3 unit tests to test the API: one for the GET and two for the POST, one for each possible prediction.

API Deployment

  • Create a free Heroku account (for the next steps you can either use the web GUI or download the Heroku CLI).
  • Create a new app and have it deployed from your GitHub repository.
    • Enable automatic deployments that only deploy if your continuous integration passes.
    • Hint: think about how paths will differ in your local environment vs. on Heroku.
    • Hint: development in Python is fast! But how fast you can iterate slows down if you rely on your CI/CD to fail before fixing an issue. I like to run flake8 locally before I commit changes.
  • Write a script that uses the requests module to do one POST on your live API.
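A minimal sketch of the POST script follows. The URL is a placeholder to replace with your own app's address, the /predict path and the payload fields are assumptions for illustration, and the helper names build_payload and post_once are hypothetical.

```python
import requests

# Placeholder: substitute the address of your deployed app.
API_URL = "https://your-app-name.herokuapp.com/predict"


def build_payload() -> dict:
    """Build one example row; keys keep the hyphenated column names."""
    return {"age": 39, "education-num": 13, "marital-status": "Never-married"}


def post_once(url: str, payload: dict, timeout: float = 10.0) -> tuple:
    """Send a single POST and return the status code and the parsed JSON body."""
    response = requests.post(url, json=payload, timeout=timeout)
    return response.status_code, response.json()


# Usage against the live API (left commented out so nothing is sent on import):
# status, body = post_once(API_URL, build_payload())
# print(status, body)
```

Passing the dictionary through the json argument lets requests serialize the body and set the Content-Type header for you.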
