dbt-kube-demo

A demo of how to use K8s CronJobs to automate DBT pipelines.

Here you'll find everything you need to run a simple DBT project on Kubernetes with BigQuery as the SQL backend.

Prerequisites

We use several tools in order to run this project.

❗ Mandatory tools

  • python and poetry to generate and upload some fake data and, optionally, to run DBT locally.
  • gcloud CLI to interact with GCP. Installation guide here
  • kubectl to interact with your Kubernetes cluster. Installation with gcloud here
  • helm to install and manage Kubernetes resources. Installation guide here
  • terraform to deploy and manage the infrastructure. Installation guide here
  • docker to build and push the DBT image to GCP Artifact Registry. Installation guide here

ℹ️ Optional tools

  • k9s, a TUI to visualize, explore and interact with the cluster
  • kubens to set a default namespace when running kubectl commands
  • kubectx to handle multiple contexts with kubectl
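
Before going further, it can help to confirm that the mandatory tools are actually on your PATH. The snippet below is a minimal sketch rather than part of the repository: the `missing_tools` helper is a hypothetical name, and it only checks that each executable can be found, not which version is installed.

```python
import shutil

# Mandatory command-line tools from the list above. "python" is implied,
# since it is needed to run this script in the first place.
MANDATORY_TOOLS = ["poetry", "gcloud", "kubectl", "helm", "terraform", "docker"]


def missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(MANDATORY_TOOLS)
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All mandatory tools were found on PATH.")
```

The `which` parameter exists only so the lookup can be swapped out when testing the helper.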

Installation steps

GCP Project

The very first step is to set up a new GCP project. Once this is done, you can start deploying your infrastructure.

Infrastructure

We use Terraform to deploy and manage the infrastructure: everything can be done without clicking through the GCP Console.

Follow the README in the terraform folder.

Export some fake data

There are 2 scripts in the data folder that generate some fake data and upload it to GCS. You can have a look at the generated data by inspecting the 2 .ndjson files.

Make sure you have installed the project with poetry install, then run:

poetry run python data/data_faker.py
poetry run python data/storage_load.py
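
To make the .ndjson format concrete, here is a minimal sketch of how fake records can be serialized as newline-delimited JSON, with one JSON object per line. It is not the code in data/data_faker.py: the `fake_records` and `to_ndjson` helpers and the `order_id`, `customer_id` and `amount` fields are hypothetical, chosen only for illustration.

```python
import json
import random


def fake_records(count, seed=0):
    """Yield `count` made-up records. The schema is hypothetical."""
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    for order_id in range(1, count + 1):
        yield {
            "order_id": order_id,
            "customer_id": rng.randint(1, 100),
            "amount": round(rng.uniform(5, 500), 2),
        }


def to_ndjson(records):
    """Serialize records as newline-delimited JSON: one object per line."""
    return "".join(json.dumps(record) + "\n" for record in records)


if __name__ == "__main__":
    # Print three sample lines instead of writing a file.
    print(to_ndjson(fake_records(3)), end="")
```

Each line of the output is a complete JSON document, which is why the files can be inspected line by line.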

Local run of DBT (optional)

You can run DBT locally if you want to check that DBT can correctly run your SQL queries on BigQuery.

You'll need to define your profiles.yml so that DBT can connect to and query BigQuery. The default profile name is kube_bq_jobs. You can find an example in the ConfigMap here.
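
As an illustration, the sketch below renders a minimal BigQuery profile for the kube_bq_jobs profile name. It assumes local authentication through gcloud (the `oauth` method of the dbt BigQuery adapter) and a single `dev` target. The `render_profile` helper, the project ID and the dataset name are hypothetical placeholders, and the ConfigMap in the repository remains the reference for the real values.

```python
from string import Template

# Minimal dbt profile for BigQuery. Only the profile name (kube_bq_jobs)
# comes from this README; the target name and thread count are assumptions.
PROFILE_TEMPLATE = Template("""\
kube_bq_jobs:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: oauth
      project: $project
      dataset: $dataset
      threads: 1
""")


def render_profile(project, dataset):
    """Return the profile YAML text for the given GCP project and dataset."""
    return PROFILE_TEMPLATE.substitute(project=project, dataset=dataset)


if __name__ == "__main__":
    # Hypothetical placeholder values; replace them with your own.
    print(render_profile("my-gcp-project", "my_dataset"), end="")
```

The printed text can be saved as profiles.yml in the directory where DBT looks for profiles (by default ~/.dbt/).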

Deploy your CronJob to Kubernetes

Follow the README in the kubernetes/helm folder.
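
For orientation, the sketch below builds the general shape of a Kubernetes CronJob that runs DBT on a schedule. It is not the Helm chart from this repository, which remains the source of truth: the `dbt_cronjob` helper, the resource name, the image path and the schedule are hypothetical, and a real deployment also needs the profile and credentials to be available to the container.

```python
import json


def dbt_cronjob(name, image, schedule):
    """Build a minimal batch/v1 CronJob manifest that runs `dbt run`."""
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            "schedule": schedule,  # standard cron syntax
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            # Do not restart the container in place on failure.
                            "restartPolicy": "Never",
                            "containers": [
                                {
                                    "name": "dbt",
                                    "image": image,
                                    "command": ["dbt", "run"],
                                }
                            ],
                        }
                    }
                }
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical values: the image path and schedule are placeholders.
    manifest = dbt_cronjob("dbt-run", "REGION-docker.pkg.dev/PROJECT/REPO/dbt:latest", "0 6 * * *")
    print(json.dumps(manifest, indent=2))
```

The manifest is printed as JSON, which kubectl accepts alongside YAML, so the output can be inspected or applied to a test cluster.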

Contributors

  • ebenkara15
