yu-iskw / dbt-artifacts-loader

Load dbt artifacts uploaded to GCS to BigQuery in order to track historical dbt results

Home Page: https://yu-ishikawa.medium.com/data-status-time-machine-on-persisted-dbt-artifacts-a745bc753db3

License: Apache License 2.0

Dockerfile 0.05% Makefile 0.25% Shell 2.71% Python 93.78% HCL 3.21%
dbt bigquery data-quality data-management terraform google-cloud google-cloud-platform gcp data-quality-monitoring data-monitoring

dbt-artifacts-loader

It persists dbt artifacts uploaded to GCS into BigQuery, using Cloud Run.

Overview

How to use

  1. We push a Docker image to Google Container Registry.
  2. We apply the terraform resources to launch the Cloud Run application.
  3. We upload dbt artifacts JSON files to Google Cloud Storage.
  4. We create the dbt models to efficiently analyze dbt artifacts.

Requirements

  • docker
  • terraform
  • dbt-bigquery

Push a docker image to Google Container Registry

Cloud Run only accepts Docker images hosted on Google Cloud, for example in Google Container Registry or Artifact Registry.

export project="YOUR-PROJECT-ID"
docker build --rm -f Dockerfile -t "gcr.io/${project}/dbt-artifacts-loader:latest" .
docker push "gcr.io/${project}/dbt-artifacts-loader:latest"

Apply the terraform resources to launch the Cloud Run application.

The terraform/example directory enables us to create required GCP resources and launch the Cloud Run application. It creates:

  • a GCS bucket to store dbt artifacts JSON files,
  • a BigQuery dataset and tables to store dbt artifacts,
  • a Pub/Sub topic and a corresponding Pub/Sub subscription to notify Cloud Run of file uploads, and
  • a Cloud Run application to load uploaded dbt artifacts to BigQuery.
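
As a rough illustration of the notification path listed above, the sketch below decodes a Pub/Sub push request body that carries a GCS object notification and extracts the bucket and object name. It is not the repository's actual handler. The names `parse_gcs_notification`, `build_sample_envelope` and `GcsObjectRef`, as well as all sample values, are hypothetical, and the sketch assumes the notification uses the JSON_API_V1 payload format.

```python
import base64
import json
from typing import NamedTuple


class GcsObjectRef(NamedTuple):
    bucket: str
    name: str


def parse_gcs_notification(envelope: dict) -> GcsObjectRef:
    """Extract the uploaded object's location from a Pub/Sub push envelope.

    Hypothetical helper, not taken from the repository. Assumes the GCS
    notification uses the JSON_API_V1 payload format, in which the message
    data is the base64-encoded JSON object resource.
    """
    message = envelope["message"]
    payload = json.loads(base64.b64decode(message["data"]).decode("utf-8"))
    return GcsObjectRef(bucket=payload["bucket"], name=payload["name"])


def build_sample_envelope(bucket: str, name: str) -> dict:
    """Build a minimal push envelope for local experimentation (sample data)."""
    data = json.dumps({"bucket": bucket, "name": name}).encode("utf-8")
    return {
        "message": {
            "data": base64.b64encode(data).decode("ascii"),
            "attributes": {"eventType": "OBJECT_FINALIZE"},
        },
        "subscription": "projects/example-project/subscriptions/example-subscription",
    }


if __name__ == "__main__":
    envelope = build_sample_envelope("example-project-dbt-artifacts", "run_results.json")
    print(parse_gcs_notification(envelope))
```

A handler along these lines would then read the referenced object from GCS and load it into BigQuery; how the repository does this is not shown here.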

The terraform project requires two inputs: the GCP project ID in which to create the resources, and the location for GCS, BigQuery and Cloud Run. terraform apply prompts for both. To avoid entering them on every run, set their default values in variables.tf.

cd terraform/example
terraform init
terraform apply

Upload dbt artifacts JSON files to Google Cloud Storage.

We have set up the Cloud Run application to load dbt artifacts JSON files to BigQuery. Now, let's upload a run_results.json to GCS manually.

gsutil cp "target/run_results.json" "gs://${project}-dbt-artifacts/"

The repository does not include a function to upload dbt artifacts JSON files. I recommend automating the upload to GCS after each dbt execution.
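
One possible way to automate this is a small wrapper that runs dbt and then copies the artifacts with gsutil, as in the manual command above. This is a minimal sketch, not part of the repository: the function names `build_upload_commands` and `run_dbt_and_upload` are hypothetical, the bucket name follows the `${project}-dbt-artifacts` pattern used above, and only run_results.json is uploaded by default because it is the only artifact this walkthrough shows.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence


def build_upload_commands(
    target_dir: str,
    bucket: str,
    artifact_names: Sequence[str] = ("run_results.json",),
) -> List[List[str]]:
    """Return one `gsutil cp` command per artifact file found in target_dir.

    Hypothetical helper. Artifacts that dbt did not produce are skipped.
    """
    commands = []
    for name in artifact_names:
        path = Path(target_dir) / name
        if path.is_file():
            commands.append(["gsutil", "cp", str(path), f"gs://{bucket}/"])
    return commands


def run_dbt_and_upload(project: str, target_dir: str = "target") -> None:
    """Run dbt, upload the resulting artifacts, then propagate dbt's exit code.

    Hypothetical wrapper. The upload happens even when dbt fails, so that
    failed runs are recorded as well.
    """
    dbt = subprocess.run(["dbt", "run"], check=False)
    for command in build_upload_commands(target_dir, f"{project}-dbt-artifacts"):
        subprocess.run(command, check=True)
    if dbt.returncode != 0:
        raise SystemExit(dbt.returncode)
```

Calling `run_dbt_and_upload("YOUR-PROJECT-ID")` from the dbt project directory would run the models and then upload target/run_results.json. Uploading regardless of dbt's exit status is a design choice of this sketch, made so that failures also appear in the history.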

Create the dbt models to efficiently analyze dbt artifacts.

We are almost done. The last step is to create BigQuery tables with dbt, so that we can analyze the dbt artifacts efficiently.

To do that, we have to edit:

cd dbt_artifacts_models
# Set up the environment for dbt
make setup
# Create and test the dbt models
make run && make test

Links

dbt-artifacts-loader's Issues

Security Policy violation SECURITY.md

This issue was automatically created by Allstar.

Security Policy Violation
Security policy not enabled.
A SECURITY.md file can give users information about what constitutes a vulnerability and how to report one securely so that information about a bug is not publicly visible. Examples of secure reporting methods include using an issue tracker with private issue support, or encrypted email with a published key.

To fix this, add a SECURITY.md file that explains how to handle vulnerabilities found in your repository. Go to https://github.com/yu-iskw/dbt-artifacts-loader/security/policy to enable.

For more information, see https://docs.github.com/en/code-security/getting-started/adding-a-security-policy-to-your-repository.


This issue will auto resolve when the policy is in compliance.

Issue created by Allstar. See https://github.com/ossf/allstar/ for more information. For questions specific to the repository, please contact the owner or maintainer.

Security Policy violation Branch Protection

This issue was automatically created by Allstar.

Security Policy Violation
Dismiss stale reviews not configured for branch main
Block force push not configured for branch main


This issue will auto resolve when the policy is in compliance.

Issue created by Allstar. See https://github.com/ossf/allstar/ for more information. For questions specific to the repository, please contact the owner or maintainer.
