dbt-coverage

One-stop-shop for docs and test coverage of dbt projects.

Optimized for dbt 1.0; see the full support matrix under Supported dbt versions.

Why do I need something like this?

dbt-coverage is to dbt what coverage.py and interrogate are to Python.

It is a single CLI tool which checks your dbt project for missing documentation and tests.

Keeping documentation and tests close to the actual SQL code that generates the final model is one of the best design choices of dbt. It ensures documentation is actually useful and tests are actually used. But how do you make adding those a habit in your dbt project?

That is exactly where dbt-coverage comes in. It will

  • Give you a better sense of the level of documentation and test coverage in your project;
  • Help your CI/CD pipeline make sure new changes include documentation and tests;
  • Let you quickly assess the documentation and tests of a new dbt project you get your hands on.

Still not convinced? Here are some more features:

  • ✨ zero-config: just install it and run it, there is nothing to set up
  • 🐍 minimal dependencies: the only dependencies are click (already installed with dbt) and typer
  • 📦 very small: at ~480 SLOC, you can easily validate it works as advertised

Demo

The package was presented during Coalesce, the annual dbt conference, as a part of the talk From 100 spreadsheets to 100 data analysts: the story of dbt at Slido. Watch a demo in the video below.

Demo video

Installation

pip install dbt-coverage

Usage

dbt-coverage comes with two basic commands: compute and compare. The documentation for the individual commands can be shown by using the --help option.

Compute

Compute coverage from target/catalog.json and target/manifest.json files found in a dbt project, e.g. jaffle_shop.

To choose between documentation and test coverage, pass doc or test as the CLI argument.

$ cd jaffle_shop
$ dbt run  # Materialize models
$ dbt docs generate  # Generate catalog.json and manifest.json
$ dbt-coverage compute doc --cov-report coverage-doc.json  # Compute doc coverage, print it and write it to coverage-doc.json file

Coverage report
=====================================================================
jaffle_shop.customers                                  6/7      85.7%
jaffle_shop.orders                                     9/9     100.0%
jaffle_shop.raw_customers                              0/3       0.0%
jaffle_shop.raw_orders                                 0/4       0.0%
jaffle_shop.raw_payments                               0/4       0.0%
jaffle_shop.stg_customers                              0/3       0.0%
jaffle_shop.stg_orders                                 0/4       0.0%
jaffle_shop.stg_payments                               0/4       0.0%
=====================================================================
Total                                                 15/38     39.5%

$ dbt-coverage compute test --cov-report coverage-test.json  # Compute test coverage, print it and write it to coverage-test.json file

Coverage report
=====================================================================
jaffle_shop.customers                                  1/7      14.3%
jaffle_shop.orders                                     8/9      88.9%
jaffle_shop.raw_customers                              0/3       0.0%
jaffle_shop.raw_orders                                 0/4       0.0%
jaffle_shop.raw_payments                               0/4       0.0%
jaffle_shop.stg_customers                              1/3      33.3%
jaffle_shop.stg_orders                                 2/4      50.0%
jaffle_shop.stg_payments                               2/4      50.0%
=====================================================================
Total                                                 14/38     36.8%
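
The percentages in these reports follow directly from the hit counts. As a worked check, the total of the doc coverage report can be reproduced from its per-table figures. This is a minimal sketch, not dbt-coverage's own code; the dictionary and the helper name total_coverage are illustrative only.

```python
# Per-table (documented columns, total columns), copied from the doc coverage report.
doc_report = {
    "jaffle_shop.customers": (6, 7),
    "jaffle_shop.orders": (9, 9),
    "jaffle_shop.raw_customers": (0, 3),
    "jaffle_shop.raw_orders": (0, 4),
    "jaffle_shop.raw_payments": (0, 4),
    "jaffle_shop.stg_customers": (0, 3),
    "jaffle_shop.stg_orders": (0, 4),
    "jaffle_shop.stg_payments": (0, 4),
}


def total_coverage(report):
    """Sum hits and column counts over all tables (hypothetical helper)."""
    hits = sum(h for h, _ in report.values())
    total = sum(t for _, t in report.values())
    ratio = hits / total if total else 0.0
    return hits, total, ratio


hits, total, ratio = total_coverage(doc_report)
print(f"Total {hits}/{total} {ratio:.1%}")  # Total 15/38 39.5%
```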

Filtering model paths with --model-path-filter

You can also restrict the computation to a subset of tables using one or more --model-path-filter options.

$ cd jaffle_shop
$ dbt run  # Materialize models
$ dbt docs generate  # Generate catalog.json and manifest.json
$ dbt-coverage compute doc --cov-report coverage-doc.json --model-path-filter models/staging/  # Compute doc coverage for a subset of tables, print it and write it to coverage-doc.json file

Coverage report
======================================================
jaffle_shop.stg_customers              0/3       0.0%
jaffle_shop.stg_orders                 0/4       0.0%
jaffle_shop.stg_payments               0/4       0.0%
======================================================
Total                                  0/11      0.0%

$ dbt-coverage compute doc --cov-report coverage-doc.json --model-path-filter models/orders.sql --model-path-filter models/staging/  # Compute doc coverage for a subset of tables, print it and write it to coverage-doc.json file

Coverage report
======================================================
jaffle_shop.orders                     0/9       0.0%
jaffle_shop.stg_customers              0/3       0.0%
jaffle_shop.stg_orders                 0/4       0.0%
jaffle_shop.stg_payments               0/4       0.0%
======================================================
Total                                  0/20      0.0%
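
One way to picture the filtering is as a prefix match on each model's file path: a table is kept when its path starts with any of the given filters. This is an assumption made for illustration, not a statement about dbt-coverage's internals. The file paths and the function name filter_by_model_path are hypothetical, modelled on the jaffle_shop layout.

```python
def filter_by_model_path(model_paths, filters):
    """Keep tables whose model file path starts with any filter (assumed prefix match)."""
    if not filters:
        return dict(model_paths)  # No filter given: keep everything.
    return {
        table: path
        for table, path in model_paths.items()
        if any(path.startswith(f) for f in filters)
    }


# Hypothetical mapping of table name -> model file path within the project.
model_paths = {
    "jaffle_shop.customers": "models/customers.sql",
    "jaffle_shop.orders": "models/orders.sql",
    "jaffle_shop.stg_customers": "models/staging/stg_customers.sql",
    "jaffle_shop.stg_orders": "models/staging/stg_orders.sql",
    "jaffle_shop.stg_payments": "models/staging/stg_payments.sql",
}

staging_only = filter_by_model_path(model_paths, ["models/staging/"])
selected = filter_by_model_path(model_paths, ["models/orders.sql", "models/staging/"])
print(sorted(staging_only))
print(sorted(selected))
```

Under this assumption, a filter ending in a slash selects a whole directory, while a full file path selects a single model.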

Markdown output with --cov-format

You can also choose to print the output in the Markdown table format by specifying the --cov-format option. This can be especially useful when using dbt-coverage in CI/CD pipelines.

$ cd jaffle_shop
$ dbt run  # Materialize models
$ dbt docs generate  # Generate catalog.json and manifest.json
$ dbt-coverage compute doc --model-path-filter models/staging/ --cov-format markdown

# Coverage report
| Model | Columns Covered | % |
|:------|----------------:|:-:|
| jaffle_shop.stg_customers                         |     0/3     |   0.0% |
| jaffle_shop.stg_orders                            |     0/4     |   0.0% |
| jaffle_shop.stg_payments                          |     0/4     |   0.0% |
| Total                                             |     0/11    |   0.0% |

Custom run artifacts path with --run-artifacts-dir

By default, dbt-coverage reads the artefacts produced by the dbt run from the ./target/ folder in the current directory. You can specify a custom path via the --run-artifacts-dir option.

$ dbt-coverage compute doc --run-artifacts-dir jaffle_shop/target --cov-report coverage-doc.json  # Compute doc coverage from the artefacts located in jaffle_shop/target, print it and write it to coverage-doc.json file

Coverage report
================================================
jaffle_shop.customers             0/7       0.0%
jaffle_shop.orders                0/9       0.0%
jaffle_shop.raw_customers         0/3       0.0%
jaffle_shop.raw_orders            0/4       0.0%
jaffle_shop.raw_payments          0/4       0.0%
jaffle_shop.stg_customers         0/3       0.0%
jaffle_shop.stg_orders            0/4       0.0%
jaffle_shop.stg_payments          0/4       0.0%
================================================
Total                             0/38      0.0%

Compare

Compare two coverage.json files generated by the compute command. This is useful to ensure that the coverage does not drop while making changes to the project.

$ dbt-coverage compare coverage-after.json coverage-before.json

# Coverage delta summary
              before     after            +/-
=============================================
Coverage      39.47%    38.46%         -1.01%
=============================================
Tables             8         8          +0/+0
Columns           38        39          +1/+0
=============================================
Hits              15        15          +0/+0
Misses            23        24          +1/+0
=============================================

# New misses
=========================================================================
Catalog                         15/38   (39.47%)  ->    15/39   (38.46%)
=========================================================================
- jaffle_shop.customers          6/7    (85.71%)  ->     6/8    (75.00%)
-- new_col                       -/-       (-)    ->     0/1     (0.00%)
=========================================================================
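
The figures in the delta summary can be checked by hand: adding one undocumented column (new_col) leaves the hits at 15 but raises the column count from 38 to 39. The sketch below redoes that arithmetic. The function name coverage_delta is hypothetical and the code is not taken from dbt-coverage.

```python
def coverage_delta(before, after):
    """Return coverage before, coverage after, and their difference.

    Both arguments are (hits, total) pairs.
    """
    cov_before = before[0] / before[1]
    cov_after = after[0] / after[1]
    return cov_before, cov_after, cov_after - cov_before


# Figures from the report: 15/38 columns documented before, 15/39 after.
cov_before, cov_after, delta = coverage_delta(before=(15, 38), after=(15, 39))
print(f"{cov_before:.2%} -> {cov_after:.2%} ({delta:+.2%})")  # 39.47% -> 38.46% (-1.01%)

# Misses are the columns without a hit.
misses_before, misses_after = 38 - 15, 39 - 15
print(misses_before, misses_after)  # 23 24
```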

Combined use-case

$ cd my-dbt-project

$ dbt run  # Materialize models
$ dbt docs generate  # Generate catalog.json and manifest.json
$ dbt-coverage compute doc --cov-report before.json --cov-fail-under 0.5  # Fail if coverage is lower than 50%

# Make changes to the dbt project, e.g. add some columns to the DWH, document some columns, etc.

$ dbt run  # Materialize the changed models
$ dbt docs generate  # Generate catalog.json and manifest.json
$ dbt-coverage compute doc --cov-report after.json --cov-fail-compare before.json  # Fail if the current coverage is lower than coverage in before.json
$ dbt-coverage compare after.json before.json  # Generate a detailed coverage delta report
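
The two failure conditions used in this workflow can be pictured as simple comparisons. The sketch below illustrates that logic only, following the wording of the comments ("lower than"). The function names are hypothetical, and it does not reproduce how dbt-coverage implements --cov-fail-under or --cov-fail-compare. The numbers are reused from the Compare example.

```python
def fails_under(coverage, threshold):
    """True when coverage is lower than the minimum threshold (0.5 means 50%)."""
    return coverage < threshold


def fails_compare(current, previous):
    """True when the current coverage is lower than the previously recorded one."""
    return current < previous


before = 15 / 38  # Coverage recorded in the earlier report
after = 15 / 39   # Coverage after adding one undocumented column

print(fails_under(before, 0.5))      # True: 39.47% is lower than 50%
print(fails_compare(after, before))  # True: 38.46% is lower than 39.47%
print(fails_compare(before, before)) # False: unchanged coverage does not fail
```

In a CI job, either condition being true would correspond to the compute command failing, which stops the pipeline before the change is merged.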

Supported dbt versions

Different versions of dbt-coverage support different versions of dbt. Here is the support matrix.

dbt            dbt-coverage
<0.20          not tested
0.20 - 0.21    0.1
1.0 - 1.7      0.2, 0.3
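
If you need to pick a release programmatically, for example in a setup script, the support matrix can be encoded as a small lookup. This is a sketch under the assumption that only the major and minor parts of the dbt version matter. The names SUPPORT_MATRIX and compatible_releases are hypothetical and not part of dbt-coverage.

```python
# Support matrix: (lowest dbt version, highest dbt version, dbt-coverage releases).
SUPPORT_MATRIX = [
    ((0, 20), (0, 21), ["0.1"]),
    ((1, 0), (1, 7), ["0.2", "0.3"]),
]


def compatible_releases(dbt_version):
    """Return the dbt-coverage releases listed for a dbt version such as '1.3.1'."""
    major_minor = tuple(int(part) for part in dbt_version.split(".")[:2])
    for low, high, releases in SUPPORT_MATRIX:
        if low <= major_minor <= high:
            return releases
    return []  # Not listed in the matrix (dbt <0.20 is marked "not tested")


print(compatible_releases("1.3.1"))   # ['0.2', '0.3']
print(compatible_releases("0.21.0"))  # ['0.1']
print(compatible_releases("0.19.2"))  # []
```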

Related packages

Contributing

Clone this repo including submodules, create a virtual environment and install dependencies:

git clone --recurse-submodules git@github.com:slidoapp/dbt-coverage.git
cd dbt-coverage
pip install poetry
poetry shell
poetry install
pre-commit install

To run all integration tests locally, run:

tox

License

Licensed under the MIT license (see LICENSE.md file for more details).

Contributors

sweco, followingell, mrshu, pgoslatara, vvvito, cquad, fszta, gjmcclintock, bbrewington
