Comments (4)

nathangriffiths-cdx commented on June 2, 2024

@noel

Environment:
dbt-bigquery 1.5.6
Python 3.10.10
Gitbash on Windows 11

Git repo structure:
project_root_folder/dbt_project_folder

dbt_project_folder contains an existing dbt project with a large number of non-compliant files.

Steps
In project_root_folder:
Create the file .dbt-checkpoint.yaml
Update the file .pre-commit-config.yaml

.dbt-checkpoint.yaml

version: 1
dbt-project-dir: dbt_project_folder

.pre-commit-config.yaml

-   repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v1.1.1
    hooks:
    - id: dbt-compile
    - id: dbt-docs-generate
    - id: check-column-desc-are-same
    - id: check-model-columns-have-desc
    - id: check-model-has-all-columns
    - id: check-model-has-description
    - id: check-model-has-properties-file
    - id: check-script-semicolon
    - id: check-script-has-no-table-name
    - id: check-script-ref-and-source
    - id: check-source-columns-have-desc
    - id: check-source-has-all-columns
    - id: check-source-table-has-description
    - id: check-macro-has-description
    - id: check-macro-arguments-have-desc

$ git status
On branch dbt-checkpoint-implementation
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        new file:   .dbt-checkpoint.yaml
        modified:   .pre-commit-config.yaml

Now commit the changes:
git commit -m "some commit message"

Output:

check yaml....................................................................Passed
check json................................................(no files to check)Skipped
fix end of files..............................................................Passed
trim trailing whitespace......................................................Passed
isort.....................................................(no files to check)Skipped
black.....................................................(no files to check)Skipped
flake8....................................................(no files to check)Skipped
mypy......................................................(no files to check)Skipped
sqlfluff-lint.............................................(no files to check)Skipped
dbt compile...............................................(no files to check)Skipped
dbt docs generate.............................................................Passed
Check column descriptions are same............................................Passed
Check the model columns have description......................................Passed
Check the model has all columns in properties file............................Failed
- hook id: check-model-has-all-columns
- exit code: 1

<< This hook takes about 50 minutes to run and generates errors for dozens of models in the project, similar to those below >>

Columns in .., but not in Database (....sql):
- name: middle_name
Unable to find model `model....` in catalog file. Make sure you run `dbt docs generate` before executing this hook.
Unable to find model `model....` in catalog file. Make sure you run `dbt docs generate` before executing this hook.
... 
Columns in Database (....sql), but not in ....yml:
- name: some_col
- name: some_other_col
...
Columns in ....yml, but not in Database (...sql):
- name: my_col_1
- name: my_col_2

Check the model has description...............................................Passed
Check the model has properties file...........................................Passed
Check the script does not contain a semicolon.............(no files to check)Skipped
Check the script has not table name.......................(no files to check)Skipped
Check the script has existing refs and sources............(no files to check)Skipped
Check for source column descriptions..........................................Passed
Check the source has all columns in the properties file.......................Passed
Check the source table has description........................................Passed
Check the macro has description...............................................Passed

I may be doing something wrong here, but I can't see what it might be. It definitely looks like these hooks are running for all models in the dbt project, not just for the two files I am actually committing.

from dbt-checkpoint.

nathangriffiths-cdx commented on June 2, 2024

A workaround for this, e.g. when adding dbt-checkpoint to a new repo, is:

  • stage (git add) only the changed YAML files
  • commit the files while bypassing the pre-commit checks, e.g. git commit -m "my message" --no-verify

After this, dbt-checkpoint appears to include only staged files in its checks, as intended.
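The two workaround steps can be tried safely in a throwaway repository first. The following is only a minimal sketch: the temporary directory, the placeholder file contents, the commit identity, and the always-failing stand-in hook are all assumptions made for illustration, not part of the real project or of dbt-checkpoint itself.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repo (assumption) so the sketch never touches a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q .

# Hypothetical stand-in for the installed pre-commit hook: it always fails,
# mimicking the failing check-model-has-all-columns run.
mkdir -p .git/hooks
printf '#!/bin/sh\necho "hook ran"\nexit 1\n' > .git/hooks/pre-commit
chmod +x .git/hooks/pre-commit

# Placeholder versions of the two config files (real contents abbreviated).
printf 'version: 1\ndbt-project-dir: dbt_project_folder\n' > .dbt-checkpoint.yaml
printf 'repos: []\n' > .pre-commit-config.yaml

# Step 1: stage only the changed YAML files.
git add .dbt-checkpoint.yaml .pre-commit-config.yaml

# Step 2: commit while bypassing the pre-commit checks.
# The -c identity values are placeholders so the commit works anywhere.
git -c user.name="Example" -c user.email="example@example.com" \
    commit -q --no-verify -m "my message"

commit_count="$(git rev-list --count HEAD)"
echo "commits: $commit_count"
```

Note that --no-verify skips every pre-commit hook for that commit, not only the dbt-checkpoint ones. If that is too broad, pre-commit's SKIP environment variable (a comma-separated list of hook ids) may be a narrower option, though it was not tried in this thread.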


nathangriffiths-cdx commented on June 2, 2024

I think I just ran into this issue myself, ironically while adding dbt-checkpoint to the project. This required adding or editing these files in the root of the project:

.dbt-checkpoint.yaml
.pre-commit-config.yaml

This then triggered dbt-checkpoint to run for the entire project on commit, finding a large number of non-compliant files and, in theory, preventing me from committing the changes that implement dbt-checkpoint in the first place until all the issues are resolved. Not sure if this is intentional or not?


noel commented on June 2, 2024

