Comments (4)
Environment:
dbt-bigquery 1.5.6
Python 3.10.10
Gitbash on Windows 11
Git repo structure:
project_root_folder/dbt_project_folder
dbt_project_folder contains an existing dbt project with a large number of non-compliant files.
Steps
In project_root_folder:
Create the file .dbt-checkpoint.yaml
Update the file .pre-commit-config.yaml
.dbt-checkpoint.yaml
version: 1
dbt-project-dir: dbt_project_folder
.pre-commit-config.yaml
- repo: https://github.com/dbt-checkpoint/dbt-checkpoint
  rev: v1.1.1
  hooks:
    - id: dbt-compile
    - id: dbt-docs-generate
    - id: check-column-desc-are-same
    - id: check-model-columns-have-desc
    - id: check-model-has-all-columns
    - id: check-model-has-description
    - id: check-model-has-properties-file
    - id: check-script-semicolon
    - id: check-script-has-no-table-name
    - id: check-script-ref-and-source
    - id: check-source-columns-have-desc
    - id: check-source-has-all-columns
    - id: check-source-table-has-description
    - id: check-macro-has-description
    - id: check-macro-arguments-have-desc
$ git status
On branch dbt-checkpoint-implementation
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
new file: .dbt-checkpoint.yaml
modified: .pre-commit-config.yaml
Now commit the changes:
git commit -m "some commit message"
Output:
check yaml....................................................................Passed
check json................................................(no files to check)Skipped
fix end of files..............................................................Passed
trim trailing whitespace......................................................Passed
isort.....................................................(no files to check)Skipped
black.....................................................(no files to check)Skipped
flake8....................................................(no files to check)Skipped
mypy......................................................(no files to check)Skipped
sqlfluff-lint.............................................(no files to check)Skipped
dbt compile...............................................(no files to check)Skipped
dbt docs generate.............................................................Passed
Check column descriptions are same............................................Passed
Check the model columns have description......................................Passed
Check the model has all columns in properties file............................Failed
- hook id: check-model-has-all-columns
- exit code: 1
<< This hook takes about 50 minutes to run and generates errors for dozens of models in the project similar to below >>
Columns in .., but not in Database (....sql):
- name: middle_name
Unable to find model `model....` in catalog file. Make sure you run `dbt docs generate` before executing this hook.
Unable to find model `model....` in catalog file. Make sure you run `dbt docs generate` before executing this hook.
...
Columns in Database (....sql), but not in ....yml:
- name: some_col
- name: some_other_col
...
Columns in ....yml, but not in Database (...sql):
- name: my_col_1
- name: my_col_2
Check the model has description...............................................Passed
Check the model has properties file...........................................Passed
Check the script does not contain a semicolon.............(no files to check)Skipped
Check the script has not table name.......................(no files to check)Skipped
Check the script has existing refs and sources............(no files to check)Skipped
Check for source column descriptions..........................................Passed
Check the source has all columns in the properties file.......................Passed
Check the source table has description........................................Passed
Check the macro has description...............................................Passed
I may be doing something wrong here but I can't see what it might be. It definitely looks like these hooks are running for all models in the dbt project, not just the two files I am actually committing.
from dbt-checkpoint.
The workaround for this, e.g. when adding dbt-checkpoint to a new repo, is:
- stage (git add) only the changed yaml files
- commit the files bypassing pre-commit checks, e.g. git commit -m "my message" --no-verify
After this it appears dbt-checkpoint will only include staged files in checks as intended.
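For anyone who wants to try the workaround end to end, here is a minimal sketch of the two steps above. It assumes a throwaway repository created with mktemp so it can run anywhere; the placeholder identity, the `repos: []` stand-in config, and the scratch directory are illustrative and not from the original report. In a real project only the `git add` and `git commit ... --no-verify` lines would apply.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository so the sketch is self-contained (assumption: this
# setup is for illustration only and is not part of the original report).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "dev@example.com"   # placeholder identity
git config user.name "Example Dev"        # placeholder identity

# Stand-ins for the two configuration files mentioned in the thread.
printf 'version: 1\ndbt-project-dir: dbt_project_folder\n' > .dbt-checkpoint.yaml
printf 'repos: []\n' > .pre-commit-config.yaml

# Step 1: stage only the changed yaml files.
git add .dbt-checkpoint.yaml .pre-commit-config.yaml

# Step 2: commit while bypassing the pre-commit checks.
git commit -q -m "my message" --no-verify

# Show the resulting commit and the files it contains.
git log --oneline --name-only
```

Note that --no-verify skips every pre-commit hook for that one commit, not just the dbt-checkpoint ones, so it is probably best kept to the initial configuration commit.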
I think I just ran into this issue myself, ironically while adding dbt-checkpoint to the project. This required adding or editing these files in the root of the project:
.dbt-checkpoint.yaml
.pre-commit-config.yaml
This then triggered dbt-checkpoint to run for the entire project on commit, finding a large number of non-compliant files and, in theory, preventing me from committing the changes that implement dbt-checkpoint in the first place until all the issues are resolved. Not sure if this is intentional or not?