This repo is an example of a simple CD model that can be used for many use cases in the data space.
Analytics engineering shares traits with software engineering, but some common software development patterns add too much rigidity and complexity to a developer's flow when working in the data domain.
This repo shows a design that works well for data while keeping some of the safeguards we get in software engineering, such as Git tags.
Most software applications have a develop branch and release branches.
Although this is sometimes useful for much larger projects and teams, in the data world I have found that a simple feature branch from main is more than enough for most use cases.
- Versioning is simplified to an incremental build number, which is then converted to a tag, e.g. v1, v2, v3.
- There is a latest tag that always points to the latest version. (This is critical, as it allows the dbt environment to always run the latest code without modification.)
We use tags because they act as snapshots of the code that cannot be changed accidentally. A normal branch can still be manipulated, and we want to protect our production jobs as much as we can.
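The tagging scheme described above could be sketched roughly as follows. This is a minimal illustration, not the repo's actual pipeline: the function names `next_version_tag` and `release` are hypothetical, the remote is assumed to be called `origin`, and version tags are assumed to have the form `v` followed by a number with no leading zeros.

```shell
# Read tag names on stdin (one per line) and print the next vN tag.
# Anything that is not v<number>, such as `latest`, is ignored.
next_version_tag() {
  nvt_max=0
  while IFS= read -r nvt_tag; do
    case "$nvt_tag" in
      v[1-9]*)
        nvt_n=${nvt_tag#v}
        case "$nvt_n" in
          *[!0-9]*) continue ;;   # skip tags like v1-rc
        esac
        if [ "$nvt_n" -gt "$nvt_max" ]; then
          nvt_max=$nvt_n
        fi
        ;;
    esac
  done
  echo "v$((nvt_max + 1))"
}

# Hypothetical release step: tag the current commit with the next build
# number, then move `latest` to the same commit.
# Assumes the remote is named `origin`.
release() {
  new_tag=$(git tag --list 'v*' | next_version_tag)
  git tag "$new_tag"
  git tag --force latest
  git push origin "$new_tag"
  # `latest` is moved on purpose, so it has to be force-pushed.
  git push --force origin refs/tags/latest
}
```

Only `latest` is ever force-moved in this sketch. The numbered tags are created once and left alone, which is what makes them usable as snapshots.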
python -m venv .venv
pip install -r requirements.txt
dbt init
Go to your dbt Cloud project's pre-production environment.
Go to your dbt Cloud project's production environment.