This repo contains a set of templates and CLI commands that you can use to manage GoodData.CN workspaces in a declarative way. Here is when you might want to use it:
- Version Control System (e.g. Git) integration for versioning, collaboration, CI/CD etc.
- Moving metadata between different environments.
- `definitions` folder contains example definitions that you can use as a starting point. It's based on the demo project that can be automatically added to a GoodData Cloud instance.
- `scripts` folder contains a set of `.http` and `.js` files to handle metadata sync between `definitions` and the GoodData server.
  - `import.http` contains a set of HTTP request templates for importing declarative definitions from the GoodData server.
  - `create.http` focuses on pushing the `definitions` to a fresh GoodData server instance.
  - `update.http` focuses on pushing a new version of the `definitions` to an existing GoodData server instance.
  - `toJson.js` is a helper script that converts YAML files to JSON before they are uploaded to the GoodData server. Typically, you would not need to update this file.
  - `toYaml.js` is a helper script that converts JSON definitions to YAML before saving them to the `definitions` folder. Typically, you would not need to update this file.
- `.github/workflows/cd.yaml` contains an example CD pipeline that updates the production server every time there is a new commit to the master branch of this repository.
The setup is prepared in a way that lets you edit and test individual requests in the `.http` files in your favorite IDE. For usage in a CI/CD pipeline, we also have a set of scripts prepared in the `package.json` file.
Install NodeJS on your machine. In a terminal, navigate to the root folder of this repository and run `npm i`. This will install all the necessary dependencies for the CLI scripts we are going to use.
You should be able to edit `.http` files in any IDE that supports them natively or has an extension for them. We've tested the setup in IntelliJ and VSCode.
You don't need to do any special steps for the IntelliJ family of IDEs (IDEA, WebStorm, PyCharm, etc.), as these IDEs support running `.http` files out of the box.
For VSCode, we recommend installing either the httpYac or httpBook extension.

httpYac has a simple interface, similar to the popular REST Client extension. Unlike REST Client, it has good compatibility with the IntelliJ-specific syntax, so you can cooperate better in large teams where different people have different IDE preferences. We are using the httpYac CLI tools to run the CI/CD scripts as well.

httpBook is based on httpYac and adds an option to view `.http` files as Jupyter notebooks. We recommend this extension if you're planning to run HTTP commands mostly manually and would like to add a rich description to each command.
We are using the IntelliJ way of organizing environment variables:
- `http-client.env.json` contains public variables that should be saved in the Git repository. For example, this file stores the hostname of your GoodData instance.
- `http-client.private.env.json` contains private variables, a.k.a. secrets. The file is added to `.gitignore` to prevent committing it by accident.
In the example files, we have pre-defined two environments for you:
- `production` environment will store the hostname, database credentials, etc. of your production server.
- `development` environment is for dev purposes: to test new ideas and develop your analytical solution. It can be either another instance of GoodData server deployed in the cloud or your local Docker instance.

You can add as many extra environments as you need (e.g. for staging or QA servers); just use the existing ones as an example.
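For illustration, a minimal `http-client.env.json` with the two environments might look like the sketch below. `base_url` is the variable referenced as `{{base_url}}` in the `.http` files; your file may contain additional public variables, and the production hostname shown is a placeholder:

```json
{
  "production": {
    "base_url": "https://your-instance.cloud.gooddata.com"
  },
  "development": {
    "base_url": "http://localhost:3000"
  }
}
```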
Let's fill in all the needed variables according to your setup:
- In `http-client.env.json`, fill in the base URLs for your production and development servers. `http://localhost:3000` works for the dev server if you're using a local Docker instance for development.
- Copy `http-client.private.env.json.template` to a new `http-client.private.env.json` file. `token` is your API token for the server. The `demo_ds_*` variables are for the database connection; see the `dataSource` definition for details. The default example showcases a Snowflake connection, so you might need to change the definition and variables according to our docs.
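Correspondingly, a minimal `http-client.private.env.json` might look like this sketch. The `demo_ds_*` database variables belong here as well; their exact names depend on your data source definition, so only `token` is shown:

```json
{
  "production": {
    "token": "<production API token>"
  },
  "development": {
    "token": "<development API token>"
  }
}
```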
You should be all set to run the `.http` commands from your IDE. Let's test it by executing some commands from the `import.http` file. Don't worry, those are read-only commands and will not update any metadata on your server or in the `definitions` folder.

Next, you can read how to configure a CI/CD pipeline or dive deeper into different use cases for declarative definitions.
The `package.json` file contains several pre-defined scripts for your convenience. Each script is available for both the `prod` and `dev` environments:
- `import-prod` and `import-dev` scripts will execute `import.http` and save the downloaded metadata in the `definitions` folder. Any files with conflicting names will be overwritten.
- `create-prod` and `create-dev` scripts will execute `create.http` and push all the `definitions` to the corresponding prod or dev server. These scripts are meant to be used when deploying on a completely new, blank instance.
- `update-prod` and `update-dev` scripts will execute `update.http` and push all the `definitions` to the corresponding prod or dev server. These scripts are meant for updating metadata to a newer version, i.e. they expect that assets like the Workspace and DataSource are already created.
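As an illustration, such scripts can be wired up with the httpYac CLI roughly as follows (the exact commands in this repo's `package.json` may differ; the flags shown follow the httpYac CLI documentation):

```json
{
  "scripts": {
    "import-prod": "httpyac send import.http --all --env production",
    "import-dev": "httpyac send import.http --all --env development",
    "update-prod": "httpyac send update.http --all --env production",
    "update-dev": "httpyac send update.http --all --env development"
  }
}
```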
Ideally, we would want to have a single `.http` file for both the `create` and `update` operations (i.e. an `upsert`), but at the moment there is a limitation in how our server's API works.
NOTE: the `update-*` and `create-*` scripts generate a temporary `json` folder where they store files converted from the YAML definitions. The JSONs in that folder also have the environment variables already populated, which means that your database credentials will be saved there in plain text as well. Make sure to never store the `json` folder in your VCS and never share it openly.
You can find an example of the CI/CD pipeline in the `.github/workflows/cd.yaml` file. The configuration is rather simple: every time there is a new commit to the master branch, GitHub Actions will execute `npm run update-prod` to push the new changes to the production server.
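A sketch of such a workflow is shown below. The actual `cd.yaml` in this repo may differ; in particular, the secret name `HTTP_CLIENT_PRIVATE_ENV` is an assumption for illustration:

```yaml
name: CD
on:
  push:
    branches: [master]
jobs:
  update-prod:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 18
      - run: npm ci
      # Recreate the git-ignored private env file from a repository secret
      # so update.http can authenticate against the production server.
      - run: echo "$PRIVATE_ENV" > http-client.private.env.json
        env:
          PRIVATE_ENV: ${{ secrets.HTTP_CLIENT_PRIVATE_ENV }}
      - run: npm run update-prod
```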
NOTE: Whenever you add a completely new entity to the organization (like a new data source or a new workspace), you'll need to create that asset on the production server manually, because we are using the Entities API instead of the Layouts API for that. The Entities API does not support upsert, so you have to explicitly either create or update the entity. We are working on mitigating this limitation in future releases of GoodData.CN.
Using our configuration file as an example, you can set up any other pipeline (CircleCI, Bitbucket Pipelines, Jenkins, etc.). A few steps to keep in mind:
- Configure your pipeline to be executed on every commit to the main branch.
- Make sure that the environment you're running in supports NodeJS (e.g. by specifying the correct Docker image for your pipeline).
- Check out the repo on the master branch and navigate to the project root folder.
- Install NPM dependencies by executing `npm ci`.
- Ensure the `http-client.private.env.json` file is created and populated in the root folder of the project. Make sure to use best practices for storing credentials in your pipeline. For example, in GitHub we are using Secrets to store the data source credentials.
- Execute `npm run update-prod` to push the new changes to the production server.
- Make sure the environment is prepared according to the instructions above.
- Remove the `definitions` folder completely.
- Edit all `.http` files and replace the workspace id (`demo_ws`) and data source id (`demo_ds`) with the actual workspace and data source ids that you want to track.
- Run the `npm run import-dev` command if you want to import definitions from your development server, or `npm run import-prod` if you want to import from production.

The script will create the `definitions` folder and populate it with the corresponding YAML files. Next, you can commit the new definitions to your VCS.
The `.http` files are created in a way that requires you to explicitly define which workspaces you want to manage with the Git workflow. If you want to track more workspaces at once, there are a few options:
- Duplicate the parts of the `.http` files that are responsible for workspace management and update the workspace ID in the copied snippet. New workspaces will be added to a separate folder under `definitions` the next time you run the `import` script.
- Adjust the `.http` file to load all workspaces at once using the `/api/entities/workspaces` feed (see our API Reference). You will also want to edit the `toJson.js` and `toYaml.js` scripts to split the resulting JSON into separate files; otherwise it might not scale, depending on how big your workspaces are.
By default, we only include feeds for user group management in the `.http` files. That's because we expect you to have a different set of users on your dev, QA and production environments anyway. On top of that, storing users in VCS is not the best idea, as this is data that changes rather often in most cases.
However, if you only manage a handful of predefined users and have the same SSO provider on all your environments, you can edit the `.http` files to sync users. For example, in `import.http` you can add:
```http
### Import users
# @name users
GET {{base_url}}/api/layout/users
Authorization: Bearer {{token}}

> {% client.assert(response.status >= 200 && response.status < 300, `Request failed with status code ${response.status}`) %}
```
Similar code snippets would need to be added to the `create.http` and `update.http` files.
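For instance, a matching push snippet could look like the sketch below. It assumes the declarative users endpoint also accepts `PUT` with the converted JSON body from the temporary `json` folder; verify the method and path against the API reference:

```http
### Set users
# @name setUsers
PUT {{base_url}}/api/layout/users
Content-Type: application/json
Authorization: Bearer {{token}}

< ./json/users.json
```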
A new `./definitions/users.yml` file will be created with a list of all users under your organization.
This setup will work well for small to medium projects, but it could become unmanageable for large projects with big analytical models (i.e. a large number of metrics, insights and dashboards). There are a few options for overcoming this:
- Use the more granular Entities API to load the analytical model. See our REST API reference.
- Make the `toJson.js` and `toYaml.js` scripts smarter and aware of the type of content they are parsing. E.g. you can define logic that splits the analytical model and puts every dashboard, insight and metric into an individual YAML file.
- For implementing complex workflows, consider using our Python SDK.
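As a sketch of the second option, a hypothetical helper inside `toYaml.js` could split the declarative analytics model into one entry per entity before serialization. The collection names below are assumptions based on GoodData's declarative analytics layout; verify them against the JSON your `import.http` actually returns:

```javascript
// Hypothetical helper: split a declarative analytics model into one
// entry per entity, keyed by the file path it should be written to.
// The real toYaml.js would then serialize each value to YAML.
function splitAnalyticsModel(model) {
  const files = {};
  // Collection names assumed from the declarative analytics layout.
  const collections = ['metrics', 'visualizationObjects', 'analyticalDashboards'];
  for (const kind of collections) {
    for (const entity of (model.analytics && model.analytics[kind]) || []) {
      files[`${kind}/${entity.id}.json`] = entity;
    }
  }
  return files;
}

// Two metrics and one dashboard become three separate files.
const files = splitAnalyticsModel({
  analytics: {
    metrics: [{ id: 'revenue' }, { id: 'order_count' }],
    analyticalDashboards: [{ id: 'overview' }],
  },
});
console.log(Object.keys(files).join(', '));
```

Each resulting file then diffs cleanly in Git, so a change to one metric no longer touches the whole model file.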