For future developers or data scientists
- It is easier to work in Gitpod.
- Install the dependencies listed in Pipfile.lock, which the project needs in order to work, by running:
$ pipenv install
How to use this project
- Clone the repository onto your computer (or Gitpod).
- Add your transformations to the ./transformations/<pipeline>/ folder.
- Configure project.yml to specify the pipelines and transformations in the order you want to execute them.
- Add new transformation files as you need them; make sure to include expected_input and expected_output as examples.
- Update your project.yml file as needed to change the order of the transformations.
- Validate your transformations by running:
$ pipenv run validate
- Run your pipeline by running:
$ pipenv run pipeline <pipeline_slug> <dataset_name>
- If you need to clear your outputs:
$ pipenv run clear
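
To make the steps above more concrete, here is a minimal sketch of what a transformation file with expected_input and expected_output might look like, and how a validation step could compare the two. Only the names expected_input and expected_output come from this README. The file name clean_names.py, the functions run, validate, and run_pipeline, and the list-of-dictionaries data shape are all assumptions made for illustration; the project's real conventions may differ.

```python
# Hypothetical transformation file, e.g. ./transformations/<pipeline>/clean_names.py
# ASSUMPTION: `run`, `validate`, `run_pipeline`, and the row format are
# illustrative only and are not confirmed by the project.

# Example input: a small sample of the rows this transformation receives.
expected_input = [
    {"name": "  Ada ", "age": "36"},
    {"name": "Grace", "age": "45"},
]

# Example output: what the transformation should return for the sample above.
expected_output = [
    {"name": "Ada", "age": 36},
    {"name": "Grace", "age": 45},
]


def run(rows):
    """Strip whitespace from names and convert ages to integers."""
    return [{"name": row["name"].strip(), "age": int(row["age"])} for row in rows]


def validate(transformation, sample_input, sample_output):
    """Return True when the transformation maps the sample input to the sample output."""
    return transformation(sample_input) == sample_output


def run_pipeline(transformations, rows):
    """Apply each transformation in the given order, feeding each result to the next."""
    for transformation in transformations:
        rows = transformation(rows)
    return rows


if __name__ == "__main__":
    print(validate(run, expected_input, expected_output))
```

Keeping a small input/output pair next to each transformation means a validation step can check the transformation against a known sample before the full pipeline runs on real data.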