Your one-stop shop for managing Biomage infrastructure. This is a Python CLI application you can use to manage common tasks related to Biomage infrastructure.
After cloning the repository, do the following:
make install
If you are going to develop biomage-utils, also install the development dependencies with:
make develop
You can verify that the command has been successfully installed with:
make test
If the test was successful, you should be able to access biomage-utils by typing:
biomage --help
As a prerequisite for running all scripts in this repo, you will need a GitHub Personal Access Token with full access to your account. This token should be given ALL scopes available. You can generate one in your GitHub account settings. Make sure you note it down and supply it when required. Utilities can accept this token in two ways:
- either by having it available as the environment variable GITHUB_API_TOKEN
- or by passing it as an option with the -t flag.
For example:
GITHUB_API_TOKEN=mytoken biomage stage
or
biomage stage -t mytoken
Using the environment variable means you can put the token in your .bashrc or .zshrc file, thereby avoiding typing it again and again. You can then simply do:
biomage stage
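The two ways of supplying the token can be pictured with a small Python sketch. This is not the actual biomage-utils implementation: the function name resolve_token is hypothetical, and the idea that the -t option wins over GITHUB_API_TOKEN when both are present is an assumption made for illustration, not something stated here.

```python
import os
from typing import Mapping, Optional


def resolve_token(cli_token: Optional[str] = None,
                  env: Mapping[str, str] = os.environ) -> str:
    """Return the GitHub token from the -t option or GITHUB_API_TOKEN.

    Hypothetical helper. Giving the option precedence over the
    environment variable is an assumption of this sketch.
    """
    token = cli_token or env.get("GITHUB_API_TOKEN")
    if not token:
        raise ValueError(
            "No GitHub token: set GITHUB_API_TOKEN or pass -t <token>"
        )
    return token
```

With this shape, a token exported in .zshrc is picked up automatically, and an explicit -t value can still override it for a single run.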
- BIOMAGE_NICK is optional and used to override the USER environment variable as the first part of the name of the staging environments created by you: ${BIOMAGE_NICK:-${USER}}-...
- BIOMAGE_EMAIL is used by the experiment pull/copy command to automatically add permissions for the downloaded experiment. You should specify the email used to log in to the platform in staging. Note that your current AWS account will need permissions for Cognito.
- BIOMAGE_DATA_PATH: where to get the experiment data to populate inframock's S3 and DynamoDB. It is recommended to place it outside any other repositories to avoid interactions with git. For example, export BIOMAGE_DATA_PATH=$HOME/biomage-data (or next to where your biomage repos live). If this is not set, it will default to ./data. Note: this should be permanently added to your environment (e.g. in .zshrc, .localrc or similar) because other services like inframock or worker rely on using the same path.
- COGNITO_PRODUCTION_POOL and COGNITO_STAGING_POOL: the Cognito pool IDs used for user account administration. It is recommended to set these interactively. For example, run export COGNITO_PRODUCTION_POOL=eu-west-1_BLAH before running biomage account ...
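The fallback rules for BIOMAGE_NICK and BIOMAGE_DATA_PATH described above can be sketched in Python as follows. The function names staging_prefix and data_path are hypothetical and do not come from biomage-utils, and treating an empty BIOMAGE_DATA_PATH the same as an unset one is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping


def staging_prefix(env: Mapping[str, str] = os.environ) -> str:
    """Mirror ${BIOMAGE_NICK:-${USER}}: use BIOMAGE_NICK unless unset or empty."""
    return env.get("BIOMAGE_NICK") or env.get("USER", "")


def data_path(env: Mapping[str, str] = os.environ) -> Path:
    """Return BIOMAGE_DATA_PATH, falling back to the documented ./data default.

    Assumption: an empty value is treated like an unset one.
    """
    return Path(env.get("BIOMAGE_DATA_PATH") or "./data")
```

Passing a plain dictionary instead of os.environ makes it easy to check what a given shell configuration would produce.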
Configures a repository using best practices. You can supply a repository name as in:
biomage configure-repo ui
The script will ensure the repository is configured according to current best practices. You can see more details about the configuration in the configure-repo/configure_repo.py script.
Rotates the AWS access keys used by the CI runners. You must have the required AWS rights to use this utility. Credentials will be fetched in the same way as they are for the AWS CLI.
You can run:
biomage rotate-ci
and the script should take care of the rest.
Deploys a staging environment. This utility takes a list of deployments as arguments. A deployment can be one of the following:
- A repository name that publishes a staging candidate file to the iac repo, e.g. ui. In this case, the manifest fetched is the one for the master branch of the ui repository.
- A repository name and a pull request ID, e.g. ui/12. In this case, the manifest fetched is the one for the pull request 12 branch of the ui repository.
The default deployments for all stage commands are ui, api and worker. If you wish to deploy a different version of these, you can specify that manually. At the bare minimum, you can run:
biomage stage
If you wish to test changes you made to the API that are available under pull request 25, you can run:
biomage stage api/25
This will compose a sandbox comprising api as found under pull request 25, as well as ui and worker as found under master.
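How the arguments combine with the defaults can be illustrated with a short Python sketch. It is not the code biomage stage runs: parse_deployment and compose_sandbox are hypothetical names, None stands in for "use master", and adding non-default repositories to the result is an assumption.

```python
from typing import Dict, Iterable, Optional, Tuple

DEFAULT_REPOS = ("ui", "api", "worker")  # defaults listed in this README


def parse_deployment(spec: str) -> Tuple[str, Optional[int]]:
    """Split 'ui' or 'ui/12' into (repo, pull request ID or None for master)."""
    repo, sep, pr = spec.partition("/")
    return repo, (int(pr) if sep else None)


def compose_sandbox(specs: Iterable[str] = ()) -> Dict[str, Optional[int]]:
    """Start from the defaults on master, then apply each given deployment."""
    sandbox: Dict[str, Optional[int]] = {repo: None for repo in DEFAULT_REPOS}
    for spec in specs:
        repo, pr = parse_deployment(spec)
        sandbox[repo] = pr
    return sandbox
```

For example, compose_sandbox(["api/25"]) maps api to 25 and leaves ui and worker at None, matching the sandbox described above.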
The utility will launch an interactive wizard to guide you through creating your environment.
The option to create isolated staging environments is provided during the creation of the staging environment. Creating a new isolated staging environment allows you to modify database records and files without causing changes to other staging environments. This also isolates your staging environment from changes made by others.
Isolated staging environments are created by creating new experimentIds using data from existing experiments. Data and records are copied from source tables and buckets independently of deployment. Therefore, even if the deployment fails, data and records may still have been copied successfully.
If your deployment fails, it is recommended that you reuse the sandbox ID of the failed deployment. The wizard will detect existing staging environments and display the experimentIds already created. If you decide to use a different sandbox ID, unstage your environment before staging with the new sandbox ID.
During the creation of isolated staging environments, files in the following S3 buckets are copied into their staging counterparts:
biomage-source-production
processed-matrix-production
Records in the following DynamoDB tables are copied into their staging counterparts:
experiments-production
samples-production
These settings are read from config.yaml.
Pinning is a feature of the utility. When you pin a deployment, you ensure that no changes made to the manifest file after the fact will affect the sandbox you are creating. For example, if you want to make sure your ui feature will not be affected by pushes to master in the api repository after you create the sandbox, you have the option to pin api to prevent this from happening.
The default behavior is to pin all deployments that source from master. This ensures that the sandbox is immutable, i.e. anything not currently being tested in the sandbox will not change unpredictably after it's deployed. Deployments that source from a pull request branch are by default not pinned, as these are commonly the branches that a developer would push features to mid-development.
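The default pinning rule can be written down as a small Python sketch. The function default_pins and the representation of a sandbox as a mapping from repository name to pull request ID (None meaning master) are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional


def default_pins(sandbox: Dict[str, Optional[int]]) -> Dict[str, bool]:
    """Map each repo to its default pin state.

    A value of None means the deployment sources from master and is
    pinned by default; a pull request ID means it is left unpinned.
    """
    return {repo: pr is None for repo, pr in sandbox.items()}
```

With {"ui": None, "api": 25, "worker": None} as input, ui and worker come out pinned and api does not.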
Removes a staging environment. You must specify the sandbox ID of the previously deployed staging environment. Then, run
biomage unstage my-sandbox-id
to remove your deployment and delete the staged environment.
Manages experiment data and configuration. See biomage experiment --help for more details.
Downloads experiment data and config files from a given environment into BIOMAGE_DATA_PATH/experiment_id/. See biomage experiment pull --help for more information, parameters and default values.
E.g. to download the data of experiment e52b39624588791a7889e39c617f669e from production:
biomage experiment pull production e52b39624588791a7889e39c617f669e
Lists the experiment data available in a given environment. See biomage experiment ls --help for more information, parameters and default values.
Example: list experiment files in production:
biomage experiment ls production
Copies one experiment's information from a given environment to another (default: production to staging), prefixing keys of the copied data and records with sandbox_id. It is recommended to set the value of sandbox_id to the same value as your staging environment, so that running biomage unstage to remove the staging environment will also delete the records of the copied experiment.
To run this command, BIOMAGE_EMAIL has to be set to the email you use to log in to the staging environment. See biomage experiment copy --help for more details.
biomage experiment copy --experiment_id=my-experiment-id --sandbox_id=my-sandbox-id
Compares experiment settings across development/staging/production environments. Note that it needs inframock running in order to work.
biomage experiment compare my-experiment-id
Downloads files associated with an experiment.
biomage experiment download -e my-experiment-id -i environment
Currently download of the following files is supported:
- Sample files
- Raw RDS file
- Processed RDS file
- Cell sets file
Note that this command needs biomage rds tunnel running in another tab to work.
A set of helper commands to aid with managing Cellenics account information (creating user accounts, changing passwords). See biomage account --help for more information, parameters and default values. Needs the environment variables COGNITO_PRODUCTION_POOL and/or COGNITO_STAGING_POOL.
Includes many RDS connection-related mechanisms. See biomage rds --help for more details.
In order to run these you will need the following tools installed:
jq
brew install jq
psql
brew install postgresql
curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/mac/sessionmanager-bundle.zip" -o "sessionmanager-bundle.zip"
unzip sessionmanager-bundle.zip
sudo /usr/local/bin/python3.6 sessionmanager-bundle/install -i /usr/local/sessionmanagerplugin -b /usr/local/bin/session-manager-plugin
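Before using the rds commands, it can help to confirm that the tools above are on your PATH. The following Python sketch does that with the standard library. The helper missing_tools is hypothetical and not part of biomage-utils, and the binary names jq, psql and session-manager-plugin are taken from the install commands above.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Binary names taken from the install commands in this README.
REQUIRED_TOOLS = ("jq", "psql", "session-manager-plugin")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]
```

Calling missing_tools() returns an empty list when everything is installed, and otherwise the names of the tools that still need installing.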
Sets up an SSH tunneling/port forwarding session for the RDS server in a given environment.
Example: set up an SSH tunnel to one of the staging RDS endpoints:
biomage rds tunnel -i staging
Runs a command in the database cluster, using IAM if necessary.
Important: if you are not connecting to development, you will need to run biomage rds tunnel -i staging first in a different terminal.
Example: log in to the PostgreSQL console in staging:
biomage rds run psql
Example: dump the in staging biomage rds run psql
See biomage rds run --help for more details.