
Cellenics Pipeline

The Cellenics pipeline project for dependency-managed work processing.

Getting started

The pipeline steps run through this project are started automatically on your machine as Docker containers, simulating Kubernetes in local development.

We have included a utility so you can automatically monitor the spawned containers and read their logs as they execute.

For local development, you should already have Docker and Node.js installed, as well as InfraMock running.

Afterwards, you can install the pipeline dependencies with:

make install

To build and run the pipeline:

make build && make run

A message similar to the following should appear:

> node src/app.js

Loading CloudFormation for local container launcher...
Creating mock Lambda function on InfraMock...
No previous stack found on InfraMock.
Stack with ARN arn:aws:cloudformation:eu-west-1:000000000000:stack/local-container-launcher/106d1df9 successfully created.
Waiting for Docker events...

Logs from pipelines run through the API will appear here.

Rebuilding the Docker images

make build

Local development and adding dependencies

First, make sure the project library is synchronized with the lockfile:

# inside pipeline-runner folder
renv::restore()

NOTE: To restore Bioconductor packages, your R version needs to match the one in the Dockerfile (4.0.5).
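
A minimal check you can run from an R session inside the pipeline-runner folder before restoring (the stopifnot guard is just an illustration, not part of the project):

# confirm the running R version matches the one pinned in the Dockerfile
stopifnot(startsWith(R.version.string, "R version 4.0.5"))
renv::restore()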

To add new dependencies, install them with install.packages(...) and use them (e.g. dplyr::left_join(...)) as you normally would. Then, update the lockfile:

renv::snapshot()

Commit the changes to the lockfile (it is used to install dependencies in the Dockerfile). See the renv docs for more info.
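
For example, adding a new dependency might look like this (dplyr and the toy data frames are purely illustrative):

# install the new package into the project library
install.packages("dplyr")

# use it with an explicit namespace, as elsewhere in the codebase
x <- data.frame(id = 1:3, a = c("x", "y", "z"))
y <- data.frame(id = 2:3, b = c("p", "q"))
joined <- dplyr::left_join(x, y, by = "id")

# record the new dependency in renv.lock so the Docker build installs it
renv::snapshot()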

Debugging locally

TLDR: save something inside /debug in a data processing or gem2s step to access it later from ./local-runner/debug.

TLDR2: if the pipeline throws an error, tryCatchLog will save a dump file in ./local-runner/debug that can be used for inspecting the workspace and object values along the call stack.
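
A sketch of how such a dump can be inspected from an R session; the file name and the dump object name below are hypothetical, so check ls() after loading:

# load the dump written by tryCatchLog (file name is hypothetical)
load("./local-runner/debug/dump_20210101_120000.RData")

# find out what the dump object is actually called
ls()

# step through the saved call stack with the base R post-mortem debugger
# (replace last.dump with the name printed by ls())
utils::debugger(last.dump)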

To save the parameters (config, seurat_obj, etc.) passed to a data processing task function, specify DEBUG_STEP. Valid values are the task names listed in run_processing_step in init.R, as well as DEBUG_STEP=all, which saves the parameters for every data processing task:

# e.g. DEBUG_STEP=dataIntegration
DEBUG_STEP=task_name make run

When the pipeline is run, it will save the parameters passed to the specified task in $(pwd)/debug. You can load these into your R environment:

# clicking the file in RStudio does this for you
load('{task_name}_{sample_id}.RData')

# if you need to load multiple tasks, you can load each into a separate environment
# you would then access objects using e.g. task_env$scdata
task_env <- new.env()
load('{task_name}_{sample_id}.RData', envir = task_env)
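
Once loaded, you can inspect the saved parameters; the object names below (config, scdata) follow the ones mentioned above and may differ per task:

# list everything that was saved for the task
ls(task_env)

# inspect individual parameters, e.g. the step configuration and the Seurat object
str(task_env$config)
task_env$scdata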
