Sequence uploader

This repository provides a sequence uploader for the COVID-19 Virtual Biohackathon's Public Sequence Resource project. You can use it to upload the genomes of SARS-CoV-2 samples to make them publicly and freely available to other researchers.

To get started, install the uploader, then use the bh20-seq-uploader command to upload your data.

Installation

There are several ways to install the uploader. The most portable is with a virtualenv.

Installation with virtualenv

  1. Prepare your system. You need to make sure you have Python, and the ability to install modules such as pycurl and pyopenssl. On Ubuntu 18.04, you can run:
sudo apt update
sudo apt install -y virtualenv git libcurl4-openssl-dev build-essential python3-dev libssl-dev
  2. Create and enter your virtualenv. Go to some memorable directory and make and enter a virtualenv:
virtualenv --python python3 venv
. venv/bin/activate

Note that you will need to repeat the . venv/bin/activate step from this directory to enter your virtualenv whenever you want to use the installed tool.

  3. Install the tool. Once in your virtualenv, install this project:
pip3 install git+https://github.com/arvados/bh20-seq-resource.git@master
  4. Test the tool. Try running:
bh20-seq-uploader --help

It should print some instructions about how to use the uploader.

Make sure you are in your virtualenv whenever you run the tool! If you ever can't run the tool, and your prompt doesn't say (venv), try going to the directory where you put the virtualenv and running . venv/bin/activate. It only works for the current terminal window; you will need to run it again if you open a new terminal.
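
The check described above can be scripted. The following is a minimal sketch, not part of the uploader: in_venv and ensure_venv are hypothetical helper names, and the sketch assumes the virtualenv lives in a directory named venv under the current directory, as in the steps above. It relies on the VIRTUAL_ENV variable, which the activate script exports.

```shell
# Report whether a virtualenv is active in this shell.
# The activate script exports VIRTUAL_ENV; it is unset otherwise.
in_venv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

# Hypothetical helper: activate ./venv only if no virtualenv is active yet.
ensure_venv() {
    if in_venv; then
        echo "virtualenv active: $VIRTUAL_ENV"
    elif [ -f venv/bin/activate ]; then
        . venv/bin/activate
        echo "activated: $VIRTUAL_ENV"
    else
        echo "no venv/bin/activate in $(pwd)" >&2
        return 1
    fi
}
```

Run ensure_venv from the directory where you created the virtualenv, then run bh20-seq-uploader as usual.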

Installation with pip3 --user

If you don't want to have to enter a virtualenv every time you use the uploader, you can use the --user feature of pip3 to install the tool for your user.

  1. Prepare your system. Just as for the virtualenv method, you need to install some dependencies. On Ubuntu 18.04, you can run:
sudo apt update
sudo apt install -y virtualenv git libcurl4-openssl-dev build-essential python3-dev libssl-dev
  2. Install the tool. You can run:
pip3 install --user git+https://github.com/arvados/bh20-seq-resource.git@master
  3. Make sure the tool is on your PATH. The pip3 command will install the uploader in .local/bin inside your home directory. Your shell may not know to look for commands there by default. To fix this for the terminal you currently have open, run:
export PATH=$PATH:$HOME/.local/bin

To make this change permanent, assuming your shell is Bash, run:

echo 'export PATH=$PATH:$HOME/.local/bin' >>~/.bashrc
  4. Test the tool. Try running:
bh20-seq-uploader --help

It should print some instructions about how to use the uploader.
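
Running the export command more than once, or starting nested shells that each read ~/.bashrc, appends the same directory to PATH repeatedly. This is harmless but clutters the variable. A minimal sketch of a guard follows; path_append is a hypothetical helper name, not something pip3 or the uploader provides.

```shell
# Append a directory to PATH only if it is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

path_append "$HOME/.local/bin"
export PATH
```

If you prefer this approach, you could place the function and the two lines after it in ~/.bashrc instead of the plain export line.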

Installation from Source for Development

If you plan to contribute to the project, you may want to install an editable copy from source. With this method, changes to the source code are automatically reflected in the installed copy of the tool.

  1. Prepare your system. On Ubuntu 18.04, you can run:
sudo apt update
sudo apt install -y virtualenv git libcurl4-openssl-dev build-essential python3-dev libssl-dev
  2. Clone and enter the repository. You can run:
git clone https://github.com/arvados/bh20-seq-resource.git
cd bh20-seq-resource
  3. Create and enter a virtualenv. From inside the repository directory, make and enter a virtualenv:
virtualenv --python python3 venv
. venv/bin/activate

Note that you will need to repeat the . venv/bin/activate step from this directory to enter your virtualenv whenever you want to use the installed tool.

  4. Install the checked-out repository in editable mode. Once in your virtualenv, install with this special pip command:
pip3 install -e .
  5. Test the tool. Try running:
bh20-seq-uploader --help

It should print some instructions about how to use the uploader.

Installation with GNU Guix

To run or develop the uploader with GNU Guix, see INSTALL.md.

Usage

Run the uploader with a FASTA file and accompanying metadata file in JSON-LD format:

bh20-seq-uploader example/sequence.fasta example/metadata.json
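
Before uploading, it can help to confirm that both files exist and that the sequence file looks like FASTA. The following is a rough sketch only: check_upload_inputs is a hypothetical helper, not part of bh20-seq-uploader, and it checks just that both files are non-empty and that the first line of the sequence file is a '>' header. It does not validate the metadata contents.

```shell
# Hypothetical pre-flight check; not part of bh20-seq-uploader itself.
# Usage: check_upload_inputs SEQUENCE_FASTA METADATA_FILE
check_upload_inputs() {
    fasta=$1
    metadata=$2
    # Both files must exist and be non-empty.
    for f in "$fasta" "$metadata"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty file: $f" >&2
            return 1
        fi
    done
    # A FASTA record starts with a '>' header line.
    if ! head -n 1 "$fasta" | grep -q '^>'; then
        echo "$fasta does not start with a FASTA header line ('>')" >&2
        return 1
    fi
}
```

One possible use is to chain it with the upload, so the uploader only runs when the check passes: check_upload_inputs example/sequence.fasta example/metadata.json && bh20-seq-uploader example/sequence.fasta example/metadata.json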

Workflow for Generating a Pangenome

All these uploaded sequences are being fed into a workflow to generate a pangenome for the virus. You can replicate this workflow yourself.

For example, download your SARS-CoV-2 sequences from GenBank into seqs.fa, and then run the following commands:

minimap2 -cx asm20 -X seqs.fa seqs.fa >seqs.paf
seqwish -s seqs.fa -p seqs.paf -g seqs.gfa
odgi build -g seqs.gfa -s -o seqs.odgi
odgi viz -i seqs.odgi -o seqs.png -x 4000 -y 500 -R -P 5
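
Each of the four commands above consumes the output of the previous one, so it is convenient to stop at the first failure. A minimal sketch of a wrapper follows. build_pangenome is a hypothetical function name; the tool flags are copied unchanged from the commands above, and the output file names are derived from the input file name.

```shell
# Hypothetical wrapper around the four commands above.
# Usage: build_pangenome seqs.fa   (writes seqs.paf, seqs.gfa, seqs.odgi, seqs.png)
build_pangenome() {
    fasta=$1
    prefix=${fasta%.*}          # seqs.fa -> seqs

    # Fail early if any tool is missing from PATH.
    for tool in minimap2 seqwish odgi; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "required tool not found on PATH: $tool" >&2
            return 1
        fi
    done

    # Each step reads the previous step's output; stop at the first failure.
    minimap2 -cx asm20 -X "$fasta" "$fasta" >"$prefix.paf" &&
    seqwish -s "$fasta" -p "$prefix.paf" -g "$prefix.gfa" &&
    odgi build -g "$prefix.gfa" -s -o "$prefix.odgi" &&
    odgi viz -i "$prefix.odgi" -o "$prefix.png" -x 4000 -y 500 -R -P 5
}
```

Calling build_pangenome seqs.fa runs the same steps as the commands listed above.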

We have converted this pipeline into the Common Workflow Language (CWL); the sources can be found here.

For more information on building pangenome models, see this wiki page.

Contributors

tetron, pjotrp, lltommy, adamnovak