rcount's Introduction

Rcount: simple and flexible RNA-Seq read counting

About Rcount

Getting started with Rcount

Note on the hierarchy and priority assignments: Rcount takes the feature type from the third column of the GFF/GTF. Some databases (e.g. TAIR) store the detailed feature type there, but others store it as a separate field in the last column, so that the entries in column 3 are very simple (e.g. just gene, transcript, exon, UTR). GTFs from Ensembl are an example of the latter case. To specify priorities for the detailed feature types, such a GTF needs to be reformatted first. I may look into this at some point.
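As a rough illustration of such a reformatting step, the Python sketch below copies a detailed type from the last column into column 3. It is not part of Rcount. It assumes that the detailed type is stored under the attribute key gene_biotype and that a combined label such as protein_coding_exon is what you want in column 3. Both are assumptions, and the function names are made up for this example.

```python
import re

# Assumption: the detailed type is stored under this attribute key in column 9.
ATTRIBUTE_KEY = "gene_biotype"


def reformat_gtf_line(line, key=ATTRIBUTE_KEY):
    """Return the GTF line with column 3 set to '<attribute value>_<old column 3>'.

    Comment lines, lines without exactly nine tab-separated columns, and lines
    that lack the attribute are returned unchanged.
    """
    if line.startswith("#"):
        return line
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return line
    # Attributes look like: gene_id "X"; gene_biotype "protein_coding";
    match = re.search(r'(?:^|;)\s*' + re.escape(key) + r'\s+"([^"]*)"', fields[8])
    if match is None:
        return line
    fields[2] = match.group(1) + "_" + fields[2]
    return "\t".join(fields) + "\n"


def reformat_gtf(in_handle, out_handle, key=ATTRIBUTE_KEY):
    """Reformat every line read from in_handle and write it to out_handle."""
    for line in in_handle:
        out_handle.write(reformat_gtf_line(line, key))
```

To use it as a filter, call reformat_gtf(sys.stdin, sys.stdout) from a small script and redirect the original GTF into it. Check the resulting column 3 labels before assigning priorities to them.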

64 bit binaries are available here:

Linux (ubuntu-like); IMPORTANT: install the Qt5 libraries: sudo apt-get install qt5-default

Windows7

Windows Server 2012 R2

MacOSX

Additional data for the tutorial and reference annotations are available here:

Data for the rice tutorial

Test data annotations

Test data results

NOTE: On linux, GLIBC needs to be at least version 2.14. Biolinux6 has a lower version.

RECENT UPDATE CONCERNING MACOSX BUILD INSTRUCTIONS (advanced users only):

A.T. managed to simplify the build procedure on OSX to quite some extent. He will host code specifically adapted for this on his GitHub account:

MacOSX build from source, by A.T.

We are working on the build instructions. Thanks a lot for your work A.

Command-line version - Update 2016-03-25:

I merged the command-line version and the GUI version on all platforms. This was a prerequisite for a workflow I am working on at the moment.

Usage notes:

  1. For Rcount-multireads, type:

./Rcount-multireads -c infile,outfile,doReweight,allocationDistance

  • infile must be a sorted BAM file; outfile is written in BAM format as well.
  • doReweight should be either y or n
  • allocationDistance should be a number (e.g. 100)

Example:

./Rcount-multireads -c mySample.bam,mySampleReweighted.bam,y,100
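If you call Rcount-multireads from a script, the four comma-separated values can be assembled programmatically. The Python sketch below is not part of Rcount. The helper names are hypothetical, and it assumes that the binary is in the current directory and reports failure through a non-zero exit code.

```python
import subprocess


def multireads_command(infile, outfile, do_reweight=True, allocation_distance=100,
                       executable="./Rcount-multireads"):
    """Build the argument list for the command-line call described above."""
    for path in (infile, outfile):
        if "," in path:
            # All four values are passed as one comma-separated string.
            raise ValueError("file names must not contain commas: " + path)
    values = [
        infile,
        outfile,
        "y" if do_reweight else "n",
        str(int(allocation_distance)),  # must be a number, e.g. 100
    ]
    return [executable, "-c", ",".join(values)]


def run_multireads(infile, outfile, **options):
    """Run Rcount-multireads.

    Assumption: a non-zero exit code indicates failure, in which case
    subprocess.run raises CalledProcessError.
    """
    subprocess.run(multireads_command(infile, outfile, **options), check=True)
```

For instance, multireads_command("mySample.bam", "mySampleReweighted.bam") builds the same call as the example above.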

  2. For Rcount-distribute, type:

./Rcount-distribute -c [list of comma-separated project files]

Examples:

./Rcount-distribute -c myFirstProject.xml

./Rcount-distribute -c myFirstProject.xml,mySecondProject.xml

Note: the project files are the ones normally created by the GUI-version of Rcount ("create project"). I added a Python script which can be used to generate project files via the command-line.
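To process many projects in one call, the comma-separated list can be built from the project files in a directory. This Python sketch is not the project-file generator mentioned above. It is a hypothetical wrapper that assumes the project files end in .xml and that the binary is in the current directory.

```python
from pathlib import Path


def project_files_in(directory):
    """Return all .xml files in a directory, sorted by name."""
    return sorted(Path(directory).glob("*.xml"))


def distribute_command(project_files, executable="./Rcount-distribute"):
    """Build the argument list for one Rcount-distribute call over several projects."""
    projects = [str(project) for project in project_files]
    if not projects:
        raise ValueError("at least one project file is required")
    for project in projects:
        if "," in project:
            # The project files are passed as one comma-separated string.
            raise ValueError("file names must not contain commas: " + project)
    return [executable, "-c", ",".join(projects)]
```

The resulting list can be passed to subprocess.run, for example subprocess.run(distribute_command(project_files_in("projects")), check=True), where "projects" is a placeholder directory name.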

API

Here is a brief example in case you would like to use the annotation database directly to query read mappings or positions.

Include the following files:

#include <iostream> // for std::cerr in the position example
#include "../p502_SOURCE/dataStructure/databaseitem.h"
#include "../p502_SOURCE/dataStructure/database.h"
#include "../p502_SOURCE/dataStructure/mappingtreeitem.h"
#include "../bamHandler/bamhelpers.h"

Then, later in the code, initialize and load the database:

QString annofile = "/path/to/annotation.xml";
QVector<QVariant> headers;
headers << "Sname" << "Schrom" << "Sstrand" << "Ustart" << "Uend" << "Sfeature" << "SassembledFeature" << "Upriority";
database anno(headers);
anno.print_time("START");
if ( anno.readData(annofile) ) { anno.print_time("annotation loaded"); }

Finally, you can query intervals or positions. There are multiple functions for this; check them in the database header file:

"../p502_SOURCE/dataStructure/database.h"

For an example of how to use it, check the function readMapper::run() in:

"../p502_SOURCE/dataAnalysis/readmapper.cpp"

Here is an example that maps a simple position (chrom + pos):

QString chrom = "Chr1";
uint pos = 11351183;
// Note: there are also functions that fill pre-allocated vectors, if you would like to avoid the return-by-value.
QVector<databaseItem*> mapping = anno.bestRmapPosition(chrom, pos);
foreach (databaseItem* element, mapping) {
  // Print column 0 of each hit (presumably "Sname", the first header above); std::endl also flushes the stream.
  std::cerr << element->data(0).toString().toStdString() << std::endl;
}

(if you have specific questions, contact me)

rcount's Issues

ERROR: ./Rcount-multireads: error while loading shared libraries: libQt5Widgets.so.5: cannot open shared object file: No such file or directory

Hi,
I am trying to use your tool in a conda environment on a Linux server. Running ./Rcount-multireads results in the following error even though Qt5 is installed:

./Rcount-multireads: error while loading shared libraries: libQt5Widgets.so.5: cannot open shared object file: No such file or directory

This is probably trivial, but how would I grant access to the Qt5 libraries, which are located under the path /software/miniconda3/envs/repeats/lib?

multireads option

Hi,

I've been having trouble getting Rcount-distribute to run on Rcount-multireads output:
I took a bam file, and ran it through Rcount-multireads.
I then took the weighted result and tried running it through Rcount-distribute with the "Use multiple alignments" option checked.

It gave me the error:
"/usr/local/include/seqan/bam_io/read_bam.h:201 Assertion failed : remainingBytes > 4 was: 1 <= 4
Aborted (core dumped)"

Do you have any idea what went wrong?

Attached below is a portion of the bam file (it exhibits the same error behavior) (tiny.bam),
the Rcount-multireads weighted bam file (tiny_multi_weighted.bam),
and the project settings file (temp.xml).

Many thanks,
Ming

tiny.bam.gz
temp.xml.gz
tiny_multi_weighted.bam.gz

Command Line version

@MWSchmid

I want to incorporate Rcount into one of my CLI applications. Do you have a minimal command-line version without the GUI? I don't want to add the Qt dependencies. I want something that accepts a BAM file and gives me a proportional count.

Thanks
demis001

Rcount-format with bed file

Hi!

I tried using your Rcount-format with a bed file downloaded from UCSC.

Although the bed file had multiple chromosomes in it, it looks like your Rcount-format only processed one chromosome and then finished.

Also, when I limited the bed file to chr10, only 972 of the 1600 unique NM_ / NR_ numbers ended up in the output xml file.

Do you have any idea why this happened?

Thanks.
