zerodel / sailfish-cir
a pipeline for quantification of circular RNA.
Hi There,
I got the following error when trying to run the program:
fasta=/data1/workspace/DCI/Kirsch/Jianguo.Huang/circRNA/bwaindex/genome.fa
gtf=/data1/Annotation/iGenome/Mus_musculus/UCSC/mm10/Annotation/Genes/genes.gtf
datdir=/data1/workspace/DCI/Kastan/Alternative_splicing/DRA005768/Data/fastq
outdir=/data1/workspace/DCI/Kirsch/Jianguo.Huang/circRNA/Results/japan
fastq1=${datdir}/DRR091835_1.fastq.gz
fastq2=${datdir}/DRR091835_2.fastq.gz
PY=/opt/Anaconda3-4.4.0/bin/python3
PG=/opt/NGS/sailfish-cir/sailfish-cir-0.11/sailfish_cir.py
$PY $PG \
-g $fasta \
-a $gtf \
-1 $fastq1 \
-2 $fastq2 \
--bed $output/DRR091835_ciri2.bed \
--libtype ISR \
-o $output
Traceback (most recent call last):
  File "/opt/NGS/sailfish-cir/sailfish-cir-0.11/sailfish_cir.py", line 1123, in <module>
    work_on_it.process_the_pipe_line()
  File "/opt/NGS/sailfish-cir/sailfish-cir-0.11/sailfish_cir.py", line 1004, in process_the_pipe_line
    self._cicular_pipeline()
  File "/opt/NGS/sailfish-cir/sailfish-cir-0.11/sailfish_cir.py", line 1058, in _cicular_pipeline
    do_make_gtf_for_circular_prediction(get_gff_database(self._annotation_file),
  File "/opt/NGS/sailfish-cir/sailfish-cir-0.11/sailfish_cir.py", line 706, in get_gff_database
    db = gffutils.create_db(gtf_file, db_file_path)
  File "/opt/Anaconda3-4.4.0/lib/python3.6/site-packages/gffutils/create.py", line 1286, in create_db
    c = cls(**kwargs)
  File "/opt/Anaconda3-4.4.0/lib/python3.6/site-packages/gffutils/create.py", line 696, in __init__
    super(_GTFDBCreator, self).__init__(*args, **kwargs)
  File "/opt/Anaconda3-4.4.0/lib/python3.6/site-packages/gffutils/create.py", line 107, in __init__
    conn = sqlite3.connect(dbfn)
sqlite3.OperationalError: unable to open database file
Any suggestions as to the cause? Thanks.
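One thing that may be worth checking: the traceback ends in sqlite3.connect, which raises this error when the directory that should hold the database file does not exist or cannot be written to. Note also that the script as posted sets outdir but passes $output; if $output is not set in the calling shell, the paths built from it will not be the intended ones. Below is a minimal Python sketch of such a check. The helper name can_create_db and the file names are hypothetical and not part of sailfish-cir or gffutils.

```python
import os
import sqlite3
import tempfile


def can_create_db(db_path):
    """Return True if SQLite can open (and, if needed, create) a database at db_path."""
    try:
        conn = sqlite3.connect(db_path)
    except sqlite3.OperationalError:
        # Typically: parent directory missing or not writable.
        return False
    conn.close()
    return True


with tempfile.TemporaryDirectory() as tmp:
    existing_dir_db = os.path.join(tmp, "annotation.db")
    missing_dir_db = os.path.join(tmp, "missing_dir", "annotation.db")
    print(can_create_db(existing_dir_db))
    print(can_create_db(missing_dir_db))
```

Running the same check against the directory where the pipeline tries to write its annotation database may help narrow down whether the path or its permissions are the problem.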
Hi
I am running sailfish-cir using output generated by CIRI2. I noticed that some of the circular RNA transcripts which have been quantified by the tool have multiple values. What does this signify? Is there any meaning to this?
Eg:
Name Length EffectiveLength TPM NumReads
10:116517317|116549175 581 394.416 2.27141 18.9804
10:116517317|116549175 185 42.0209 5.61631 5
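To see how widespread the repeated names are, the rows can be grouped by the Name column. The following is a rough Python sketch, assuming quant.sf is tab-separated with the header shown above; duplicated_names is a hypothetical helper, and the sketch only lists the repeats without explaining why they occur.

```python
import csv
import io
from collections import defaultdict


def duplicated_names(quant_file):
    """Group rows of a quant.sf-style table by Name and keep names seen more than once."""
    rows_by_name = defaultdict(list)
    for row in csv.DictReader(quant_file, delimiter="\t"):
        rows_by_name[row["Name"]].append(row)
    return {name: rows for name, rows in rows_by_name.items() if len(rows) > 1}


# The two rows from the example above, written as a tab-separated table.
example = io.StringIO(
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "10:116517317|116549175\t581\t394.416\t2.27141\t18.9804\n"
    "10:116517317|116549175\t185\t42.0209\t5.61631\t5\n"
)
for name, rows in duplicated_names(example).items():
    print(name, [row["Length"] for row in rows])
```

In the example the two rows share a name but differ in Length, so printing the lengths side by side shows which entries collide.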
Hello,
I was using sailfish-cir to quantify circular RNA expression in tumor samples and ran into some problems that left me confused about the results.
1. Using different GTF files as input produces different results (quant.sf): the same candidate circRNA fragments get different expression values. Is there a recommended GTF file best suited to sailfish-cir analysis? And what could cause these differences?
2. While looking into this, I found that the provided hg19.gtf contains overlapping transcript_ids, unlike the GTF files provided by NCBI. Could this be the reason for the inconsistent results?
3. Most candidate circRNAs in quant.sf are named by transcript_id, but some are named by gene name. Some of these genes do have a transcript ID in the provided GTF, and their lengths seem inconsistent with the GTF files. What could cause this?
Thanks for replying!
Hello,
Sailfish-cir is a useful tool for quantifying the expression level of circRNA.
Next, I want to find differentially expressed circRNAs between two groups.
Any suggestions for this analysis?
Could you output the read count of circRNA for input of edgeR?
Thanks!
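The quant.sf output quoted elsewhere in this thread has a NumReads column, so one possible route is to collect that column across samples into a single table. The Python sketch below is only an illustration under several assumptions: quant.sf is tab-separated, rows are keyed by (Name, Length) to keep repeated names apart, and features missing from a sample are given 0. The helper names read_num_reads and count_matrix and the sample labels are hypothetical, and NumReads values are estimates that may be fractional.

```python
import csv
import io


def read_num_reads(quant_file):
    """Map (Name, Length) -> NumReads for one quant.sf-style table."""
    counts = {}
    for row in csv.DictReader(quant_file, delimiter="\t"):
        counts[(row["Name"], row["Length"])] = float(row["NumReads"])
    return counts


def count_matrix(samples):
    """Build rows of [key, count_sample1, count_sample2, ...]; missing entries become 0."""
    per_sample = {label: read_num_reads(handle) for label, handle in samples.items()}
    keys = sorted({key for counts in per_sample.values() for key in counts})
    return [[key] + [per_sample[label].get(key, 0.0) for label in per_sample] for key in keys]


header = "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
sample_a = io.StringIO(header + "10:116517317|116549175\t581\t394.416\t2.27141\t18.9804\n")
sample_b = io.StringIO(header + "10:116517317|116549175\t581\t394.416\t1.5\t12\n")
for row in count_matrix({"sample_a": sample_a, "sample_b": sample_b}):
    print(row)
```

With real data the handles would come from opening each sample's quant.sf file, and the resulting table could be written out as a tab-separated file for downstream tools.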
Could you please add support for gzipped (.gz) read files as input?
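If the installed version does not accept gzipped reads, one possible workaround is to decompress them to plain FASTQ before running the pipeline. A minimal Python sketch using only the standard library follows; gunzip is a hypothetical helper name and not part of sailfish-cir.

```python
import gzip
import shutil


def gunzip(src_path, dst_path):
    """Decompress a .gz file to dst_path in streaming fashion and return dst_path."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # copyfileobj streams in chunks, so large FASTQ files are not loaded into memory.
        shutil.copyfileobj(src, dst)
    return dst_path
```

The decompressed path returned by the helper can then be passed to the -1 and -2 options in place of the .gz files.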
Hi!
I have two questions:
There is no explanation of what exactly the column names in the quant.sf file mean. Do you have such information?
When I try to count the expected number of reads arising from specific transcripts using your approach, some rows (transcripts) are duplicated and their expected values differ. Can you help me with this issue?
Regards!
Katarzyna Kozłowska
Hi,
I ran paired-end datasets through bwa-mem, CIRI, and finally Sailfish. I hadn't experienced any problems generating data before, but for some reason I ran into problems with one particular dataset.
This dataset is an rRNA depleted sample.
This is the error that I received:
[2018-05-15 16:16:23.264] [jointLog] [warning] Sailfish saw fewer then 10000 uniquely mapped reads so 200 will be used as the mean fragment length and 80 as the standard deviation for effective length correction
[2018-05-15 16:16:23.265] [jointLog] [info] Estimating effective lengths
[2018-05-15 16:16:23.281] [jointLog] [info] Computed 0 rich equivalence classes for further processing
[2018-05-15 16:16:23.281] [jointLog] [info] Counted 0 total reads in the equivalence classes
[2018-05-15 16:16:23.282] [jointLog] [info] Starting optimizer:
[2018-05-15 16:16:23.287] [jointLog] [info] Optimizing over 0 equivalence classes
[2018-05-15 16:16:23.287] [jointLog] [error] It seems that no transcripts are expressed; something is likely wrong!
[2018-05-15 16:16:23.289] [jointLog] [error] Encountered error during optimization.
This should not happen.
Please file a bug report on GitHub.
Could I get some help for this? Thank you.