yhoogstrate / fuma
:dash::leopard: FuMa: reporting overlap in RNA-seq detected fusion genes
License: GNU General Public License v3.0
Suppose we have three datasets with one fusion each. In all three fusions the left junction is identical and spans the same gene, but the right junction spans a different (sub)set of genes:
Genes dataset 1: [Left],[A,B]
Genes dataset 2: [Left],[A,B,C]
Genes dataset 3: [Left],[B,C]
Then the outcome depends on the order of comparison:
Order 1:
[A,B] + [A,B,C] → Overlap: [A,B,C]*
[A,B,C]* + [B,C] → Overlap: [A,B,C]**
Order 2:
[A,B] + [B,C] → no overlap*
[no overlap]* + [A,B,C] → no overlap**
We expect this bug to be rare, but when it occurs the outcome can change merely by changing the order of the samples.
Because of the object-oriented structure of the code (i.e. the concatenated datasets are used as new datasets), it is nearly impossible to solve this issue without losing a lot of (time) performance. There are no plans to fix this bug at the moment.
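The order dependence described above can be reproduced with a small sketch. It assumes, based only on the example in this issue, that two right-hand gene sets match when one is a subset of the other and that the merged fusion keeps the union of both; the names merge and the dataset variables are hypothetical and are not FuMa's actual code.

```python
from functools import reduce

def merge(left, right):
    """Merge two gene sets if one is a subset of the other (assumed rule)."""
    if left is None or right is None:
        return None  # an earlier comparison already found no overlap
    if left <= right or right <= left:
        return left | right  # the merged fusion keeps the union of the genes
    return None  # neither set contains the other: no overlap

dataset_1 = frozenset("AB")
dataset_2 = frozenset("ABC")
dataset_3 = frozenset("BC")

# Order 1: [A,B] + [A,B,C], then + [B,C]
order_1 = reduce(merge, [dataset_1, dataset_2, dataset_3])
# Order 2: [A,B] + [B,C], then + [A,B,C]
order_2 = reduce(merge, [dataset_1, dataset_3, dataset_2])

print(sorted(order_1) if order_1 else None)
print(sorted(order_2) if order_2 else None)
```

Because the intermediate result is reused as a new dataset, a failed first comparison cannot be recovered by a later one, which is why only the order of the inputs differs between the two outcomes.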
Hi,
I am trying to run FuMa with FusionCatcher and I am getting the following error. Could you please help me solve it? (My guess is that it is a stupid mistake on my side.)
Command
/opt/nasapps/development/fuma/3.0.3/bin/fuma -s "fusioncatcher:fusion-catcher_final:final_list_candidate-fusion-genes.txt" "ericscript:ericscript:ES.results.filtered.tsv" -f "list" --strand-specific-matching --acceptor-donor-order-specific-matching -o "fuma.txt" -l "FusionCatcher:hg38" "ericscript:hg38"
Error
Traceback (most recent call last):
File "/opt/nasapps/development/fuma/3.0.3/bin/fuma", line 4, in <module>
__import__('pkg_resources').run_script('fuma==3.0.3', 'fuma')
File "build/bdist.linux-x86_64/egg/pkg_resources/__init__.py", line 724, in run_script
File "build/bdist.linux-x86_64/egg/pkg_resources/__init__.py", line 1656, in run_script
File "/opt/nasapps/development/fuma/3.0.3/lib/python2.7/site-packages/fuma-3.0.3-py2.7.egg/EGG-INFO/scripts/fuma", line 203, in <module>
Exception: unknown sample: FusionCatcher
Thanks,
Keyur
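One thing that stands out in the command above is that the sample is named fusioncatcher in the -s argument but FusionCatcher in the -l argument. Assuming FuMa matches these names case-sensitively (which the exception message suggests, but which is not confirmed in this thread), a small check like the sketch below could catch such a mismatch before running. The helper unmatched_labels is hypothetical and not part of FuMa.

```python
def unmatched_labels(sample_args, link_args):
    """Return the -l entries whose sample name does not occur in any -s entry."""
    # The sample name is the part before the first colon in both argument types.
    sample_names = {arg.split(":", 1)[0] for arg in sample_args}
    return [arg for arg in link_args if arg.split(":", 1)[0] not in sample_names]

samples = [
    "fusioncatcher:fusion-catcher_final:final_list_candidate-fusion-genes.txt",
    "ericscript:ericscript:ES.results.filtered.tsv",
]
links = ["FusionCatcher:hg38", "ericscript:hg38"]

print(unmatched_labels(samples, links))
```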
FuMa crashes with a ZeroDivisionError when the input files contain no fusion events at all, due to the way the progress is calculated relative to the total number of fusion events.
See the example files located in fuma.zip.
transcripts=transcripts.bed
sample=no-variants
fusioncatcher=no-variants.fusioncatcher
star_fusion=no-variants.star-fusion
fuma -a hg38:${transcripts} -s fc-${sample}:fusion-catcher_final:${fusioncatcher} sf-${sample}:star-fusion_final:${star_fusion} -l "fc-${sample}:hg38" "sf-${sample}:hg38" -f extensive -o -
2020-06-11 14:40:01,983 - FuMa::ParseBED - INFO - Parsing BED file: transcripts.bed
2020-06-11 14:40:03,830 - FuMa::Readers::ReadFusionCatcherFinalList - INFO - Parsing file: no-variants.fusioncatcher
2020-06-11 14:40:03,830 - FuMa::Readers::ReadRNASTARFusionFinal - INFO - Parsing file: no-variants.star-fusion
2020-06-11 14:40:03,831 - FuMa::Readers::ReadFusionCatcherFinalList - INFO - Duplication removal: fc-no-variants (0 fusions)
2020-06-11 14:40:03,831 - FuMa::Readers::ReadRNASTARFusionFinal - INFO - Duplication removal: sf-no-variants (0 fusions)
Left-genes Right-genes Spans large gene (>200000bp) fc-no-variants sf-no-variants
2020-06-11 14:40:03,831 - FuMa::ComparisonTriangle - INFO - Starting 0 comparisons for k=1
Traceback (most recent call last):
File "/data/lumc/devel/fuma/install/bin/fuma", line 4, in <module>
__import__('pkg_resources').run_script('fuma==3.0.6', 'fuma')
File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 719, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 1511, in run_script
exec(script_code, namespace, namespace)
File "/data/lumc/devel/fuma/install/lib/python2.7/site-packages/fuma-3.0.6-py2.7.egg/EGG-INFO/scripts/fuma", line 232, in <module>
File "build/bdist.linux-x86_64/egg/fuma/ComparisonTriangle.py", line 88, in overlay_fusions
File "build/bdist.linux-x86_64/egg/fuma/ComparisonTriangle.py", line 150, in log_progress
ZeroDivisionError: float division by zero
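A guard of roughly the following shape would avoid the division by zero. This is only a minimal sketch: it is not FuMa's actual log_progress code, and the function name progress_percentage and its signature are assumptions made for illustration.

```python
def progress_percentage(done, total):
    """Return progress as a percentage, treating an empty workload as complete."""
    if total == 0:
        # No fusion events at all: nothing to compare, so report 100%.
        return 100.0
    return 100.0 * float(done) / float(total)

print(progress_percentage(0, 0))
print(progress_percentage(5, 20))
```

Treating zero comparisons as already finished keeps the log output meaningful for empty inputs; skipping the progress message entirely would be an equally reasonable choice.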
On a system without numpy available, after running python setup.py build
followed by python setup.py install --user
the fuma binary gets installed in ~/.local/bin, which should not happen.
Dear yhoogstrate,
Thanks for your nice and useful program. It helps me a lot in comparing the results from all fusion algorithms.
It would be great to have JAFFA format support. There is a description here: https://github.com/Oshlack/JAFFA/wiki/OutputDescription.
The output file is jaffa_results.csv.
I attach to this message an example output file (the result of the example from JAFFA).
Thank you !
Best regards,
Noemie
Dear yhoogstrate,
Thanks for the nice program. It was quite easy to install and run, thanks to the provided wiki page.
It would be great to have SOAPfuse format support. There is a description of it here: https://sourceforge.net/p/soapfuse/wiki/Output_Files/
Is there a FuMa-specific input format? It could be useful for other programs that are not implemented yet: users could convert their results into it and then use FuMa.
For example, there is the program ericscript. It works only with hg38, and it is very complicated to compare its results with others (hg19). It would be nice to convert its output to a FuMa-specific format and compare it with chimerascan or defuse.
Thank you!
Best regards,
Nikita
Hi Youri,
As you suggested in IUC Gitter, I'm creating an issue to request FuMa support for Pizzly input.
Attached is an example of Pizzly output.
Thanks a lot!
Maria
After the last few updates pruning goes incredibly smoothly, but there is one function causing a huge memory overload:
https://github.com/yhoogstrate/fuma/blob/master/fuma/OverlapComplex.py#L148
This should be done in a much smarter way.
For n = 5, it should return/yield:
[[[1, 2], [1, 3], [1, 4], [1, 5], [2, 3], [2, 4], [2, 5], [3, 4], [3, 5], [4, 5]],
[[1, 2, 3], [1, 2, 4], [1, 2, 5], [1, 3, 4], [1, 3, 5], [1, 4, 5], [2, 3, 4], [2, 3, 5], [2, 4, 5], [3, 4, 5]],
[[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 4, 5], [1, 3, 4, 5], [2, 3, 4, 5]], [[1, 2, 3, 4, 5]]]
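One way to produce the combinations listed above without holding them all in memory is a generator built on itertools.combinations. This is a sketch under the assumption that the 1-based indices and the ordering shown in the example are what the caller needs; the function name iter_index_combinations is hypothetical and this is not the code at the linked line.

```python
from itertools import combinations

def iter_index_combinations(n):
    """Lazily yield every combination of 1..n with size 2 up to n."""
    indices = range(1, n + 1)
    for k in range(2, n + 1):
        # combinations() is itself lazy, so only one combination exists at a time.
        for combo in combinations(indices, k):
            yield list(combo)

for combo in iter_index_combinations(5):
    print(combo)
```

Yielding one combination at a time keeps memory use constant per item, at the cost of the caller having to group by length itself if it needs the nested per-size lists.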
Hi @yhoogstrate,
I was wondering if there are any plans on porting this to make it at least Python 3 compatible? I'm planning to use this in a soon-to-be released pipeline, but a portion of the pipeline that uses Python is on Python 3.
If you need help, I can also take a look :).
I am getting the following error when I try to run FuMa on the tophat fusion pre results. I have filtered the tophat fusion (fusions.out) results but didn't change the format. (I am also attaching the file here.)
tmp1.tophat.fusions.out.txt
Here is the error
2017-01-24 17:48:02,834 - FuMa::Readers::ReadTophatFusionPre - INFO - Parsing file: tmp1
2017-01-24 17:48:02,838 - FuMa::Readers::ReadRNASTARFusionFinal - INFO - Parsing file: /is2/projects/CCR-SF/scratch/illumina/Processing/ANALYSIS/DATA/talsaniaks/gene_fusion/star_fusion/GH2489/star-fusion.fusion_candidates.final
2017-01-24 17:48:02,842 - FuMa::ComparisonTriangle - INFO - Starting 11325 comparisons for k=1
Traceback (most recent call last):
File "/opt/nasapps/development/fuma/3.0.3/bin/fuma", line 4, in <module>
__import__('pkg_resources').run_script('fuma==3.0.3', 'fuma')
File "build/bdist.linux-x86_64/egg/pkg_resources/__init__.py", line 724, in run_script
File "build/bdist.linux-x86_64/egg/pkg_resources/__init__.py", line 1656, in run_script
File "/opt/nasapps/development/fuma/3.0.3/lib/python2.7/site-packages/fuma-3.0.3-py2.7.egg/EGG-INFO/scripts/fuma", line 225, in <module>
File "build/bdist.linux-x86_64/egg/fuma/ComparisonTriangle.py", line 90, in overlay_fusions
File "build/bdist.linux-x86_64/egg/fuma/ComparisonTriangle.py", line 217, in export_list_chunked
File "build/bdist.linux-x86_64/egg/fuma/ComparisonTriangle.py", line 177, in export_list_fg
File "build/bdist.linux-x86_64/egg/fuma/Fusion.py", line 268, in get_annotated_genes_left2
Exception: Requested empty gene list