fuma's People

Contributors

redmar-van-den-berg, yhoogstrate

fuma's Issues

Complex subset problem with n > 2 datasets

Suppose we have three datasets, each containing one fusion. In all three fusions the left junction is identical and spans the same gene, but the right junction spans a different (sub)set of genes:

Genes dataset 1: [Left],[A,B]
Genes dataset 2: [Left],[A,B,C]
Genes dataset 3: [Left],[B,C]

Then the outcome is dependent on the order of comparison:

Order 1:
[A,B] + [A,B,C] → Overlap: [A,B,C]*
[A,B,C]* + [B,C] → Overlap: [A,B,C]**

Order 2:
[A,B] + [B,C] → no overlap*
[no overlap]* + [A,B,C] → no overlap**

We expect this bug to be rare, but when it occurs the outcome changes solely because the order of the samples changed.
Because of the object-oriented structure of the code (the concatenated datasets are used as new datasets), it is nearly impossible to solve this issue without losing much (time) performance. There are no plans to fix this bug at the moment.
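
The order dependence can be reproduced with a minimal sketch. It assumes a simplified matching rule inferred from the example above (two right-gene sets match when one is a subset of the other, and the merged result carries the union); `merge_right_genes` is a hypothetical helper, not FuMa's actual code.

```python
from functools import reduce

def merge_right_genes(left, right):
    """Merge two right-junction gene sets, or return None when they do not match.

    Assumed rule: sets match only if one is a subset of the other.
    A previous "no overlap" result (None) propagates to later comparisons.
    """
    if left is None or right is None:
        return None
    if left <= right or right <= left:
        return left | right
    return None

dataset_1 = frozenset("AB")
dataset_2 = frozenset("ABC")
dataset_3 = frozenset("BC")

# Order 1: [A,B] + [A,B,C], then + [B,C]
order_1 = reduce(merge_right_genes, [dataset_1, dataset_2, dataset_3])
# Order 2: [A,B] + [B,C], then + [A,B,C]
order_2 = reduce(merge_right_genes, [dataset_1, dataset_3, dataset_2])

print(sorted(order_1))  # ['A', 'B', 'C']
print(order_2)          # None
```

The same three datasets give a match in one order and no match in the other, because the intermediate merged set is what gets compared in the second step.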

FusionCatcher Error

Hi,

I am trying to run fuma with FusionCatcher output and I am getting the following error. Could you please help me solve it? (My guess is that it is a stupid mistake on my side.)

Command

/opt/nasapps/development/fuma/3.0.3/bin/fuma -s "fusioncatcher:fusion-catcher_final:final_list_candidate-fusion-genes.txt" "ericscript:ericscript:ES.results.filtered.tsv" -f "list" --strand-specific-matching --acceptor-donor-order-specific-matching -o "fuma.txt" -l "FusionCatcher:hg38" "ericscript:hg38"

Error

Traceback (most recent call last):
  File "/opt/nasapps/development/fuma/3.0.3/bin/fuma", line 4, in <module>
    __import__('pkg_resources').run_script('fuma==3.0.3', 'fuma')
  File "build/bdist.linux-x86_64/egg/pkg_resources/__init__.py", line 724, in run_script
  File "build/bdist.linux-x86_64/egg/pkg_resources/__init__.py", line 1656, in run_script
  File "/opt/nasapps/development/fuma/3.0.3/lib/python2.7/site-packages/fuma-3.0.3-py2.7.egg/EGG-INFO/scripts/fuma", line 203, in <module>

Exception: unknown sample: FusionCatcher

Thanks,
Keyur
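
One possible reading of the command above: the sample is declared as "fusioncatcher" in -s but referenced as "FusionCatcher" in -l. The sketch below checks for that kind of mismatch before running. It assumes that the part before the first colon is the sample alias and that aliases are compared case-sensitively; neither is confirmed here, and `unknown_link_aliases` is a hypothetical helper, not part of fuma.

```python
def unknown_link_aliases(sample_specs, link_specs):
    """Return the aliases used in -l that were never declared in -s."""
    # The alias is assumed to be everything before the first colon.
    declared = {spec.split(":", 1)[0] for spec in sample_specs}
    linked = [spec.split(":", 1)[0] for spec in link_specs]
    return [alias for alias in linked if alias not in declared]

# Arguments copied from the command in this issue.
samples = [
    "fusioncatcher:fusion-catcher_final:final_list_candidate-fusion-genes.txt",
    "ericscript:ericscript:ES.results.filtered.tsv",
]
links = ["FusionCatcher:hg38", "ericscript:hg38"]

mismatches = unknown_link_aliases(samples, links)
print(mismatches)  # ['FusionCatcher']
```

If this reading is right, using the same spelling for the alias in both -s and -l would be the first thing to try.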

FuMa crashes on empty fusion file

FuMa crashes with a ZeroDivisionError when the input files contain no fusion events at all, due to the way the progress is calculated relative to the total number of fusion events.
See the example files located in fuma.zip.

transcripts=transcripts.bed
sample=no-variants
fusioncatcher=no-variants.fusioncatcher
star_fusion=no-variants.star-fusion

fuma -a hg38:${transcripts} -s fc-${sample}:fusion-catcher_final:${fusioncatcher} sf-${sample}:star-fusion_final:${star_fusion} -l "fc-${sample}:hg38" "sf-${sample}:hg38" -f extensive -o -

2020-06-11 14:40:01,983 - FuMa::ParseBED - INFO - Parsing BED file: transcripts.bed
2020-06-11 14:40:03,830 - FuMa::Readers::ReadFusionCatcherFinalList - INFO - Parsing file: no-variants.fusioncatcher
2020-06-11 14:40:03,830 - FuMa::Readers::ReadRNASTARFusionFinal - INFO - Parsing file: no-variants.star-fusion
2020-06-11 14:40:03,831 - FuMa::Readers::ReadFusionCatcherFinalList - INFO - Duplication removal: fc-no-variants (0 fusions)
2020-06-11 14:40:03,831 - FuMa::Readers::ReadRNASTARFusionFinal - INFO - Duplication removal: sf-no-variants (0 fusions)
Left-genes	Right-genes	Spans large gene (>200000bp)	fc-no-variants	sf-no-variants
2020-06-11 14:40:03,831 - FuMa::ComparisonTriangle - INFO - Starting 0 comparisons for k=1
Traceback (most recent call last):
  File "/data/lumc/devel/fuma/install/bin/fuma", line 4, in <module>
    __import__('pkg_resources').run_script('fuma==3.0.6', 'fuma')
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 719, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 1511, in run_script
    exec(script_code, namespace, namespace)
  File "/data/lumc/devel/fuma/install/lib/python2.7/site-packages/fuma-3.0.6-py2.7.egg/EGG-INFO/scripts/fuma", line 232, in <module>
    
  File "build/bdist.linux-x86_64/egg/fuma/ComparisonTriangle.py", line 88, in overlay_fusions
  File "build/bdist.linux-x86_64/egg/fuma/ComparisonTriangle.py", line 150, in log_progress
ZeroDivisionError: float division by zero
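
A guard of the following kind would avoid the division by zero. This is only a sketch: `progress_percentage` is a hypothetical stand-in, not the actual `log_progress` code in ComparisonTriangle.py, and treating an empty run as 100% complete is an assumption.

```python
def progress_percentage(done, total):
    """Return progress as a percentage, treating an empty workload as complete."""
    if total == 0:
        # Nothing to compare, so there is nothing left to do.
        return 100.0
    return 100.0 * done / total

print(progress_percentage(0, 0))   # 100.0
print(progress_percentage(5, 20))  # 25.0
```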

Soapfuse format support

Dear yhoogstrate,
Thanks for the nice program. It was quite easy to install and run, thanks to the provided wiki page.
It would be great to have SOAPfuse format support. There is a description of it here: https://sourceforge.net/p/soapfuse/wiki/Output_Files/

Is there a fuma-specific input format? It could be useful for other programs that are not yet supported: users could convert their results into it and then use fuma.
For example, there is a program called ericscript. It works only with hg38, and it is very complicated to compare its results with others (hg19). It would be nice to convert its output to a fuma-specific format and compare it with chimerascan or defuse.

Thank you!

Best regards,
Nikita

Pizzly support

Hi Youri,

As you suggested in IUC Gitter, I'm creating an issue to request FuMa support for Pizzly input.

Attached is an example of Pizzly output.

Thanks a lot!

Maria

out.fusions.tab.txt

memory overload

After the last few updates pruning runs incredibly smoothly, but there is one function causing a huge memory overload:

https://github.com/yhoogstrate/fuma/blob/master/fuma/OverlapComplex.py#L148

This should be done in a much smarter way.

For n = 5, it should return/yield:

[[[1, 2], [1, 3], [1, 4], [1, 5], [2, 3], [2, 4], [2, 5], [3, 4], [3, 5], [4, 5]],
[[1, 2, 3], [1, 2, 4], [1, 2, 5], [1, 3, 4], [1, 3, 5], [1, 4, 5], [2, 3, 4], [2, 3, 5], [2, 4, 5], [3, 4, 5]],
[[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 4, 5], [1, 3, 4, 5], [2, 3, 4, 5]], [[1, 2, 3, 4, 5]]]
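
One way to produce this output lazily is with `itertools.combinations` from the standard library. This is a sketch of the idea, not the function at the link above; `iter_combination_levels` is a hypothetical name.

```python
from itertools import combinations

def iter_combination_levels(n):
    """Yield, for k = 2..n, a lazy iterator over all k-sized combinations of 1..n."""
    items = range(1, n + 1)
    for k in range(2, n + 1):
        # Each level is itself a generator, so no level is fully held in memory
        # unless the caller chooses to materialise it.
        yield (list(combo) for combo in combinations(items, k))

# Materialised here only to show the result for n = 5.
levels = [list(level) for level in iter_combination_levels(5)]
print(levels[0][:3])              # [[1, 2], [1, 3], [1, 4]]
print([len(lv) for lv in levels]) # [10, 10, 5, 1]
```

Consuming each level with a plain `for` loop instead of `list(...)` keeps only one combination in memory at a time.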

Porting to Python 3

Hi @yhoogstrate,

I was wondering if there are any plans to port this, or at least to make it Python 3 compatible? I'm planning to use it in a soon-to-be-released pipeline, but the part of that pipeline that uses Python runs on Python 3.

If you need help, I can also take a look :).

Problem running the Tophat Fusion Pre (fusions.out) results

I am getting the following error when I try to run the Tophat Fusion Pre (fusions.out) results. I have filtered the Tophat Fusion (fusions.out) results but did not change the format. (I am also attaching the file here.)
tmp1.tophat.fusions.out.txt

Here is the error

2017-01-24 17:48:02,834 - FuMa::Readers::ReadTophatFusionPre - INFO - Parsing file: tmp1
2017-01-24 17:48:02,838 - FuMa::Readers::ReadRNASTARFusionFinal - INFO - Parsing file: /is2/projects/CCR-SF/scratch/illumina/Processing/ANALYSIS/DATA/talsaniaks/gene_fusion/star_fusion/GH2489/star-fusion.fusion_candidates.final
2017-01-24 17:48:02,842 - FuMa::ComparisonTriangle - INFO - Starting 11325 comparisons for k=1
Traceback (most recent call last):
  File "/opt/nasapps/development/fuma/3.0.3/bin/fuma", line 4, in <module>
    __import__('pkg_resources').run_script('fuma==3.0.3', 'fuma')
  File "build/bdist.linux-x86_64/egg/pkg_resources/__init__.py", line 724, in run_script
  File "build/bdist.linux-x86_64/egg/pkg_resources/__init__.py", line 1656, in run_script
  File "/opt/nasapps/development/fuma/3.0.3/lib/python2.7/site-packages/fuma-3.0.3-py2.7.egg/EGG-INFO/scripts/fuma", line 225, in <module>

  File "build/bdist.linux-x86_64/egg/fuma/ComparisonTriangle.py", line 90, in overlay_fusions
  File "build/bdist.linux-x86_64/egg/fuma/ComparisonTriangle.py", line 217, in export_list_chunked
  File "build/bdist.linux-x86_64/egg/fuma/ComparisonTriangle.py", line 177, in export_list_fg
  File "build/bdist.linux-x86_64/egg/fuma/Fusion.py", line 268, in get_annotated_genes_left2
Exception: Requested empty gene list
