maurijlozano / iscompare
ISCompare, an opensource program to identify Differentially Located Insertion Sequences
License: GNU General Public License v3.0
Hi Mauricio,
ISCompare seems to be working happily but then fails at the stage of writing the final results. Do you have any suggestions how to fix this?
Thanks,
M
I'm using python/3.9.9 with
Biopython/1.79
DNAfeaturesviewer/3.1.0
Pandas/1.3.4
Numpy/1.21.4
Mechanize/0.4.7
My command and the command line output:
$ ISCompare.py -q R1S1.gbff -r Hlac.gbff -i Hlac_is.fna -o Hlac_v_R1S1_gbff -c -p
***** ISCompare
***** Version: 1.0.1
***** Developed by Mauricio J. Lozano
***** github.com/maurijlozano
Please cite:
Easy identification of insertion sequence mobilization events
in related bacterial strains with ISCompare.
E.G. Mogro, N. Ambrosis, M.J. Lozano
doi: https://doi.org/10.1101/2020.10.16.342287
Instituto de Biotecnología y Biología Molecular
CONICET - CCT La Plata - UNLP - FCE
Downloaded from: https://github.com/maurijlozano/ISCompare
Directory Hlac_v_R1S1_gbff created.
Query: ['R1S1.gbff'] Ref: ['Hlac.gbff']
Copying files to Hlac_v_R1S1_gbff folder, and extracting fasta files from genbank.
query.fasta generated.
ref.fasta generated.
133 ISs found on the query genome...
Step1: Removing identical scaffolds: Query --> Reference ...
93 identical scaffolds removed from the analysis...
Step2: Testing for new IS on the query...
123 ISs found on the reference genome...
Step3: Removing identical scaffolds: Reference --> Query ...
0 identical scaffolds removed from the analysis...
Step4: Testing for IS missing on the query...
Step5: Consolidating results and annotating data...
Traceback (most recent call last):
File "/apps/iscompare/1.0.1/iscompare/ISCompare.py", line 2297, in <module>
FinalResults[['ISstart', 'ISend', 'Start1', 'End1', 'REF.Start1','REF.End1','Start2', 'End2', 'REF.Start2','REF.End2']] = FinalResults[['ISstart', 'ISend', 'Start1', 'End1', 'REF.Start1','REF.End1','Start2', 'End2', 'REF.Start2','REF.End2']].astype("Float32")
File "/apps/python/3.9.9/lib/python3.9/site-packages/pandas/core/generic.py", line 5808, in astype
results = [
File "/apps/python/3.9.9/lib/python3.9/site-packages/pandas/core/generic.py", line 5809, in <listcomp>
self.iloc[:, i].astype(dtype, copy=copy)
File "/apps/python/3.9.9/lib/python3.9/site-packages/pandas/core/generic.py", line 5815, in astype
new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
File "/apps/python/3.9.9/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 418, in astype
return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
File "/apps/python/3.9.9/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 327, in apply
applied = getattr(b, f)(**kwargs)
File "/apps/python/3.9.9/lib/python3.9/site-packages/pandas/core/internals/blocks.py", line 591, in astype
new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
File "/apps/python/3.9.9/lib/python3.9/site-packages/pandas/core/dtypes/cast.py", line 1309, in astype_array_safe
new_values = astype_array(values, dtype, copy=copy)
File "/apps/python/3.9.9/lib/python3.9/site-packages/pandas/core/dtypes/cast.py", line 1257, in astype_array
values = astype_nansafe(values, dtype, copy=copy)
File "/apps/python/3.9.9/lib/python3.9/site-packages/pandas/core/dtypes/cast.py", line 1105, in astype_nansafe
return dtype.construct_array_type()._from_sequence(arr, dtype=dtype, copy=copy)
File "/apps/python/3.9.9/lib/python3.9/site-packages/pandas/core/arrays/floating.py", line 261, in _from_sequence
values, mask = coerce_to_array(scalars, dtype=dtype, copy=copy)
File "/apps/python/3.9.9/lib/python3.9/site-packages/pandas/core/arrays/floating.py", line 143, in coerce_to_array
raise TypeError(f"{values.dtype} cannot be converted to a FloatingDtype")
TypeError: object cannot be converted to a FloatingDtype
Hi,
I have used ISCompare.py successfully before.
A few months later, I am trying to check the results on an updated file and am facing an error:
Traceback (most recent call last):
File "/Users/mathilde/Documents/GitHub/ISCompare/ISCompare.py", line 20, in <module>
from dna_features_viewer import BiopythonTranslator
File "/Users/mathilde/opt/anaconda3/lib/python3.9/site-packages/dna_features_viewer/__init__.py", line 3, in <module>
from .GraphicRecord import GraphicRecord
File "/Users/mathilde/opt/anaconda3/lib/python3.9/site-packages/dna_features_viewer/GraphicRecord/__init__.py", line 1, in <module>
from .GraphicRecord import GraphicRecord
File "/Users/mathilde/opt/anaconda3/lib/python3.9/site-packages/dna_features_viewer/GraphicRecord/GraphicRecord.py", line 2, in <module>
from ..biotools import find_narrowest_text_wrap
File "/Users/mathilde/opt/anaconda3/lib/python3.9/site-packages/dna_features_viewer/biotools.py", line 38, in <module>
_aa1: _aa3[0] + _aa3[1:].lower() for (_aa1, _aa3) in zip(aa1 + "", aa3 + [""])
TypeError: can only concatenate tuple (not "str") to tuple
This error does not have anything to do with the new input file, nor with any input file (I tried to run it on the old ones as well).
It even appears when just running:
python ISCompare.py -h
I tried downloading it from scratch and also reinstalling the python library dna_features_viewer but this did not solve it. Any idea of where the problem is?
Thank you very much,
Best regards,
Mathilde
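The last line of the traceback above says that a tuple was added to a str, which suggests that `aa1` is a tuple in the installed environment while `biotools.py` expects a value that can be concatenated with a string. Below is a minimal sketch of a concatenation that does not depend on the container type. The helper `with_extra_symbol`, the sample codes and the `*` placeholder are all made up for illustration; this is not the library's own fix.

```python
# Hypothetical helper, not part of dna_features_viewer or Biopython.
def with_extra_symbol(symbols, extra):
    """Append one symbol to a str, list or tuple of symbols; always returns a tuple."""
    return tuple(symbols) + (extra,)

# Made-up sample codes: one container is a str, the other a tuple.
one_letter = with_extra_symbol("ACD", "*")
three_letter = with_extra_symbol(("ALA", "CYS", "ASP"), "*")

# Same dictionary shape as the comprehension shown in the traceback.
mapping = {a1: a3[0] + a3[1:].lower() for a1, a3 in zip(one_letter, three_letter)}
print(mapping)
```

Converting both operands to tuples first means the expression behaves the same whether a dependency supplies a str, a list or a tuple.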
Dear ISCompare Team,
I'm encountering the following error:
Directory ISCompare_out created.
Query: ['/mnt/e/Working/Shigella/Data/shigella_flexneri/out_africa_flex_53/annotation/ERR126957/ERR126957.gbk'] Ref: ['/mnt/e/Working/Shigella/Data/shigella_flexneri/flex.gbk']
Copying files to ISCompare_out folder, and extracting fasta files from genbank.
query.fasta generated.
ref.fasta generated.
Retrieving IS sequences from ISFinder database.
Please cite: Siguier P. et al. (2006)
ISfinder: the reference centre for bacterial insertion sequences.
Nucleic Acids Res. 34: D32-D36
ISFinder database URL: http://www-is.biotoul.fr.
SSL: CERTIFICATE_VERIFY_FAILED. Do you want to continue without SSL certificate verification?
Please input Y/N: y
175 IS sequences downloaded from ISFinder database.
819 ISs found on the query genome...
Step1: Removing identical scaffolds: Query --> Reference ...
452 identical scaffolds removed from the analysis...
Step2: Testing for new IS on the query...
/home/sar/miniconda3/lib/python3.6/site-packages/pandas/core/frame.py:7138: FutureWarning: Sorting because non-concatenation axis is not aligned. A future version
of pandas will change to not sort by default.
To accept the future behavior, pass 'sort=False'.
To retain the current behavior and silence the warning, pass 'sort=True'.
sort=sort,
Traceback (most recent call last):
File "ISCompare.py", line 2177, in <module>
missingIS = testIS(rseq, qseq,MiddleMissingISID,consecutiveIS)
File "ISCompare.py", line 853, in testIS
extractSeqFromREFGB(refFileGB,keep,refISSurroundsFromQuery,surroundingLen,shift)
File "ISCompare.py", line 525, in extractSeqFromREFGB
coordStart = coordStart - shift
TypeError: unsupported operand type(s) for -: 'str' and 'int'
The command I ran is:
python ISCompare.py -q /mnt/e/Working/Shigella/Data/shigella_flexneri/out_africa_flex_53/annotation/ERR126957/ERR126957.gbk -r /mnt/e/Working/Shigella/Data/shigella_flexneri/flex.gbk -o ISCompare_out -c -I -p
Can you please help me resolve the error?
Thanks
SAR
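The traceback in this report shows `coordStart` reaching the subtraction as a str while `shift` is an int. Below is a minimal sketch of a defensive cast, assuming the coordinate is either an int or a string holding a whole number. The helper name `shifted_start` is hypothetical and is not a function in ISCompare.py.

```python
# Hypothetical helper, not part of ISCompare.
def shifted_start(coord_start, shift):
    """Return coord_start - shift, accepting the coordinate as int or numeric str."""
    # int() accepts 1200 as well as "1200"; it raises ValueError for non-numeric text
    return int(coord_start) - shift

print(shifted_start("1200", 200))
print(shifted_start(1200, 200))
```

Casting at the point of use leaves the rest of the parsing untouched, and a non-numeric value fails with a ValueError that names the offending text.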
Hello :)
Thank you for developing this software.
I experienced a python TypeError running ISCompare for the first time (see below).
This was replicated across Python versions 3.8, 3.9 and 3.10.
I think it is due to the contig/scaffold names being numbers (common in Unicycler assemblies), which are converted to integers.
Copying files to test folder, and extracting fasta files from genbank.
query.fasta generated.
ref.fasta generated.
24 ISs found on the query genome...
Step1: Removing identical scaffolds: Query --> Reference ...
0 identical scaffolds removed from the analysis...
Step2: Testing for new IS on the query...
0 ISs found on the reference genome...
The selected Insertion sequences were not found on the reference genome...
Step3: Removing identical scaffolds: Reference --> Query ...
0 identical scaffolds removed from the analysis...
Step4: Testing for IS missing on the query...
Step5: Consolidating results and annotating data...
Step6: Printing stats...
Traceback (most recent call last):
File "/Users/tom/Bioinformatics/ISCompare/ISCompare.py", line 2354, in <module>
f.write("\nQuery - Scaffolds with more than one ISs [Name]: " + ', '.join(statsQuery[2]))
TypeError: sequence item 0: expected str instance, int found
This can be fixed by adding statsQuery = [str(i) for i in statsQuery] at line 2351.
Thanks,
Tom
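An alternative sketch converts the names at the point of joining, so that the other entries of `statsQuery` keep their original types. Note that `str()` applied to a whole list yields its printed representation rather than a list of strings. This assumes `statsQuery[2]` is an iterable of scaffold names, as the failing line suggests. The helper `join_names` and the sample list are made up for illustration.

```python
# Hypothetical helper, not part of ISCompare.
def join_names(names):
    """Join scaffold names with ', ' whether they are str or int."""
    return ", ".join(str(name) for name in names)

# Made-up stand-in for statsQuery[2]: numeric contig names parsed as integers
scaffolds_with_multiple_is = [1, 4, "contig_7"]
line = "\nQuery - Scaffolds with more than one ISs [Name]: " + join_names(scaffolds_with_multiple_is)
print(line)
```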
` ********************************************
***** ISCompare
***** Version: 1.0.1
***** Developed by Mauricio J. Lozano
***** github.com/maurijlozano
Please cite:
Easy identification of insertion sequence mobilization events
in related bacterial strains with ISCompare.
E.G. Mogro, N. Ambrosis, M.J. Lozano
doi: https://doi.org/10.1101/2020.10.16.342287
Instituto de Biotecnología y Biología Molecular
CONICET - CCT La Plata - UNLP - FCE
Downloaded from: https://github.com/maurijlozano/ISCompare
Directory consensusMap already exists.
Query: ['MT-1_consensus_seqret.genbank'] Ref: ['MT-2_consensus_seqret.genbank']
Copying files to consensusMap folder, and extracting fasta files from genbank.
query.fasta generated.
ref.fasta generated.
6 ISs found on the query genome...
Step1: Removing identical scaffolds: Query --> Reference ...
0 identical scaffolds removed from the analysis...
Step2: Testing for new IS on the query...
6 ISs found on the reference genome...
Step3: Removing identical scaffolds: Reference --> Query ...
0 identical scaffolds removed from the analysis...
Step4: Testing for IS missing on the query...
Step5: Consolidating results and annotating data...
Traceback (most recent call last):
File "/mnt/SD2/Jyotirmoys/Backup_20211108/BIN21P012_MT/ISCompare/./ISCompare.py", line 2297, in <module>
FinalResults[['ISstart', 'ISend', 'Start1', 'End1', 'REF.Start1','REF.End1','Start2', 'End2', 'REF.Start2','REF.End2']] = FinalResults[['ISstart', 'ISend', 'Start1', 'End1', 'REF.Start1','REF.End1','Start2', 'End2', 'REF.Start2','REF.End2']].astype("Float32")
File "/opt/sw/bioinfo-tools/sources/anaconda3/envs/multiqc/lib/python3.9/site-packages/pandas/core/generic.py", line 5808, in astype
results = [
File "/opt/sw/bioinfo-tools/sources/anaconda3/envs/multiqc/lib/python3.9/site-packages/pandas/core/generic.py", line 5809, in <listcomp>
self.iloc[:, i].astype(dtype, copy=copy)
File "/opt/sw/bioinfo-tools/sources/anaconda3/envs/multiqc/lib/python3.9/site-packages/pandas/core/generic.py", line 5815, in astype
new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
File "/opt/sw/bioinfo-tools/sources/anaconda3/envs/multiqc/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 418, in astype
return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
File "/opt/sw/bioinfo-tools/sources/anaconda3/envs/multiqc/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 327, in apply
applied = getattr(b, f)(**kwargs)
File "/opt/sw/bioinfo-tools/sources/anaconda3/envs/multiqc/lib/python3.9/site-packages/pandas/core/internals/blocks.py", line 591, in astype
new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
File "/opt/sw/bioinfo-tools/sources/anaconda3/envs/multiqc/lib/python3.9/site-packages/pandas/core/dtypes/cast.py", line 1309, in astype_array_safe
new_values = astype_array(values, dtype, copy=copy)
File "/opt/sw/bioinfo-tools/sources/anaconda3/envs/multiqc/lib/python3.9/site-packages/pandas/core/dtypes/cast.py", line 1257, in astype_array
values = astype_nansafe(values, dtype, copy=copy)
File "/opt/sw/bioinfo-tools/sources/anaconda3/envs/multiqc/lib/python3.9/site-packages/pandas/core/dtypes/cast.py", line 1105, in astype_nansafe
return dtype.construct_array_type()._from_sequence(arr, dtype=dtype, copy=copy)
File "/opt/sw/bioinfo-tools/sources/anaconda3/envs/multiqc/lib/python3.9/site-packages/pandas/core/arrays/floating.py", line 261, in _from_sequence
values, mask = coerce_to_array(scalars, dtype=dtype, copy=copy)
File "/opt/sw/bioinfo-tools/sources/anaconda3/envs/multiqc/lib/python3.9/site-packages/pandas/core/arrays/floating.py", line 143, in coerce_to_array
raise TypeError(f"{values.dtype} cannot be converted to a FloatingDtype")
TypeError: object cannot be converted to a FloatingDtype
`
I solved this by changing line 2297 of ISCompare.py from .astype("Float32") to .astype("float32").
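The change above swaps pandas' nullable extension dtype (`"Float32"`, capital F) for the NumPy dtype (`"float32"`). Below is a minimal sketch of a more defensive variant, assuming the coordinate columns arrive as `object` dtype holding a mix of numbers, numeric strings and missing values. The helper `to_float32` and the sample frame are hypothetical and are not taken from ISCompare.

```python
import pandas as pd

# Hypothetical helper, not part of ISCompare: coerce object columns to float32.
def to_float32(frame, columns):
    out = frame.copy()
    for col in columns:
        # errors="coerce" turns anything non-numeric into NaN instead of raising
        out[col] = pd.to_numeric(out[col], errors="coerce").astype("float32")
    return out

# Made-up sample data standing in for two of the FinalResults columns
df = pd.DataFrame(
    {"ISstart": ["10", 25, None], "ISend": [100, "250", "n/a"]},
    dtype=object,
)
converted = to_float32(df, ["ISstart", "ISend"])
print(converted.dtypes)
```

Going through `pd.to_numeric` first makes the cast independent of how the values were stored, at the cost of silently turning unparseable entries into NaN.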