Comments (10)
The taxonomy string is not formatted the way it is expecting. Compare the taxonomy columns between your two results.
from amptk.
Hi, thanks for your time. I compared the two files, but I really can't see where the problem is.
I'm attaching an extract of the taxonomy column from both.
Taxonomy for the file that doesn't work:
Taxonomy
GS|98.9|JX036083|SH1553429.08FU;k:Fungi,p:Ascomycota,c:Lecanoromycetes,o:Lecanorales,f:Catillariaceae,g:Austrolecia
GS|98.8|JQ689430|NA;k:Fungi,p:Ascomycota,c:Sordariomycetes
GS|99.1|KJ576704|NA;k:Viridiplantae,p:Chlorophyta,c:Trebouxiophyceae,o:Trebouxiales,f:Trebouxiaceae,g:Trebouxia
GS|99.4|JX036043|SH1176515.08FU;k:Fungi,p:Ascomycota,c:Lecanoromycetes,o:Caliciales,f:Caliciaceae,g:Buellia
GS|98.9|JQ689440|NA;k:Fungi,p:Ascomycota,c:Sordariomycetes
GSL|100.0|AF353997|SH1646182.08FU;k:Fungi,p:Ascomycota,c:Sordariomycetes
US|0.8882|JX848589|NA;k:Fungi,p:Glomeromycota,c:Glomeromycetes,o:Diversisporales
GS|98.0|KY608888|NA;k:Fungi,p:Ascomycota,c:Dothideomycetes,o:Botryosphaeriales,f:Phyllostictaceae,g:Guignardia,s:Guignardia laricina
GS|100.0|MH660418|SH1566856.08FU;k:Fungi,p:Chytridiomycota,c:Spizellomycetes,o:Spizellomycetales,f:Powellomycetaceae,g:Powellomyces,s:Powellomyces hirtus
GS|99.5|KJ599551|NA;k:Viridiplantae,p:Chlorophyta,c:Trebouxiophyceae,o:Trebouxiales,f:Trebouxiaceae,g:Trebouxia
US|0.9426|AB219231|NA;k:Fungi,p:Ascomycota,c:Sordariomycetes
Taxonomy for the file that works:
Taxonomy
SS|0.8400|None;d:Bacteria
SS|0.8900|None;d:Bacteria
SS|1.0000|DQ422812_S001020357;d:Bacteria,p:Cyanobacteria/Chloroplast
SS|0.9300|None;d:Bacteria
GS|98.4|KF245634_S003920711;d:Bacteria,p:"Acidobacteria",c:Acidobacteria_Gp4,o:Aridibacter;
SS|1.0000|KC560021_S003719248;d:Bacteria,p:"Bacteroidetes",c:Sphingobacteriia,o:"Sphingobacteriales",f:Chitinophagaceae
SS|0.9500|KF459924_S004053537;d:Bacteria,p:"Actinobacteria",c:Actinobacteria,o:Rubrobacteridae,f:Solirubrobacterales
SS|1.0000|None;d:Bacteria
from amptk.
The error says there is a taxonomy string that doesn’t have a colon separating the taxonomy level from the name. It looks like one or a few strings might be causing the problem. Are there any that are empty?
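One way to hunt for such strings is sketched below. It assumes the layout seen in the extracts above (an ID, a `;`, then comma-separated `rank:name` levels); `find_malformed` is a hypothetical helper written for this thread, not an AMPtk function.

```python
def find_malformed(taxonomy_strings):
    """Return (index, string) pairs whose lineage is empty or has a level without ':'."""
    bad = []
    for idx, tax in enumerate(taxonomy_strings):
        # Everything after the first ';' is the lineage, e.g. 'k:Fungi,p:Ascomycota'.
        _, _, lineage = tax.partition(";")
        lineage = lineage.strip(";")  # tolerate a trailing ';'
        levels = lineage.split(",")
        if not lineage or any(":" not in level for level in levels):
            bad.append((idx, tax))
    return bad


rows = [
    "GS|98.8|JQ689430|NA;k:Fungi,p:Ascomycota,c:Sordariomycetes",
    "GDL|100.0|KY932470|NA;",
    'GS|98.4|KF245634_S003920711;d:Bacteria,p:"Acidobacteria",c:Acidobacteria_Gp4,o:Aridibacter;',
]
print(find_malformed(rows))
```

Run against the whole Taxonomy column, this should list every row that would break a split on ":".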
from amptk.
Hi,
No row is empty. I also checked whether any row of the taxonomy column lacks the ":" needed for w.split(":"), but all rows have the ":".
from amptk.
OK, if I do:
import pandas as pd
a = pd.read_csv(path + "miseq.otu_table.taxonomy.txt", sep="\t")
for i in a.Taxonomy:
    print(i.split(":")[0])
    print(i.split(":")[1])  # raises IndexError when a row has no ":"
I get the same error, so I'm trying to fix the bad row! But why did this happen, if I only ran the standard commands?
from amptk.
OK, I fixed the problem by adding ":Unknown" to the row GDL|100.0|KY932470|NA;.
Now summarize works!
from amptk.
Is this the database packaged with AMPtk or one that you made? The taxonomy string in the fasta file used to create the database is the underlying problem.
from amptk.
Yes, I used the database packaged with AMPtk.
from amptk.
Okay. Let’s leave this open so I remember to fix the underlying issue as well as patch the code in summarize so that it does not error out but alerts the user to which taxonomy strings are empty.
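A rough sketch of what such a tolerant parse could look like follows. This is not the actual AMPtk patch; `parse_lineage` is a hypothetical name, and the string layout is assumed from the extracts earlier in the thread.

```python
import warnings


def parse_lineage(tax):
    """Split 'ID;k:Fungi,p:Ascomycota' into {'k': 'Fungi', 'p': 'Ascomycota'}.

    Levels without a ':' (including an empty lineage) trigger a warning
    and are skipped instead of raising an exception.
    """
    _, _, lineage = tax.partition(";")
    ranks = {}
    for level in lineage.strip(";").split(","):
        rank, sep, name = level.partition(":")
        if not sep:
            warnings.warn(f"taxonomy level without ':' in {tax!r}")
            continue
        ranks[rank] = name.strip('"')  # some databases quote the names
    return ranks


print(parse_lineage("GS|98.8|JQ689430|NA;k:Fungi,p:Ascomycota,c:Sordariomycetes"))
print(parse_lineage("GDL|100.0|KY932470|NA;"))  # warns, returns {}
```

Warning rather than raising keeps summarize running on the remaining rows while still naming the offending string.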
from amptk.
fixed in v1.4.1
from amptk.