alexga / phylostratigraphy
Pipeline for Phylostratigraphy
License: Apache License 2.0
Hi Alex,
Thanks for your work. This issue is not about a bug or a problem.
I just want to check with you how to cite this work in our upcoming manuscript.
I don't want to simply say that we used a Perl script to calculate the phylostratum values. You deserve better.
Looking forward to your reply.
Hi,
I just wanted to ask what the numbers representing the phylostrata in the output file correspond to. I am trying to determine the phylostratum for each gene from a bacterium in that same database, and its [taxonomy;string;delimited;with;semicolons;for;each;level] contains 10 different levels, i.e. Bacteria, Actinobacteria, etc.
As output, however, I get phylostrata 0-11, implying 12 different levels. How can that be when my organism of interest started with only 10 levels of taxonomic classification?
I am wondering which levels of the taxonomy string phylostrata 0, 1 and 11 correspond to. Is 0 the first or the last category in the taxonomy string?
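To make the level count concrete, here is a minimal Python sketch that splits a semicolon-delimited taxonomy string and numbers its levels. It only counts what is in the string. How the pipeline maps these levels to phylostratum numbers is not stated in this thread, so the zero-based index printed here is an assumption for illustration, and `taxonomy_levels` is a hypothetical helper, not part of the pipeline.

```python
def taxonomy_levels(taxonomy: str) -> list[str]:
    """Split a semicolon-delimited taxonomy string into its levels."""
    # Strip the surrounding square brackets used in the FASTA headers, if present.
    cleaned = taxonomy.strip().strip("[]")
    return [level.strip() for level in cleaned.split(";") if level.strip()]


# Example taxonomy copied from a FASTA header quoted in another issue on this page.
example = ("[Bacteria; Actinobacteria; Actinomycetia; Corynebacteriales; "
           "Mycobacteriaceae; Mycobacterium; Mycobacterium tuberculosis]")

levels = taxonomy_levels(example)
for index, name in enumerate(levels):
    # The index is only the position in the string, not a confirmed phylostratum.
    print(index, name)
print("number of levels:", len(levels))
```

Comparing this count with the highest phylostratum in the output shows how many strata are not accounted for by the string alone. Where those extra strata come from is something only the pipeline's author can confirm.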
Hi there,
Thanks for these scripts. I am adapting them to my own species (Brachypodium) and got the following error:
perl createPSmap.pl --organism /global/cscratch1/sd/llei2019/B_syl_pro/query_fasta/query_test.fasta --database /global/cscratch1/sd/llei2019/ncbi_NR_databases/rehead_nr_20201202.fa --prefix BS_BlastAll_PS_map --seqOffset 50 --evalue 1e-5 --threads 60 --blastPlus
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
at xmlParser.SeqIdentifier.convertHeadertoSeqIdentifier(SeqIdentifier.java:84)
at xmlParser.ParseXMLtoPS.createPSmap(ParseXMLtoPS.java:125)
at xmlParser.ParseXMLtoPS.<init>(ParseXMLtoPS.java:38)
at xmlParser.CreatePSmap.main(CreatePSmap.java:95)
... 5 more
Removing BS_BlastAll_PS_map_query_test_1_50.xml after compressing to BS_BlastAll_PS_map_query_test_1_50.xml.tbz
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
at xmlParser.SeqIdentifier.convertHeadertoSeqIdentifier(SeqIdentifier.java:84)
at xmlParser.ParseXMLtoPS.createPSmap(ParseXMLtoPS.java:125)
at xmlParser.ParseXMLtoPS.<init>(ParseXMLtoPS.java:38)
at xmlParser.CreatePSmap.main(CreatePSmap.java:95)
... 5 more
Removing BS_BlastAll_PS_map_query_test_51_100.xml after compressing to BS_BlastAll_PS_map_query_test_51_100.xml.tbz
It seems like something is wrong with "ParseXMLtoPS.jar", but I could not figure out what. Any ideas?
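The `ArrayIndexOutOfBoundsException: 1` raised in `convertHeadertoSeqIdentifier` would be consistent with a database header that splits into fewer fields than the parser expects. The jar's source is not shown here, so this is only a guess. Assuming the database is meant to use the `ID | [organism] | [taxonomy]` header layout quoted in other issues on this page, a Python sketch like the following could list headers that do not match. `find_bad_headers` and the regular expression are hypothetical and not part of the pipeline.

```python
import re
from typing import Iterable, Iterator

# Assumed layout: ">ID | [Organism name] | [Level1; Level2; ...]"
HEADER_PATTERN = re.compile(r"^>\S+ \| \[[^\]]+\] \| \[[^\]]+\]\s*$")


def find_bad_headers(lines: Iterable[str]) -> Iterator[tuple[int, str]]:
    """Yield (line_number, header) for FASTA headers that do not match the assumed layout."""
    for number, line in enumerate(lines, start=1):
        if line.startswith(">") and not HEADER_PATTERN.match(line):
            yield number, line.rstrip("\n")


# Made-up records for illustration only; the second header has no pipe-separated fields.
sample = [
    ">NP_214515.1 | [Mycobacterium tuberculosis H37Rv] | [Bacteria; Actinobacteria]\n",
    "MTDDPGSGFTTVWNAVVSELNG\n",
    ">WP_000000001.1 hypothetical protein [Escherichia coli]\n",
    "MKKLLPILIGLSLSGFSSLSQA\n",
]
for number, header in find_bad_headers(sample):
    print(f"line {number}: {header}")
```

For a real database, the same function can be given an open file handle, for example `with open(path) as handle: bad = list(find_bad_headers(handle))`, which reads the file line by line rather than loading it into memory.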
Hi Alexander Gabel,
It is a great honor to use the phylostratigraphy pipeline that you shared.
However, I recently ran into a problem with it.
No errors were reported during the run, but my output file contains only one line of results.
I started with the proteome of Mycobacterium tuberculosis, which is my focus, and got the result described above.
Then I used "Acaryochloris marina MBIC11017" as an example, but the same problem occurred.
My guess is that only the last processed protein is recorded in the output file.
These are the headers in my organism FASTA files:
NP_214515.1 | [Mycobacterium tuberculosis H37Rv] | [Bacteria; Actinobacteria; Actinomycetia; Corynebacteriales; Mycobacteriaceae; Mycobacterium; Mycobacterium tuberculosis]
WP_009556083.1 | [Acaryochloris marina] | [Bacteria; Cyanobacteria; Oscillatoriophycideae; Chroococcales; Acaryochloris]
This is the command:
perl createPSmap.pl --organism /home/data/t010208/Chengtao/Phylostratigraphic_analysis/rowdata/Acaryochloris_marina_MBIC11017.fasta --database /home/data/t010208/Chengtao/Phylostratigraphic_analysis/phyloBlastDB/phyloBlastDB.fa --prefix phyloBlastDB.fa --seqOffset 50 --evalue 1e-5 --threads 96 --blastPlus
This is the output file:
PS;GeneID
1;NP_214523.1
# There is just one line of results, and there doesn't seem to be anything special about this protein, except that it's the last protein in my FASTA file.
The script files are all up to date; the last modification date is 27 Jan 2021.
Your reply is greatly appreciated!
Kind regards,
Cheng Tao
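One quick check for this kind of symptom is to compare the number of records in the organism FASTA file with the number of data lines in the phylostratum map. The Python sketch below assumes that every FASTA record starts with `>` and that the map has a single `PS;GeneID` header line, as in the output shown above. The function names are hypothetical and the inputs are made up.

```python
from typing import Iterable


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count FASTA records, i.e. lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


def count_map_entries(lines: Iterable[str]) -> int:
    """Count data lines in a phylostratum map, skipping the header, comments and blanks."""
    return sum(
        1 for line in lines
        if line.strip() and not line.startswith("PS;") and not line.startswith("#")
    )


# Made-up miniature inputs for illustration only.
fasta = [">gene_a | [Org] | [Bacteria]\n", "MKT\n", ">gene_b | [Org] | [Bacteria]\n", "MLS\n"]
ps_map = ["PS;GeneID\n", "1;gene_b\n"]

records = count_fasta_records(fasta)
entries = count_map_entries(ps_map)
if records != entries:
    print(f"{records} FASTA records but only {entries} map entries")
```

If the record count is lower than the number of proteins expected, the FASTA headers themselves may be the issue, for example if they lack the leading `>`.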
Hi Alex,
Thanks for this excellent tool.
I want to calculate the gene age of axolotl proteins and am using this script. However, errors occurred as follows.
I noticed that all BLAST XML files have been generated, but the 'map_BLAST_PS_tables' file is empty.
Then I ran the Java program separately like this, and an error still occurs.
Following your advice in issue #2, I checked the hit_def info in my XML file, and it seems right?
I am very confused by this error. Could you help me? Thanks for your time and work.
Pan
Hi there,
I am getting the following error trying to run the program:
```
Type of arg 1 to keys must be hash (not private variable) at ../Phylostratigraphy/createPSmap.pl line 196, near "$seqHash;"
Execution of ../Phylostratigraphy/createPSmap.pl aborted due to compilation errors.
```
I thought it might be the headers of my file. I have:
```
>ANAN_ju_g16.t1 | [Arobeloides nanus] | [Eukaryota; Opisthokonta; Metazoa; Eumetazoa; Bilateria; Protostomia; Ecdysozoa; Nematoda; Chromadorea; Rhabditida; Cephaloboidea; Cephalobidae; Acrobeloides]
SKLVEFGDTIFIALRKRPLTFLHCYHHCSVLIYTFHSGAEHLASGRWFMWMNFIAHSVMYTYFCAVSAGIKVPRKLAKCVTLIQITQMILGIGVSLSVFA
IKSLTSWRCHQSYTNLYLSFFIYVSYAILFIRFFINAYSPNKKVIESDKQK
```
But I am not sure. Would be great if you could look into this.
Cheers
Philipp