phylo42 / pewo
Phylogenetic Placement Evaluation Workflows: benchmark placement software and different reference trees
License: MIT License
Currently, the PEWO PAC procedure computes the Node Distance and the expected Node Distance.
Another measure (already used by some authors) would be the Branch Distance, i.e. the actual branch length separating the expected and observed placements.
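To make the difference between the two measures concrete, here is a minimal Python sketch on a toy unrooted tree. It is not PEWO's implementation: the tree, the branch lengths, and the helper names are hypothetical, the node distance is taken to be the number of nodes crossed when moving from one branch to the other, and each placement is assumed to sit at the midpoint of its branch.

```python
from collections import deque

# Toy unrooted tree: edge -> branch length (hypothetical values).
EDGES = {
    ("A", "X"): 0.1,
    ("B", "X"): 0.2,
    ("X", "Y"): 0.3,
    ("C", "Y"): 0.4,
    ("D", "Y"): 0.5,
}


def _adjacency(edges):
    """Build an undirected adjacency list from the edge dictionary."""
    adj = {}
    for (u, v), length in edges.items():
        adj.setdefault(u, []).append((v, length))
        adj.setdefault(v, []).append((u, length))
    return adj


def _paths_from(adj, start):
    """Return {node: (edge_count, summed_length)} for paths leaving start."""
    seen = {start: (0, 0.0)}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        hops, dist = seen[node]
        for nxt, length in adj[node]:
            if nxt not in seen:
                seen[nxt] = (hops + 1, dist + length)
                queue.append(nxt)
    return seen


def distances(edges, expected, observed):
    """Return (node_distance, branch_distance) between two placement edges."""
    if expected == observed:
        return 0, 0.0
    adj = _adjacency(edges)
    best = None
    # The path between two branches joins their closest endpoints.
    for a in expected:
        reach = _paths_from(adj, a)
        for b in observed:
            if best is None or reach[b][0] < best[0]:
                best = reach[b]
    hops, length = best
    # Nodes crossed = branches strictly between the two edges, plus one.
    node_distance = hops + 1
    # Assumption: each placement sits at the midpoint of its branch.
    branch_distance = length + edges[expected] / 2 + edges[observed] / 2
    return node_distance, branch_distance
```

With this toy tree, distances(EDGES, ("A", "X"), ("C", "Y")) crosses two nodes (X and Y), while the branch distance also depends on how long the branches along that path are. A real implementation would use the placement positions reported by each tool rather than branch midpoints.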
Hi!
I just ran into an issue with javac (via conda) choosing US-ASCII as the encoding, causing the build to fail:
[javac] /home/folder/PEWO/scripts/java/PEWO_java/lib/RAPPAS/src/inputs/FASTQPointer.java:84: error: unmappable character (0xA9) for encoding US-ASCII
[javac] //elimination des character sp??ciaux
[javac] ^
[javac] 68 errors
[javac] 1 warning
This may be a very system-setup-dependent issue affecting pretty much no one. Regardless, I thought I'd post the issue and a quick fix anyway.
Fix:
export JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF8
and re-run the PEWO installer.
Pierre
Under newer versions of snakemake (e.g. mine is 7.24.2), the eval_accuracy workflow fails with:
invalid name for input, output, wildcard, params or log: count is reserved for internal use
File "/home/nikolai/dev/pewo_workflow/eval_accuracy.smk", line 27, in <module>
File "/home/nikolai/dev/pewo_workflow/rules/op/operate_prunings.smk", line 72, in <module>
This does not happen with snakemake 5.10.0 (the version PEWO requires in envs/environment.yaml and uses by default). It seemingly happens as early as 5.18.1. When updating the dependencies of PEWO, we just need to rename the parameter count to something else, like pruning_count.
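A quick way to locate the offending lines before renaming is a plain text scan of the workflow files. The sketch below is only a heuristic and does not use any Snakemake API; the function name is hypothetical, and "count" is the only reserved name confirmed by the error message above.

```python
import re

# Only "count" is confirmed by the error message; extend this tuple if needed.
RESERVED = ("count",)


def find_reserved_names(text, reserved=RESERVED):
    """Return (line_number, name) pairs where a reserved name is assigned as a keyword."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name in reserved:
            # Match "count=" or "count =", but not "pruning_count=" or "count ==".
            if re.search(rf"\b{re.escape(name)}\s*=(?!=)", line):
                hits.append((lineno, name))
    return hits
```

Running it on the content of rules/op/operate_prunings.smk should point at the keyword to rename; names such as pruning_count are not reported because the underscore prevents a word boundary before "count".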
@nromashchenko commented in #4 :
Our Travis CI runs two pipelines travis/tests/1_..., travis/tests/2_... on every push, making sure it's possible to build and run those toy examples in the isolated environment.
If a developer does not change the config files of those pipelines, their new app is not tested by CI.
What needs to be done:
I just ran into an issue with eval_resources_plots.R for a run that only tests epa. It looks like the script presupposes that results for rappas and the other tools are always present. Is there an easy way of fixing or hacking this to work? See the warnings and errors below.
Cheers,
Pierre
[1] "OP:hmmer-align"
Warning message:
In analyses["epa"] <- c("hmmer-align", "epa-placement") :
number of items to replace is not a multiple of replacement length
Warning message:
In analyses["epang_h1"] <- c("hmmer-align", "epang-h1-placement") :
number of items to replace is not a multiple of replacement length
Warning message:
In analyses["epang_h2"] <- c("hmmer-align", "epang-h2-placement") :
number of items to replace is not a multiple of replacement length
Warning message:
In analyses["epang_h3"] <- c("hmmer-align", "epang-h3-placement") :
number of items to replace is not a multiple of replacement length
Warning message:
In analyses["epang_h4"] <- c("hmmer-align", "epang-h4-placement") :
number of items to replace is not a multiple of replacement length
Warning message:
In analyses["pplacer"] <- c("hmmer-align", "pplacer-placement") :
number of items to replace is not a multiple of replacement length
Warning message:
In analyses["apples"] <- c("hmmer-align", "apples-placement") :
number of items to replace is not a multiple of replacement length
Warning message:
In analyses["rappas"] <- c("ansrec", "rappas-dbbuild", "rappas-placement") :
number of items to replace is not a multiple of replacement length
Error in fix.by(by.x, x) : 'by' must specify uniquely valid columns
Calls: merge ... merge.default -> merge -> merge.data.frame -> fix.by
Execution halted
The define_resource_inputs rule copies input files to:
A/
T/
R/
G/
For example, in the case of eval_resources.smk, R/ and G/ will each just hold a copy of the same query file. If the query file is large, this can be problematic. Since output files of rules can be symlinks, we should use symlinks instead.
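As a rough illustration of the idea, the copy could be replaced by a symlink. This is a minimal sketch, not the actual rule: the helper names and the query.fasta file name are hypothetical, and only the R/ and G/ directory names come from the description above.

```python
import os
import tempfile


def link_input(src, dst):
    """Create dst as a symlink to src instead of copying the file."""
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    if os.path.lexists(dst):
        os.remove(dst)  # replace a stale link or an earlier copy
    os.symlink(os.path.abspath(src), dst)


def demo():
    """Link one query file into R/ and G/ and read it back through a link."""
    with tempfile.TemporaryDirectory() as workdir:
        query = os.path.join(workdir, "query.fasta")
        with open(query, "w") as handle:
            handle.write(">q1\nACGT\n")
        targets = [os.path.join(workdir, sub, "query.fasta") for sub in ("R", "G")]
        for target in targets:
            link_input(query, target)
        with open(targets[0]) as handle:
            content = handle.read()
        return all(os.path.islink(t) for t in targets), content
```

The query file then exists once on disk, however many directories reference it. An absolute link target is used so that the link stays valid regardless of the working directory of the job that reads it.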
For the latest version of APPLES (v2.0.5), PEWO should use the -p flag for amino acid sequences. Currently PEWO runs it in DNA mode, which makes APPLES silently produce nonsense results.
Due to the recently fixed bug in the ND computation (and a previous one that caused reported distances to be increased by one), we need to update the tutorials where they show the NDs obtained from running the examples.
Hi,
The following job fails:
java -cp scripts/java/PEWO_java/dist/PEWO.jar DistanceGenerator_LITE2 /home/balaban/PEWO/examples/1_fast_test_of_accuracy_procedure/run RAPPAS,EPANG,PPLACER &> /home/balaban/PEWO/examples/1_fast_test_of_accuracy_procedure/run/logs/compute_nd.log
with the following exception:
"~/PEWO/examples/1_fast_test_of_accuracy_procedure/run/logs/compute_nd.log" 34L, 2389C 2,1 Top
ARGS: workDir [list_of_tested_software_directories,comma-separated]
example: /path/to/pewo_workdir EPANG,RAPPAS,PPLACER
scripts/java/PEWO_java/dist/PEWO.jar
workDir: /home/balaban/PEWO/examples/1_fast_test_of_accuracy_procedure/run
Loading /home/balaban/PEWO/examples/1_fast_test_of_accuracy_procedure/run/expected_placements.bin
Loading NxIndex
Loading pruningIndex
Loading expected placements
Loading trees
Jan 14, 2021 8:45:06 AM DistanceGenerator_LITE2 main
SEVERE: null
java.io.InvalidClassException: javax.swing.JComponent; local class incompatible: stream classdesc serialVersionUID = 3742318830738515599, local class serialVersionUID = 4588530037560142483
at java.base/java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:689)
at java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1903)
at java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1772)
at java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1903)
at java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1772)
at java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1903)
at java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1772)
at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2060)
at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1594)
at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:430)
at java.base/java.util.ArrayList.readObject(ArrayList.java:928)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1160)
at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2216)
at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2087)
at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1594)
at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:430)
at DistanceGenerator_LITE2.main(Unknown Source)
I tried reinstalling PEWO and rebuilding the jar file but it didn't resolve the issue.
We need to check the outputs set by the workflow when rappas is used.
The previous version used rappas_db_in_ram for accuracy, and separate db_build and placement steps for resources (the latter mode being slower).
We certainly need at least some level of testing and CI here, since the project is developed by more than zero people. We should start with:
If taxtastic is installed locally via pip, it conflicts with the taxtastic installation in the conda environment.
Conda has its own internal pip command, but somehow it cannot isolate the two?
Example: taxtastic 0.8.5 installed locally via pip + 0.8.11 installed via the PEWO conda environment:
Traceback (most recent call last):
File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 578, in _build_master
ws.require(__requires__)
File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 895, in require
needed = self.resolve(parse_requirements(requirements))
File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 786, in resolve
raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.VersionConflict: (taxtastic 0.8.5 (/home/benclaff/.local/lib/python3.6/site-packages), Requirement.parse('taxtastic==0.8.11'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/benclaff/softwares/miniconda3/envs/PEWO/bin/taxit", line 6, in <module>
from pkg_resources import load_entry_point
File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3112, in <module>
@_call_aside
File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3096, in _call_aside
f(*args, **kwargs)
File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3125, in _initialize_master_working_set
working_set = WorkingSet._build_master()
File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 580, in _build_master
return cls._build_from_requirements(__requires__)
File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 593, in _build_from_requirements
dists = ws.resolve(reqs, Environment())
File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 781, in resolve
raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'taxtastic==0.8.11' distribution was not found and is required by the application
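The traceback shows pkg_resources being loaded from ~/.local, which suggests that the per-user site-packages directory is on the interpreter's search path inside the conda environment. The sketch below, run with the environment's Python, reports whether that is the case. The function name is hypothetical. PYTHONNOUSERSITE is a standard CPython environment variable; setting it before running PEWO is an untested assumption here, not a confirmed fix.

```python
import os
import site
import sys


def user_site_status(environ=None):
    """Report whether the per-user site-packages directory can shadow the conda environment."""
    environ = os.environ if environ is None else environ
    user_site = site.getusersitepackages()
    return {
        "user_site": user_site,
        # False or None means the interpreter ignores the user site directory.
        "enabled": bool(site.ENABLE_USER_SITE),
        "on_sys_path": user_site in sys.path,
        "PYTHONNOUSERSITE": environ.get("PYTHONNOUSERSITE"),
    }


if __name__ == "__main__":
    for key, value in user_site_status().items():
        print(f"{key}: {value}")
```

If "on_sys_path" is reported as True, packages installed with pip install --user are visible to the environment and can conflict with the versions conda installed.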
According to the HMMER 3.3.2 user guide, hmmalign can write aligned FASTA directly with --outformat AFA. There is no need for the psiblast output format and the format juggling that follows. The main reason to change this is that psiblast2fasta is quite memory-inefficient, which makes the alignment steps in all workflows unrealistically RAM-greedy.
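A minimal sketch of what the simplified alignment call could look like, written as a Python helper that only builds the argument list. The helper name and the file names are hypothetical, and PEWO's actual rule may pass further options that are not shown here.

```python
def hmmalign_fasta_command(hmm_profile, query_fasta, out_fasta):
    """Build an hmmalign call that writes aligned FASTA directly."""
    return [
        "hmmalign",
        "--outformat", "afa",  # aligned FASTA, so no psiblast2fasta step
        "-o", out_fasta,       # write the alignment to this file
        hmm_profile,
        query_fasta,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(hmmalign_fasta_command("ref.hmm", "queries.fasta", "aligned.fasta")))
```

The resulting list can be passed to subprocess.run or joined into a shell command within a workflow rule.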
I have miniconda 4.6.10 installed on my machine. In this version, the "conda" command is not a binary that is executed directly; it is a shell function added to .bashrc during installation. INSTALL.sh checks whether conda is installed using "command -v", and this command cannot locate functions defined in .bashrc. As a result, PEWO installation fails with the error message:
PEWO installer: Command 'conda' not found.
PEWO installer: This is a requirement to PEWO installation. See documentation.
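One way to make the check less dependent on how the shell was set up is to look at more than one source. The sketch below is an assumption, not the installer's logic: the function name is hypothetical, and it relies on the CONDA_EXE environment variable, which conda's shell integration normally exports and which, unlike a shell function, is inherited by child processes.

```python
import os
import shutil


def find_conda(environ=None, which=shutil.which):
    """Return a path to the conda executable, or None if it cannot be located."""
    environ = os.environ if environ is None else environ
    # 1. CONDA_EXE is usually exported by conda's shell setup (assumption).
    exe = environ.get("CONDA_EXE")
    if exe and os.path.isfile(exe):
        return exe
    # 2. Fall back to a regular PATH lookup.
    return which("conda")


if __name__ == "__main__":
    path = find_conda()
    print(path if path else "Command 'conda' not found.")
```

The same two-step idea could be expressed in the shell installer by testing the CONDA_EXE variable before falling back to "command -v".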