
pewo's People

Contributors

blinard-bioinfo, erivals, frederic-mahe, matthiasblanke, nromashchenko

pewo's Issues

add BranchDistance computation to PAC procedure

Currently, the PEWO PAC procedure computes the Node Distance and the expected Node Distance.
Another measure, already used by some authors, is the Branch Distance, i.e. the actual branch length separating the expected and observed placements.
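
To make the proposed measure concrete, here is a minimal Python sketch of a branch distance between two nodes of a rooted tree. It is not PEWO code: the `parent` dictionary layout and the function name `branch_distance` are assumptions made for illustration, and the sketch ignores where exactly along an edge a placement sits.

```python
def branch_distance(a, b, parent):
    """Sum of branch lengths on the path between nodes a and b.

    `parent` maps each non-root node to (parent_node, branch_length).
    This layout is a hypothetical one chosen for the example.
    """
    # Record the cumulative branch length from `a` to each of its ancestors.
    dist_from_a = {a: 0.0}
    node, dist = a, 0.0
    while node in parent:
        up, length = parent[node]
        dist += length
        dist_from_a[up] = dist
        node = up

    # Walk up from `b` until we reach a node that is also an ancestor of `a`.
    node, dist = b, 0.0
    while node not in dist_from_a:
        up, length = parent[node]
        dist += length
        node = up

    return dist_from_a[node] + dist


# Toy tree: R is the root, X is an internal node with children A and B.
toy_tree = {
    "X": ("R", 0.5),
    "A": ("X", 1.0),
    "B": ("X", 2.0),
    "C": ("R", 0.25),
}
```

With this toy tree, `branch_distance("A", "B", toy_tree)` adds the two branches below X, whereas the Node Distance between the same nodes would only count the nodes crossed.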

javac encoding issues

Hi!

just ran into an issue with javac (via conda) choosing US-ASCII as encoding, causing the build to fail:

[javac] /home/folder/PEWO/scripts/java/PEWO_java/lib/RAPPAS/src/inputs/FASTQPointer.java:84: error: unmappable character (0xA9) for encoding US-ASCII
[javac]                 //elimination des character sp??ciaux
[javac]                                                ^
[javac] 68 errors
[javac] 1 warning

This may be an issue that depends heavily on system setup and affects pretty much no one. Regardless, I thought I'd post the issue and a quick fix anyway.

Fix:

export JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF8

and re-run the PEWO installer.

Pierre
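
The error above can be illustrated outside of javac. The short Python sketch below assumes the source file is UTF-8 encoded (which the reported byte 0xA9 and the two `??` in the log suggest, but which is not confirmed here): the accented character is stored as two bytes, and neither fits in US-ASCII. The helper name `decodes_as` is made up for the example.

```python
# 'é' written as an escape so this snippet itself stays pure ASCII.
raw = "sp\u00e9ciaux".encode("utf-8")  # 'é' becomes the two bytes 0xC3 0xA9


def decodes_as(data, encoding):
    """Return True if `data` can be decoded with `encoding`, else False."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


ascii_ok = decodes_as(raw, "ascii")  # the non-ASCII bytes are rejected
utf8_ok = decodes_as(raw, "utf-8")   # the same bytes decode fine as UTF-8
```

This is why forcing the tool's default encoding to UTF-8 makes the build go through.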

Count is reserved for internal use

Under new versions of snakemake (e.g. mine is 7.24.2), the eval_accuracy workflow fails with:

invalid name for input, output, wildcard, params or log: count is reserved for internal use
  File "/home/nikolai/dev/pewo_workflow/eval_accuracy.smk", line 27, in <module>
  File "/home/nikolai/dev/pewo_workflow/rules/op/operate_prunings.smk", line 72, in <module>

It does not happen with snakemake 5.10.0 (the version PEWO requires in envs/environment.yaml and uses by default). It seems to appear as early as 5.18.1. When updating PEWO's dependencies, the parameter count just needs to be renamed to something else, such as pruning_count.
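
A rough way to see why `count` clashes while `pruning_count` does not is sketched below in Python. This is only an approximation under the assumption that Snakemake keeps params in a `list`-like container and rejects names that collide with its existing attributes; it is not Snakemake's actual check, and `is_reserved_name` is a hypothetical helper.

```python
def is_reserved_name(name):
    """Approximate check: the name clashes if `list` already has that attribute."""
    return hasattr(list, name)


# Candidate parameter names to screen before updating the workflow.
candidates = ["count", "index", "pruning_count"]
clashes = [name for name in candidates if is_reserved_name(name)]
```

Here `clashes` contains `count` and `index`, both of which are built-in `list` methods, while `pruning_count` passes.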

add documentation relative to CI tests

@nromashchenko commented in #4 :

Our Travis CI runs two pipelines travis/tests/1_..., travis/tests/2_... on every push, making sure it's possible to build and run those toy examples in the isolated environment.
If a developer does not change the config files of those examples, their new app is not tested by CI.

What needs to be done:

  • add a section relative to CI in developer documentation
  • add a concrete example, this example could be directly based on AppSPAM example, e.g. commit 8a4af9d discussed in #4

Troubles with eval_resources_plots.R running the pipeline with single software

I just ran into an issue with eval_resources_plots.R for a run that only tests epa. It looks like the script presupposes that results for rappas and the others are always there; is there some easy way of fixing/hacking this to work? See warnings and errors below.

Cheers,
Pierre

[1] "OP:hmmer-align"
Warning message:
In analyses["epa"] <- c("hmmer-align", "epa-placement") :
  number of items to replace is not a multiple of replacement length
Warning message:
In analyses["epang_h1"] <- c("hmmer-align", "epang-h1-placement") :
  number of items to replace is not a multiple of replacement length
Warning message:
In analyses["epang_h2"] <- c("hmmer-align", "epang-h2-placement") :
  number of items to replace is not a multiple of replacement length
Warning message:
In analyses["epang_h3"] <- c("hmmer-align", "epang-h3-placement") :
  number of items to replace is not a multiple of replacement length
Warning message:
In analyses["epang_h4"] <- c("hmmer-align", "epang-h4-placement") :
  number of items to replace is not a multiple of replacement length
Warning message:
In analyses["pplacer"] <- c("hmmer-align", "pplacer-placement") :
  number of items to replace is not a multiple of replacement length
Warning message:
In analyses["apples"] <- c("hmmer-align", "apples-placement") :
  number of items to replace is not a multiple of replacement length
Warning message:
In analyses["rappas"] <- c("ansrec", "rappas-dbbuild", "rappas-placement") :
  number of items to replace is not a multiple of replacement length
Error in fix.by(by.x, x) : 'by' must specify uniquely valid columns
Calls: merge ... merge.default -> merge -> merge.data.frame -> fix.by
Execution halted 

define_resource_inputs copies input files

The define_resource_inputs rule copies input files to:

A/
T/
R/
G/

For example, in the case of eval_resources.smk, R/ and G/ will each just hold one copy of the same query file. If the query file is big enough, this can be problematic. Since output files of rules can be symlinks, we should use them instead.
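
As a minimal sketch of the idea (not the actual rule), linking instead of copying could look like this in Python. The function name `link_input` and the example paths are assumptions for illustration; how this would be wired into the define_resource_inputs rule is left open.

```python
import os


def link_input(src, dst):
    """Create `dst` as a symlink to `src` instead of copying the file."""
    # Make sure the target directory (e.g. R/ or G/) exists.
    os.makedirs(os.path.dirname(dst) or ".", exist_ok=True)
    # Replace a stale link or file left over from a previous run.
    if os.path.lexists(dst):
        os.remove(dst)
    # Use an absolute source path so the link stays valid from any directory.
    os.symlink(os.path.abspath(src), dst)
```

For instance, `link_input("queries.fasta", "R/queries.fasta")` would make the query file visible in R/ without duplicating it on disk.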

Add protein support for APPLES

For the latest version of APPLES (v2.0.5), PEWO should use the -p flag for amino acid sequences. Currently it runs APPLES in DNA mode, which makes APPLES silently produce nonsense results.
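
A sketch of how the flag could be added conditionally is shown below. Only the `-p` flag comes from this issue; the function name `apples_command` and the idea of passing the rest of the command line in as a list are assumptions, and the base command is deliberately left to the caller.

```python
def apples_command(base_command, is_protein):
    """Return the APPLES command line, adding -p for amino acid input."""
    cmd = list(base_command)  # copy so the caller's list is not modified
    if is_protein:
        cmd.append("-p")
    return cmd
```

The workflow would then decide `is_protein` from its configured sequence type, so that nucleotide runs stay unchanged.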

Update wiki: examples 1-4 ND values

Because of the recently fixed bug in the ND computation (and an earlier one that caused reported distances to be increased by one), we need to update the tutorials wherever they show the NDs obtained from running the examples.

scripts/java/PEWO_java/dist/PEWO.jar raises java.io.InvalidClassException

Hi,

The following job fails:

java -cp scripts/java/PEWO_java/dist/PEWO.jar DistanceGenerator_LITE2 /home/balaban/PEWO/examples/1_fast_test_of_accuracy_procedure/run RAPPAS,EPANG,PPLACER &> /home/balaban/PEWO/examples/1_fast_test_of_accuracy_procedure/run/logs/compute_nd.log

with the following exception:

"~/PEWO/examples/1_fast_test_of_accuracy_procedure/run/logs/compute_nd.log" 34L, 2389C 2,1 Top
ARGS: workDir [list_of_tested_software_directories,comma-separated]
example: /path/to/pewo_workdir EPANG,RAPPAS,PPLACER
scripts/java/PEWO_java/dist/PEWO.jar
workDir: /home/balaban/PEWO/examples/1_fast_test_of_accuracy_procedure/run
Loading /home/balaban/PEWO/examples/1_fast_test_of_accuracy_procedure/run/expected_placements.bin
Loading NxIndex
Loading pruningIndex
Loading expected placements
Loading trees
Jan 14, 2021 8:45:06 AM DistanceGenerator_LITE2 main
SEVERE: null
java.io.InvalidClassException: javax.swing.JComponent; local class incompatible: stream classdesc serialVersionUID = 3742318830738515599, local class serialVersionUID = 4588530037560142483
at java.base/java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:689)
at java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1903)
at java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1772)
at java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1903)
at java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1772)
at java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1903)
at java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1772)
at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2060)
at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1594)
at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:430)
at java.base/java.util.ArrayList.readObject(ArrayList.java:928)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1160)
at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2216)
at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2087)
at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1594)
at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:430)
at DistanceGenerator_LITE2.main(Unknown Source)

I tried reinstalling PEWO and rebuilding the jar file but it didn't resolve the issue.

Set up Github actions

We certainly need at least some level of testing and CI here, since the project is developed by more than zero people. We should start with:

  1. running workflows with all the example data we have
  2. adding an example for amino acid sequences

Conflict between local pip and conda pip for package taxtastic

If taxtastic is installed locally via pip, it conflicts with the taxtastic installation in the conda environment.
Conda has its own internal pip command, but somehow it does not isolate the two installations.

Example: taxtastic 0.8.5 installed locally via pip + 0.8.11 installed via the PEWO conda environment:

Traceback (most recent call last):
  File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 578, in _build_master
    ws.require(__requires__)
  File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 895, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 786, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.VersionConflict: (taxtastic 0.8.5 (/home/benclaff/.local/lib/python3.6/site-packages), Requirement.parse('taxtastic==0.8.11'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/benclaff/softwares/miniconda3/envs/PEWO/bin/taxit", line 6, in <module>
    from pkg_resources import load_entry_point
  File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3112, in <module>
    @_call_aside
  File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3096, in _call_aside
    f(*args, **kwargs)
  File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3125, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 580, in _build_master
    return cls._build_from_requirements(__requires__)
  File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 593, in _build_from_requirements
    dists = ws.resolve(reqs, Environment())
  File "/home/benclaff/.local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 781, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'taxtastic==0.8.11' distribution was not found and is required by the application
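
The traceback shows pkg_resources being loaded from ~/.local, i.e. the per-user site-packages directory. One possible workaround, sketched here in Python, is to run the tool with the standard `PYTHONNOUSERSITE` environment variable set, which tells Python not to add the user site directory to its search path. The helper name `isolated_env` is made up, and whether this fully resolves the PEWO case has not been verified here.

```python
import os


def isolated_env(base_env=None):
    """Return a copy of the environment with the user site-packages disabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Any non-empty value makes Python skip ~/.local/lib/pythonX.Y/site-packages.
    env["PYTHONNOUSERSITE"] = "1"
    return env


# Hypothetical usage (not executed here):
#   subprocess.run(["taxit", "--help"], env=isolated_env())
```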

Get rid of psiblast2fasta

According to the HMMER 3.3.2 user guide, HMMER can write aligned FASTA directly with --outformat AFA, so there is no need for the PSIBLAST format and format juggling. The main reason to do this is that psiblast2fasta is quite memory-inefficient, which makes the alignment steps in all workflows unrealistically RAM-greedy.
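
A sketch of the direct call is given below as a Python helper that builds the command line. It assumes the alignment step uses hmmalign with its `-o` output option; the function name and the file arguments are placeholders, and any other options PEWO passes to the aligner are omitted.

```python
def hmmalign_command(hmm_file, query_file, out_file):
    """Build an hmmalign call that writes aligned FASTA directly."""
    return [
        "hmmalign",
        "--outformat", "afa",  # aligned FASTA, no PSIBLAST intermediate
        "-o", out_file,
        hmm_file,
        query_file,
    ]
```

With the output already in FASTA, the psiblast2fasta conversion step would no longer be needed.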

INSTALL.sh cannot find conda installation

I have miniconda 4.6.10 installed on my machine. In this version, the "conda" command does not execute a binary file directly; it is a shell function added to .bashrc during installation. INSTALL.sh checks whether conda is installed using "command -v", which cannot locate functions defined in .bashrc. As a result, PEWO installation fails with the error message:
PEWO installer: Command 'conda' not found.
PEWO installer: This is a requirement to PEWO installation. See documentation.
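
One more tolerant detection strategy is sketched below in Python, although INSTALL.sh itself is a shell script. It assumes that conda's shell initialisation exports a `CONDA_EXE` variable pointing at the real executable, and falls back to a plain PATH lookup otherwise. The function name `find_conda` is hypothetical.

```python
import os
import shutil


def find_conda(env=None):
    """Return a path to the conda executable, or None if it cannot be found."""
    env = os.environ if env is None else env
    # Assumption: conda's shell setup exports CONDA_EXE even when `conda`
    # itself is only defined as a shell function.
    exe = env.get("CONDA_EXE")
    if exe and os.path.isfile(exe):
        return exe
    # Fall back to looking for a `conda` binary on PATH.
    return shutil.which("conda", path=env.get("PATH"))
```

An installer could report "not found" only when both lookups fail.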
