josephryan / sowhat

Program to run the SOWH test (likelihood-based test used to compare tree topologies which are not specified a priori)

Home Page: http://sysbio.oxfordjournals.org/content/64/6/1048

License: GNU General Public License v3.0

Perl 78.34% Shell 19.78% Dockerfile 0.42% TeX 1.47%

sowhat's Issues

update documentation and usage subroutine to reflect RAxML default

Possibly include stopping criterion

The stopping criterion would be met after a minimum number of samples (say 100), once the p-value and its confidence interval fall entirely on the same side of the significance level.
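The rule described above could be sketched roughly as follows. This is only an illustration, not sowhat's code: the function name `should_stop`, its arguments, and the choice of a 95% Wilson score interval for the p-value are all assumptions made for the example, and `n_extreme` is assumed to be the number of simulated replicates whose test statistic is at least as large as the observed one.

```shell
#!/bin/sh
# should_stop N_REPS N_EXTREME [ALPHA] [MIN_REPS]   (hypothetical helper)
# Prints "stop" when at least MIN_REPS replicates have finished and the
# 95% Wilson interval around p = N_EXTREME / N_REPS lies entirely on one
# side of ALPHA; prints "continue" otherwise.
should_stop() {
    awk -v n="$1" -v k="$2" -v alpha="${3:-0.05}" -v min="${4:-100}" 'BEGIN {
        if (n < min) { print "continue"; exit }
        z = 1.96                      # normal quantile for a 95% interval
        p = k / n                     # empirical p-value
        denom  = 1 + z * z / n
        center = (p + z * z / (2 * n)) / denom
        half   = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
        lower  = center - half
        upper  = center + half
        if (upper < alpha || lower > alpha) print "stop"
        else print "continue"
    }'
}
```

For instance, `should_stop 100 5` prints `continue`, because the interval around p = 0.05 straddles the significance level, while `should_stop 100 50` prints `stop`. The Wilson interval is used here only because it stays well behaved when no replicate is extreme (p = 0); any other interval could be substituted.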

Check that constraint tree, dataset have consistent taxa names

The error currently output is a confusing message from RAxML.
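A pre-flight check along these lines might give a clearer message before RAxML is ever called. It is a minimal sketch under stated assumptions: the function name `check_taxa` is hypothetical, the constraint tree is assumed to be plain Newick without internal node labels or support values, and the alignment is assumed to be sequential PHYLIP with each taxon name as the first whitespace-delimited field of its line.

```shell
#!/bin/sh
# check_taxa TREE_FILE PHYLIP_FILE   (hypothetical helper)
# Returns 0 when both files list the same taxon names, 1 otherwise.
check_taxa() {
    tree_names=$(mktemp) || return 2
    aln_names=$(mktemp) || return 2

    # Newick: drop whitespace, split on structural characters, strip any
    # ":branch_length" suffix, and keep the non-empty tokens.
    tr -d ' \t\r\n' < "$1" | tr '(),;' '\n\n\n\n' | sed 's/:.*//' |
        grep -v '^$' | sort -u > "$tree_names"

    # PHYLIP: skip the header line, take the first field of each line.
    awk 'NR > 1 && NF { print $1 }' "$2" | sort -u > "$aln_names"

    # comm -3 keeps only the names that appear in exactly one file.
    mismatches=$(comm -3 "$tree_names" "$aln_names")
    rm -f "$tree_names" "$aln_names"

    if [ -n "$mismatches" ]; then
        echo "ERROR: taxon names differ between tree and alignment:" >&2
        echo "$mismatches" >&2
        return 1
    fi
    return 0
}
```

Names printed without indentation occur only in the tree; names indented by a tab occur only in the alignment.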

Improve Partition Feature Functionality

Only simple partitioning schemes are currently allowed.
Improve the feature to allow partitioning schemes that cover multiple non-contiguous regions of an alignment, that are listed out of order, or that divide by codon position.
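To make the three cases concrete, here is a rough sketch that expands a RAxML-style partition file (lines such as `DNA, gene1 = 1-10, 21-30` or `DNA, codon3 = 3-30\3`) into a site count per partition. The function name `expand_partitions` is hypothetical and this is not how sowhat currently parses partitions; it only illustrates that comma-separated ranges and `\3` strides can be handled independently of the order in which partitions are listed.

```shell
#!/bin/sh
# expand_partitions PARTITION_FILE   (hypothetical helper)
# Prints "name<TAB>number_of_sites" for every partition line.
expand_partitions() {
    awk -F'=' 'NF == 2 {
        # Left of "=": "TYPE, name". Keep only the name.
        name = $1
        sub(/^[^,]*,[ \t]*/, "", name)
        gsub(/[ \t]+$/, "", name)

        # Right of "=": comma-separated ranges, each optionally "\stride".
        n = split($2, ranges, ",")
        count = 0
        for (i = 1; i <= n; i++) {
            r = ranges[i]
            gsub(/[ \t\r]/, "", r)
            step = 1
            if (split(r, parts, /\\/) == 2) {
                step = parts[2] + 0
                r = parts[1]
            }
            if (step < 1) step = 1
            if (split(r, ends, "-") == 2) { first = ends[1]; last = ends[2] }
            else                          { first = ends[1]; last = ends[1] }
            for (s = first + 0; s <= last + 0; s += step) count++
        }
        print name "\t" count
    }' "$1"
}
```

With the two example lines above, the sketch reports 20 sites for `gene1` and 10 sites for `codon3`.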

sowhat should throw error if reps < 10

"The problem with these arguments is that you have set the maximum number of reps to 2 (--reps=2) - this means that sowhat will generate only two simulated datasets and this is not enough to calculate any statistics, such as a p-value. sowhat won't start printing a sowhat.results.txt file until after the 10th bootstrap replicate to avoid choking on any mathematical errors near the start of the analysis. We strongly recommend each analysis should use 100+ bootstrap replicates, and that the number of bootstraps (the sample size) should be justified by reporting the confidence interval surrounding the p-value"

Add tests with more intuitive messages for missing prerequisites

add documentation about using local::lib

Bootstrapping the package in and using that to cpan R

error in Statistics::R related to a flag sent to R (--gui=none).

My fix was to remove the command-line switch from the system call to R in 'Statistics/R/Bridge/Linux.pm'. We need to at least warn users, and we should probably write to the author(s) of Statistics::R. Not sure if this affects other versions.

escape spaces in pathnames returned from getcwd

This needs to be a system-independent method using a Perl module.
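The failure mode is easiest to see at the shell level: a working directory containing spaces is split into several arguments unless it is quoted. The sketch below shows one POSIX-shell way to quote a path; the function name `quote_path` and the example directory are hypothetical, and this is shell-specific rather than the system-independent Perl solution the issue asks for. In Perl, passing the command and its arguments to `system` as a list avoids the shell, and therefore the escaping problem, altogether.

```shell
#!/bin/sh
# quote_path PATH   (hypothetical helper)
# Wraps PATH in single quotes and escapes embedded single quotes, so the
# result can be interpolated into a POSIX shell command as one argument.
quote_path() {
    printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

workdir="/tmp/my analysis dir"    # hypothetical path containing spaces
echo "quoted working directory: $(quote_path "$workdir")"
```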

include sowhat version in results.txt

Perhaps also the seq-gen version.

mislabeled constrained / unconstrained values in report

the _structure_subroutine needs to be edited so that the report is corrected, by switching the ml and t1 values.

Character dataset fails with newest RAxML

RAxML v 8.1.20

create docker image release

Add more version information to sowhat.results

Currently only the version of RAxML is recorded. It would be helpful to have the following versions recorded in the results:

sowhat version
seq-gen version
PhyloBayes version (if used)
Garli version (if used)

more obvious warning when not in sequential phylip

The current warning says only that taxa don't match up, which is difficult to sort out.

seq-gen has limited model set. sowhat needs to check models before any processes are run.

treetwo is better

This message "Constraint_X is more likely than Y, X will be used as the constraint tree instead" is only printed to STDERR. This information should also go in the results file. In general there should be more details about which tree is which in the results file.

Description of sample datasets

Need to add a description of sample datasets to README.md. These should explain the origins of the datasets, as well as list some basic attributes (number of taxa, number of genes, number of sites). There should also be an indication of which files pertain to which datasets.

Failing travis ci due to old R

The TravisCI test is failing: https://travis-ci.org/josephryan/sowhat/builds/221453597

The problem seems to be:

The test images is based on Ubuntu 14.04.5 LTS
sudo apt-get install -y r-base installs R 3.0.2
The ape package is unavailable for this older R, throwing the warning package 'ape' is not available (for R version 3.0.2)
Results in the error library(ape) : there is no package called 'ape'

We should figure out how to install a more recent version of R on this container or use a different test image that allows installation of newer R.

Allow for comparison of two suboptimal hypotheses

Right now the test is limited to the most likely topology vs. a single alternative.

update and clean github documentation

clean up and reprioritize documentation.

Installation not possible

Hi @caseywdunn,

I cannot install sowhat. Is it too outdated? I work on a cluster without admin privileges.

Any help is highly appreciated. Here is the error.log.

Cheers Bastian

Add support for AUTO model specification in RAxML

Run an initial tree with AUTO specified and then, after parsing the info file, use the determined best model for all downstream analyses.

Add true support for multistate

Will require running seq-gen in aminoacid mode and up to 20 substitutions. multistate currently fails with: "expecting 2 frequencies. Multi-State only works w/binary matrix".

Eat MOAR Twizzlers

Self evident.

add subroutine tests

see: http://search.cpan.org/~oliver/Test-Subroutines-1.113350/lib/Test/Subroutines.pm

raxml returns higher likelihood for constrained tree than unconstrained tree for some simulated matrices

This is a known issue, and is addressed here:

https://groups.google.com/forum/#!topic/raxml/qn7_ZXoJTHg

We should redo some of the problematic searches with the addition of the --no-bfgs raxml option. If that fixes the problem for those matrices, we should rerun the tests in the manuscript that were impacted by this. If they work, we should revise the manuscript to reflect the fix.

instructions for monitoring a job - and cutting a job short

SOWH tests can take a long time on large datasets. Here are some ideas on how to monitor a job and how to cut a job short. These should be considered (and tested) for being added to the documentation:

Monitoring a job
The following command can be used to monitor a job (only if reps = 1, the default) if run from within the directory that was specified with the --dir option:
```
ls -1 sowhat_scratch/ | cut -f 4 -d '.' | sort -n | tail -n1
```
Cutting a job short
If the job is currently running, make a copy of the directory that was specified with the --dir option (and its contents):
```
cp -R myoutputdir rerunoutputdir
```
Then run the exact same command as before, but with this new directory as the --dir option, and add:
```
--reps=SMALLER_NUMBER_OF_REPS --restart
```
